Elasticsearch-based distributed file search across all cluster nodes. Use it when searching for files, finding duplicates, or querying storage metadata.
The deployment pairs Elasticsearch with FSCrawler: crawlers on each node of the Proxmox cluster index local storage into a single Elasticsearch instance.

                    ┌──────────────────────┐
                    │    Elasticsearch     │
                    │  192.168.1.122:9200  │
                    │   (CT501 Giratina)   │
                    └──────────┬───────────┘
           ┌───────────────────┼───────────────────┐
           │                   │                   │
    ┌──────▼──────┐     ┌──────▼──────┐     ┌──────▼──────┐
    │  Giratina   │     │    Talon    │     │   Victini   │
    │  1 Crawler  │     │ 3 Crawlers  │     │ 3 Crawlers  │
    │    RAID6    │     │    5.5TB    │     │    29TB     │
    └─────────────┘     └─────────────┘     └─────────────┘
           │                   │
    ┌──────▼──────┐     ┌──────▼──────┐
    │    Hoopa    │     │  Silvally   │
    │  1 Crawler  │     │  1 Crawler  │
    └─────────────┘     └─────────────┘

- Elasticsearch: http://192.168.1.122:9200
- Total Storage Indexed: ~18.5TB
- Total Documents: 3.4M+ files
- Active Crawlers: 9

Search by filename:
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"match": {"file.filename": "document.pdf"}},
  "size": 20
}'
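
When the same search is run repeatedly, the request body can be generated by a small shell function. The following is a minimal sketch, assuming bash and curl. `ES_URL`, `filename_query`, and `search_filename` are hypothetical helper names, not part of Elasticsearch or FSCrawler, and the filename is assumed to contain no double quotes or backslashes because no JSON escaping is done.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; ES_URL defaults to the endpoint used in this document.
ES_URL="${ES_URL:-http://192.168.1.122:9200}"

# Print a match query on file.filename.
# Assumes $1 contains no double quotes or backslashes (no JSON escaping here).
# $2 is the maximum number of hits to return (default 20).
filename_query() {
  local name="$1" size="${2:-20}"
  printf '{"query":{"match":{"file.filename":"%s"}},"size":%d}\n' "$name" "$size"
}

# Send the query to Elasticsearch (requires network access to ES_URL).
search_filename() {
  curl -s "$ES_URL/_search?pretty" \
    -H 'Content-Type: application/json' \
    -d "$(filename_query "$@")"
}
```

With these defined, `search_filename document.pdf` would issue the same request as the command above, and `search_filename document.pdf 5` would limit the result to five hits.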

Search by path (any file whose full path contains "Legal"):
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"wildcard": {"path.real": "*Legal*"}}
}'

Find all PDF files:
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"wildcard": {"file.filename": "*.pdf"}}
}'

Find files of 1 GiB (1073741824 bytes) or larger, largest first:
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"range": {"file.filesize": {"gte": 1073741824}}},
  "sort": [{"file.filesize": {"order": "desc"}}]
}'
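
The threshold in the size query is a byte count, so other limits have to be converted first. The following is a minimal sketch of that arithmetic, assuming bash. `gib_to_bytes` and `large_files_query` are hypothetical helper names introduced only for illustration.

```shell
#!/usr/bin/env bash

# Convert a whole number of GiB to bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print a range query for files of at least $1 GiB, largest first.
large_files_query() {
  printf '{"query":{"range":{"file.filesize":{"gte":%s}}},"sort":[{"file.filesize":{"order":"desc"}}]}\n' \
    "$(gib_to_bytes "$1")"
}

# Usage (requires network access to the cluster):
#   curl -s "http://192.168.1.122:9200/_search?pretty" \
#     -H 'Content-Type: application/json' -d "$(large_files_query 5)"
```

`gib_to_bytes 1` prints 1073741824, the value used in the query above.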

Find files modified in the last 7 days, newest first:
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"range": {"file.last_modified": {"gte": "now-7d"}}},
  "sort": [{"file.last_modified": {"order": "desc"}}]
}'

Find candidate duplicates (file sizes shared by two or more files):
curl -s "http://192.168.1.122:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "aggs": {
    "duplicate_sizes": {
      "terms": {"field": "file.filesize", "min_doc_count": 2, "size": 100}
    }
  }
}'
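
The aggregation returns only the sizes that occur more than once, not the files themselves. A possible follow-up is to list every file with one of those sizes. The following is a minimal sketch, assuming bash. `same_size_query` is a hypothetical helper name, and the limit of 100 hits is an arbitrary choice.

```shell
#!/usr/bin/env bash

# Print a query that lists every file whose size is exactly $1 bytes,
# returning only the filename and full path of each hit.
same_size_query() {
  printf '{"query":{"term":{"file.filesize":%d}},"_source":["file.filename","path.real"],"size":100}\n' "$1"
}

# Usage (requires network access to the cluster):
#   curl -s "http://192.168.1.122:9200/_search?pretty" \
#     -H 'Content-Type: application/json' -d "$(same_size_query 1048576)"
```

Equal size only makes two files candidates. Confirming a true duplicate still requires comparing content, for example with `sha256sum` on the node that holds the files.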

Search only the indices whose names end in -storage:
curl -s "http://192.168.1.122:9200/*-storage/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {"match": {"file.filename": "your-search-term"}}
}'

Giratina (1 crawler):

| Crawler | Path | Index | Documents |
|---|---|---|---|
| raid6-storage | /mnt/raid6 | raid6-storage | ~265 files |

Talon (3 crawlers):

| Crawler | Path | Capacity (used) | Documents |
|---|---|---|---|
| talon-nvme-storage | /mnt/nvme-storage | 931GB (88%) | 218K+ |
| talon-pmc-data | /mnt/pmc_data | 1.9TB (86%) | 576K+ |
| talon-t9 | /mnt/t9 | 3.7TB (100%) | 2.3M+ |

Victini (3 crawlers):

| Crawler | Path | Capacity (used) | Documents |
|---|---|---|---|
| victini-storage | /mnt/storage | 22TB (8.2TB) | 253K+ |
| victini-ext4-drive | /mnt/storage/ext4_drive | 3.6TB (2.3TB) | Growing |
| victini-new-volume | /mnt/storage/new_volume | 3.7TB (2.4TB) | Growing |

Hoopa (1 crawler):

| Crawler | Path | Capacity (used) | Documents |
|---|---|---|---|
| hoopa-storage | /mnt/network_transfer | 393GB (90GB) | 750+ |

Silvally (1 crawler):

| Crawler | Path | Capacity (used) | Documents |
|---|---|---|---|
| silvally-storage | /mnt/raid6 | 832GB | 3 folders |

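List all indices:
curl "http://192.168.1.122:9200/_cat/indices?v"

Show document count and store size for each storage index, sorted by name:
curl "http://192.168.1.122:9200/_cat/indices/*-storage*?v&h=index,docs.count,store.size&s=index"

Check cluster health:
curl "http://192.168.1.122:9200/_cluster/health?pretty"

Count the documents in a single index:
curl "http://192.168.1.122:9200/talon-t9/_count?pretty"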
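
The health endpoint can also drive a simple scripted check. The following is a minimal sketch, assuming bash and curl. `ES_URL`, `status_to_exit`, and `check_cluster` are hypothetical names, and the exit-code mapping is an arbitrary choice for illustration. It queries `_cat/health?h=status`, which returns only the status column.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; ES_URL defaults to the endpoint used in this document.
ES_URL="${ES_URL:-http://192.168.1.122:9200}"

# Map a cluster status string to an exit code:
# 0 for green, 1 for yellow, 2 for anything else (red, empty, unreachable).
status_to_exit() {
  case "$1" in
    green)  return 0 ;;
    yellow) return 1 ;;
    *)      return 2 ;;
  esac
}

# Fetch the status column only and report it (requires network access).
check_cluster() {
  local status
  status="$(curl -s "$ES_URL/_cat/health?h=status" | tr -d '[:space:]')"
  echo "cluster status: ${status:-unreachable}"
  status_to_exit "$status"
}
```

A scheduled job could then run something like `check_cluster || echo "Elasticsearch needs attention"`.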
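
Check the status of all crawler services on a node (replace USER@HOST with that node's SSH login):
ssh USER@HOST "systemctl status 'fscrawler*'"

Restart one crawler (replace NAME with the crawler name):
ssh USER@HOST "systemctl restart fscrawler-NAME"

Follow one crawler's logs:
ssh USER@HOST "journalctl -u fscrawler-NAME -f"
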
Each indexed file has this metadata:
{
  "file": {
    "filename": "example.pdf",
    "extension": "pdf",
    "filesize": 1048576,
    "indexing_date": "2025-12-05T08:00:00.000Z",
    "last_modified": "2025-12-01T10:30:00.000Z"
  },
  "path": {
    "real": "/mnt/storage/expansion/Legal/example.pdf",
    "root": "/mnt/storage",
    "virtual": "/expansion/Legal/example.pdf"
  },
  "meta": {
    "title": "Example Document",
    "author": "John Doe"
  }
}
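
Because every document carries `file.extension`, a terms aggregation can summarize which kinds of files are indexed. The following is a minimal sketch, assuming bash and assuming `file.extension` is mapped as a keyword field in this deployment, which the metadata above does not confirm. `extension_counts_query` is a hypothetical helper name.

```shell
#!/usr/bin/env bash

# Print an aggregation-only query that counts documents per file extension.
# $1 is the number of extensions to return (default 20).
# Assumes file.extension is a keyword field (an assumption about the mapping).
extension_counts_query() {
  printf '{"size":0,"aggs":{"by_extension":{"terms":{"field":"file.extension","size":%d}}}}\n' "${1:-20}"
}

# Usage (requires network access to the cluster):
#   curl -s "http://192.168.1.122:9200/_search?pretty" \
#     -H 'Content-Type: application/json' -d "$(extension_counts_query 10)"
```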

To add a new crawler, create its settings directory and settings file:
mkdir -p /root/.fscrawler/new-crawler-name
cat > /root/.fscrawler/new-crawler-name/_settings.yaml << 'EOF'
---