Storage abstractions including segment model, S3/local backends, and caching
The storage layer provides a unified interface for local filesystem and S3 backends with segment-based data organization and optional caching.
pub struct StorageEngine {
pub backend: StorageBackend, // Local or S3
pub cache: Option<SegmentCache>, // For S3 backend
pub data_dir: PathBuf,
}
StorageEngine is created from config; the backend is selected automatically based on whether AWS is configured.
| Backend | When | Cache |
|---|---|---|
| LocalBackend | No AWS config | None needed |
| S3Backend | AWS configured | SegmentCache required |
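The selection rule in the table can be sketched as a single match on whether AWS settings are present. This is only an illustration under stated assumptions: `Config`, `AwsConfig`, `BackendChoice`, and `select_backend` are hypothetical names, and the real config fields are not shown in this document. Only the rule itself (no AWS config means local, AWS config means S3 plus a cache) comes from the table.

```rust
use std::path::PathBuf;

/// Hypothetical AWS settings; the real field names are an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct AwsConfig {
    pub bucket: String,
    pub region: String,
}

/// Hypothetical top-level config. `cache_max_gb` mirrors the setting named
/// elsewhere in this document; the other fields are illustrative.
#[derive(Debug, Clone)]
pub struct Config {
    pub data_dir: PathBuf,
    pub aws: Option<AwsConfig>,
    pub cache_max_gb: u64,
}

/// Stand-in for the backend decision, not the real StorageBackend enum.
#[derive(Debug, PartialEq)]
pub enum BackendChoice {
    /// No AWS config: read and write directly under `data_dir`, no cache.
    Local { data_dir: PathBuf },
    /// AWS configured: S3 is the source of truth and a bounded cache is required.
    S3 {
        bucket: String,
        region: String,
        cache_max_bytes: u64,
    },
}

/// Picks the backend from config, following the table above.
pub fn select_backend(config: &Config) -> BackendChoice {
    match &config.aws {
        Some(aws) => BackendChoice::S3 {
            bucket: aws.bucket.clone(),
            region: aws.region.clone(),
            // Convert GiB to bytes without overflowing on absurd values.
            cache_max_bytes: config.cache_max_gb.saturating_mul(1024 * 1024 * 1024),
        },
        None => BackendChoice::Local {
            data_dir: config.data_dir.clone(),
        },
    }
}
```

Keeping the decision in one pure function makes it easy to test both rows of the table without touching the filesystem or the network.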
Documents are stored in time-partitioned segments:
data/segments/
├── 2026-02-15/
│ ├── documents.dat # MessagePack-encoded documents (mmap'd)
│ ├── vector.idx # HNSW vector index
│ └── metadata.json # Segment metadata
├── 2026-02-16/
│ └── ...
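The layout above can be expressed as a small path helper. This is a rough sketch, assuming that segment IDs are the `YYYY-MM-DD` directory names shown in the tree and that `data_dir` is the directory containing `segments/`. `SegmentPaths`, `segment_paths`, and `is_date_segment_id` are hypothetical names, not part of the crate's documented API.

```rust
use std::path::{Path, PathBuf};

/// Paths of the three files that make up one segment (hypothetical helper type).
pub struct SegmentPaths {
    pub dir: PathBuf,
    pub documents: PathBuf,
    pub vector_index: PathBuf,
    pub metadata: PathBuf,
}

/// Builds the per-segment paths under `<data_dir>/segments/<segment_id>/`.
pub fn segment_paths(data_dir: &Path, segment_id: &str) -> SegmentPaths {
    let dir = data_dir.join("segments").join(segment_id);
    SegmentPaths {
        documents: dir.join("documents.dat"),
        vector_index: dir.join("vector.idx"),
        metadata: dir.join("metadata.json"),
        dir,
    }
}

/// Checks only the shape `DDDD-DD-DD` (digits and dashes).
/// It does not validate that the month or day is in range.
pub fn is_date_segment_id(id: &str) -> bool {
    let b = id.as_bytes();
    b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter()
            .enumerate()
            .all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit())
}
```

A shape check like this could be useful when discovering local segments, so that stray directories under `segments/` are skipped rather than opened.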
// Writing
let writer = SegmentWriter::new(data_dir, segment_id)?;
writer.append(&document)?;
writer.seal()?; // Finalize, build indexes
// Reading
let reader = SegmentReader::open(data_dir, segment_id)?;
let docs = reader.scan(predicate)?; // Predicate push-down
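The append, seal, scan lifecycle can be illustrated with an in-memory stand-in. This is not the real SegmentWriter or SegmentReader: `MemSegment`, `Doc`, and `SegmentError` are hypothetical, nothing is written to disk, and no index is built. The sketch also assumes that a segment rejects appends once sealed and can only be scanned after sealing, which the document does not state explicitly.

```rust
/// Hypothetical document type for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Doc {
    pub id: u64,
    pub body: String,
}

#[derive(Debug, PartialEq)]
pub enum SegmentError {
    /// The segment is already sealed and is now immutable.
    Sealed,
    /// The segment has not been sealed yet, so it is not readable.
    NotSealed,
}

/// In-memory stand-in for a segment: writable until sealed, readable afterwards.
#[derive(Default)]
pub struct MemSegment {
    docs: Vec<Doc>,
    sealed: bool,
}

impl MemSegment {
    /// Appends a document; fails once the segment is sealed.
    pub fn append(&mut self, doc: &Doc) -> Result<(), SegmentError> {
        if self.sealed {
            return Err(SegmentError::Sealed);
        }
        self.docs.push(doc.clone());
        Ok(())
    }

    /// Finalizes the segment. A real implementation would build indexes here.
    pub fn seal(&mut self) -> Result<(), SegmentError> {
        if self.sealed {
            return Err(SegmentError::Sealed);
        }
        self.sealed = true;
        Ok(())
    }

    /// Evaluates the predicate inside the scan, so only matching documents
    /// are returned to the caller.
    pub fn scan<P>(&self, predicate: P) -> Result<Vec<&Doc>, SegmentError>
    where
        P: Fn(&Doc) -> bool,
    {
        if !self.sealed {
            return Err(SegmentError::NotSealed);
        }
        Ok(self.docs.iter().filter(|&d| predicate(d)).collect())
    }
}
```

Passing the predicate into `scan` rather than filtering the returned vector is what lets a real reader skip non-matching documents before materializing them.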
segments/ prefix
cache_max_gb

- crates/storage/src/lib.rs: StorageEngine, discover_local_segments
- crates/storage/src/backend.rs: LocalBackend, S3Backend, StorageBackend enum
- crates/storage/src/cache.rs: SegmentCache with LRU
- crates/segment/src/: SegmentWriter, SegmentReader, mmap operations
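The file list mentions that SegmentCache uses LRU eviction, and `cache_max_gb` suggests a size budget. A size-bounded LRU could look roughly like the sketch below. It is an assumption-laden illustration, not the code in cache.rs: `LruSegmentCache` and its methods are hypothetical, it tracks only segment IDs and byte sizes, and it uses a simple `VecDeque` for recency order (linear-time touch) for clarity rather than speed.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical size-bounded LRU that tracks which segments are cached locally.
pub struct LruSegmentCache {
    max_bytes: u64,
    used_bytes: u64,
    sizes: HashMap<String, u64>,
    /// Front is the least recently used segment, back is the most recent.
    order: VecDeque<String>,
}

impl LruSegmentCache {
    pub fn new(max_bytes: u64) -> Self {
        Self {
            max_bytes,
            used_bytes: 0,
            sizes: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Marks a segment as recently used. Returns true on a cache hit.
    pub fn touch(&mut self, id: &str) -> bool {
        if !self.sizes.contains_key(id) {
            return false;
        }
        self.order.retain(|k| k.as_str() != id);
        self.order.push_back(id.to_string());
        true
    }

    /// Records a segment of `size` bytes, evicting least recently used
    /// segments until it fits. Returns the evicted segment IDs in order.
    pub fn insert(&mut self, id: &str, size: u64) -> Vec<String> {
        // Drop any previous entry for the same segment first.
        if let Some(old) = self.sizes.remove(id) {
            self.used_bytes -= old;
            self.order.retain(|k| k.as_str() != id);
        }
        let mut evicted = Vec::new();
        // A segment larger than the whole budget is simply not cached.
        if size > self.max_bytes {
            return evicted;
        }
        while self.used_bytes.saturating_add(size) > self.max_bytes {
            match self.order.pop_front() {
                Some(victim) => {
                    if let Some(bytes) = self.sizes.remove(&victim) {
                        self.used_bytes -= bytes;
                    }
                    evicted.push(victim);
                }
                None => break,
            }
        }
        self.sizes.insert(id.to_string(), size);
        self.order.push_back(id.to_string());
        self.used_bytes += size;
        evicted
    }

    pub fn used_bytes(&self) -> u64 {
        self.used_bytes
    }
}
```

In this sketch the caller is responsible for deleting the local files of any segment ID returned from `insert`; the cache itself only does the bookkeeping.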