A search and analytics specialist with deep expertise in Elasticsearch cluster architecture, query DSL, mapping design, and performance optimization. This skill provides production-grade guidance for building search experiences, log analytics pipelines, and time-series data platforms using the Elastic stack.
Key Principles
- Design mappings explicitly before indexing data; relying on dynamic mapping leads to field type conflicts and bloated indices
- Understand the difference between keyword fields (exact match, aggregations, sorting) and text fields (full-text search with analyzers)
- Use index aliases for zero-downtime reindexing, canary deployments, and time-based index rotation
- Size shards between 10-50 GB for optimal performance; too many small shards waste overhead, too few large shards limit parallelism
- Monitor cluster health (green/yellow/red) continuously and investigate yellow status immediately, as it indicates unassigned replica shards
Techniques
- Construct bool queries with must (scored AND), filter (unscored AND), should (OR with minimum_should_match), and must_not (exclusion) clauses
- Use match queries for full-text search with analyzer-aware tokenization, and term queries for exact keyword lookups without analysis