Name: Elasticsearch 8.x Patterns
Author: srijan-at-qwertystars

PUT /products
{
  "settings": { "number_of_shards": 3, "number_of_replicas": 1, "refresh_interval": "5s" },
  "mappings": { "properties": {
    "name":        { "type": "text", "analyzer": "standard" },
    "sku":         { "type": "keyword" },
    "price":       { "type": "scaled_float", "scaling_factor": 100 },
    "description": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 }}},
    "created_at":  { "type": "date", "format": "strict_date_optional_time||epoch_millis" },
    "location":    { "type": "geo_point" },
    "tags":        { "type": "keyword" },
    "metadata":    { "type": "object" },
    "variants":    { "type": "nested", "properties": { "color": { "type": "keyword" }, "size": { "type": "keyword" }}},
    "embedding":   { "type": "dense_vector", "dims": 768, "index": true, "similarity": "cosine" }
  }}
}

POST /_aliases
{ "actions": [
  { "add": { "index": "products-v2", "alias": "products" }},
  { "remove": { "index": "products-v1", "alias": "products" }}
]}
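The swap above is atomic because both actions travel in a single `_aliases` request. A minimal Python sketch of building that body follows; `build_alias_swap` is a hypothetical helper for illustration, not part of any client library.

```python
def build_alias_swap(alias: str, old_index: str, new_index: str) -> dict:
    """Build a POST /_aliases body that moves `alias` from old_index to new_index."""
    if old_index == new_index:
        raise ValueError("old_index and new_index must differ")
    return {
        "actions": [
            # Both actions are applied together, so readers never see a missing alias.
            {"add": {"index": new_index, "alias": alias}},
            {"remove": {"index": old_index, "alias": alias}},
        ]
    }


body = build_alias_swap("products", "products-v1", "products-v2")
```

With the 8.x Python client, the actions list could likely be passed as `es.indices.update_aliases(actions=body["actions"])`.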

PUT _component_template/base_settings
{ "template": { "settings": { "number_of_shards": 2, "number_of_replicas": 1 }}}

PUT _index_template/logs_template
{ "index_patterns": ["logs-*"], "data_stream": {}, "composed_of": ["base_settings"], "priority": 200 }

PUT _ilm/policy/logs_policy
{ "policy": { "phases": {
  "hot":    { "actions": { "rollover": { "max_size": "50gb", "max_age": "7d" }}},
  "warm":   { "min_age": "30d", "actions": { "shrink": { "number_of_shards": 1 }, "forcemerge": { "max_num_segments": 1 }}},
  "cold":   { "min_age": "90d", "actions": { "searchable_snapshot": { "snapshot_repository": "my_repo" }}},
  "delete": { "min_age": "365d", "actions": { "delete": {} }}
}}}

// Full-text (analyzed):
{ "query": { "match": { "description": { "query": "wireless bluetooth", "operator": "and" }}}}
// Exact (keyword fields only, NEVER on text):
{ "query": { "term": { "sku": { "value": "ABC-123" }}}}

{ "query": { "bool": {
  "must":     [{ "match": { "name": "laptop" }}],
  "filter":   [{ "range": { "price": { "gte": 500, "lte": 2000 }}}, { "term": { "in_stock": true }}],
  "should":   [{ "match": { "description": "gaming" }}],
  "must_not": [{ "term": { "brand": "excluded_brand" }}],
  "minimum_should_match": 1
}}}
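In the bool query above, `must` and `should` clauses contribute to the score, while `filter` and `must_not` only include or exclude documents. A small sketch of assembling such a body in Python; `build_bool_query` and its parameter names are assumptions made for this example.

```python
def build_bool_query(must=(), filters=(), should=(), must_not=(),
                     minimum_should_match=None) -> dict:
    """Assemble a bool query body, omitting clause types that are empty."""
    clauses = {
        "must": list(must),          # scored, required
        "filter": list(filters),     # not scored, required, cacheable
        "should": list(should),      # scored, optional unless minimum_should_match says otherwise
        "must_not": list(must_not),  # not scored, excluded
    }
    bool_body = {name: items for name, items in clauses.items() if items}
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    must=[{"match": {"name": "laptop"}}],
    filters=[{"range": {"price": {"gte": 500, "lte": 2000}}}],
    should=[{"match": {"description": "gaming"}}],
    minimum_should_match=1,
)
```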

{ "query": { "multi_match": { "query": "search terms", "fields": ["name^3", "description"], "type": "best_fields" }}}

{ "query": { "nested": { "path": "variants", "query": { "bool": { "must": [
  { "term": { "variants.color": "red" }}, { "term": { "variants.size": "L" }}
]}}}}}
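Why the `nested` type matters: arrays of plain objects are flattened into per-field value lists, so the link between a variant's color and its size is lost. The sketch below imitates that flattening in plain Python. It is a simplified model of the indexing behavior, not Elasticsearch code, and the variant data is made up.

```python
def flatten_object_array(items):
    """Mimic how an array of plain objects is flattened into per-field value lists."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(key, []).append(value)
    return flat


variants = [{"color": "red", "size": "S"}, {"color": "blue", "size": "L"}]

# Object-style matching: each condition is checked against the flattened lists.
flat = flatten_object_array(variants)
flattened_match = "red" in flat["color"] and "L" in flat["size"]

# Nested-style matching: both conditions must hold within the same variant.
nested_match = any(v["color"] == "red" and v["size"] == "L" for v in variants)
```

Here `flattened_match` is `True` even though no single variant is both red and size L, while `nested_match` is `False`.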

{ "query": { "function_score": {
  "query": { "match": { "name": "shoes" }},
  "functions": [
    { "field_value_factor": { "field": "popularity", "modifier": "log1p", "factor": 2 }},
    { "gauss": { "location": { "origin": "40.7,-74.0", "scale": "5km" }}}
  ],
  "boost_mode": "multiply", "score_mode": "sum"
}}}
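A rough sketch of how the function_score above combines its parts, assuming the documented formulas: `log1p` computes log10(1 + factor * value), and the gauss decay uses the default `decay` of 0.5 with no offset. The distance is passed in directly in kilometres rather than derived from coordinates, and all function names are hypothetical.

```python
import math


def field_value_factor_log1p(value: float, factor: float) -> float:
    """log1p modifier: common logarithm of (1 + factor * value)."""
    return math.log10(1 + factor * value)


def gauss_decay(distance: float, scale: float, decay: float = 0.5, offset: float = 0.0) -> float:
    """Gaussian decay: 1.0 at the origin, `decay` at a distance of `scale`."""
    sigma_sq = -(scale ** 2) / (2 * math.log(decay))
    d = max(0.0, distance - offset)
    return math.exp(-(d ** 2) / (2 * sigma_sq))


def combined_score(query_score: float, popularity: float, distance_km: float) -> float:
    """score_mode=sum adds the function values; boost_mode=multiply scales the query score."""
    functions = field_value_factor_log1p(popularity, 2) + gauss_decay(distance_km, 5.0)
    return query_score * functions
```

For example, a document 5 km from the origin gets a decay value of 0.5 from the gauss function before it is summed with the popularity factor.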

PUT /my_index
{ "settings": { "analysis": {
  "analyzer": { "my_custom": {
    "type": "custom", "tokenizer": "standard",
    "char_filter": ["html_strip"],
    "filter": ["lowercase", "asciifolding", "my_synonym", "my_stop"]
  }},
  "filter": {
    "my_synonym": { "type": "synonym", "synonyms": ["laptop,notebook", "phone,mobile"] },
    "my_stop": { "type": "stop", "stopwords": "_english_" }
  }
}}}

// Test analyzer:
POST /_analyze
{ "analyzer": "standard", "text": "The Quick Brown Fox" }
// Output tokens: ["the", "quick", "brown", "fox"]
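To get a feel for the token output above without a cluster, the following is a deliberately crude approximation of the standard analyzer: split on non-word characters, then lowercase. The real tokenizer follows Unicode text segmentation rules and will differ on apostrophes, punctuation inside words, and many scripts, so treat this only as an illustration.

```python
import re


def approx_standard_analyze(text: str) -> list:
    """Very rough stand-in for the standard analyzer: word split + lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


tokens = approx_standard_analyze("The Quick Brown Fox")
```

Note that "the" survives, consistent with the output shown above: the standard analyzer does not remove stop words unless configured to.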

// Mapping:
{ "properties": { "title": { "type": "search_as_you_type" }}}
// Query:
{ "query": { "multi_match": { "query": "elast", "type": "bool_prefix",
  "fields": ["title", "title._2gram", "title._3gram"] }}}

{ "size": 0, "aggs": { "by_brand": {
  "terms": { "field": "brand", "size": 20 },
  "aggs": { "avg_price": { "avg": { "field": "price" }}}
}}}
// Response buckets: [{ "key": "Apple", "doc_count": 150, "avg_price": { "value": 1299.5 }}, ...]

{ "size": 0, "aggs": { "over_time": {
  "date_histogram": { "field": "created_at", "calendar_interval": "month" },
  "aggs": { "revenue": { "sum": { "field": "price" }}}
}}}

{ "size": 0, "aggs": { "my_composite": { "composite": { "size": 100, "sources": [
  { "brand": { "terms": { "field": "brand" }}},
  { "category": { "terms": { "field": "category" }}}
]}}}}
// Next page: copy "after_key" from the previous response into "after": { "brand": "...", "category": "..." }
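The paging loop for a composite aggregation can be sketched as below. It assumes a client object whose `search(index=..., size=..., aggs=...)` method returns a dict-like response, as the 8.x Python client does; `iter_composite_buckets` is a hypothetical helper name.

```python
def iter_composite_buckets(client, index, sources, page_size=100):
    """Yield every bucket of a composite aggregation, following after_key page by page."""
    after = None
    while True:
        composite = {"size": page_size, "sources": sources}
        if after is not None:
            composite["after"] = after  # resume where the previous page ended
        resp = client.search(
            index=index, size=0,
            aggs={"my_composite": {"composite": composite}},
        )
        agg = resp["aggregations"]["my_composite"]
        buckets = agg["buckets"]
        yield from buckets
        after = agg.get("after_key")
        # Stop when the page is empty or no after_key is returned.
        if after is None or not buckets:
            return
```

A caller would iterate it like any generator, for example `for bucket in iter_composite_buckets(es, "products", sources): ...`.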

// Nested:
{ "aggs": { "variants": { "nested": { "path": "variants" },
  "aggs": { "colors": { "terms": { "field": "variants.color" }}}}}}
// Pipeline:
{ "size": 0, "aggs": {
  "monthly": { "date_histogram": { "field": "date", "calendar_interval": "month" },
    "aggs": { "total": { "sum": { "field": "amount" }}}},
  "max_monthly": { "max_bucket": { "buckets_path": "monthly>total" }}
}}

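| Method | Use | Limit |
| --- | --- | --- |
| `from`/`size` | UI paging, small sets | `from` + `size` capped at 10,000 by default (`index.max_result_window`) |
| `search_after` | Deep pagination, stateless | Needs sort values |
| PIT + `search_after` | Consistent deep pagination | Preferred in 8.x |
| Scroll | Batch export | Avoid for user-facing |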

// Open PIT, search with sort, pass search_after for next page, close PIT:
POST /products/_pit?keep_alive=5m   // returns { "id": "abc..." }
POST /_search
{ "size": 100, "pit": { "id": "abc...", "keep_alive": "5m" },
  "sort": [{ "created_at": "desc" }, { "_shard_doc": "asc" }] }
// Next: add "search_after": [<last_sort_values>]
DELETE /_pit
{ "id": "abc..." }
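The open, page, close sequence above might look like this in Python. The sketch assumes a client exposing `open_point_in_time`, `search`, and `close_point_in_time` with the keyword arguments used by the 8.x Python client; `iter_hits_with_pit` is a hypothetical helper.

```python
def iter_hits_with_pit(client, index, sort, page_size=100, keep_alive="5m"):
    """Yield all hits using a point in time plus search_after, closing the PIT at the end."""
    pit_id = client.open_point_in_time(index=index, keep_alive=keep_alive)["id"]
    search_after = None
    try:
        while True:
            kwargs = {
                "size": page_size,
                "pit": {"id": pit_id, "keep_alive": keep_alive},
                "sort": sort,
            }
            if search_after is not None:
                kwargs["search_after"] = search_after
            resp = client.search(**kwargs)  # no index here: the PIT already pins it
            if "pit_id" in resp:
                pit_id = resp["pit_id"]  # the id may change between requests
            hits = resp["hits"]["hits"]
            if not hits:
                return
            yield from hits
            search_after = hits[-1]["sort"]  # sort values of the last hit on this page
    finally:
        client.close_point_in_time(id=pit_id)
```

The `finally` block is a design choice so the PIT is released even if the consumer stops iterating early or a request fails.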

POST /_bulk
{"index":{"_index":"products","_id":"1"}}
{"name":"Widget","price":9.99}
{"delete":{"_index":"products","_id":"3"}}
{"update":{"_index":"products","_id":"1"}}
{"doc":{"price":8.99}}
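The bulk body is newline-delimited JSON: one JSON object per line, with a newline after the last line as well. A minimal sketch of producing such a payload by hand; `to_ndjson` is a hypothetical helper, and in practice the client's bulk helper handles this.

```python
import json


def to_ndjson(lines) -> str:
    """Serialize objects as newline-delimited JSON, ending with a trailing newline."""
    return "".join(json.dumps(line, separators=(",", ":")) + "\n" for line in lines)


payload = to_ndjson([
    {"index": {"_index": "products", "_id": "1"}},
    {"name": "Widget", "price": 9.99},
    {"delete": {"_index": "products", "_id": "3"}},   # delete has no source line
    {"update": {"_index": "products", "_id": "1"}},
    {"doc": {"price": 8.99}},
])
```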

// Reindex with optional pipeline:
POST /_reindex
{ "source": { "index": "v1" }, "dest": { "index": "v2", "pipeline": "enrich" }}
// Update by query:
POST /products/_update_by_query
{ "query": { "term": { "status": "draft" }}, "script": { "source": "ctx._source.status = 'published'" }}

from elasticsearch import Elasticsearch
es = Elasticsearch("https://localhost:9200", api_key="key", ca_certs="/path/to/http_ca.crt")
es.index(index="products", id="1", document={"name": "Widget", "price": 9.99})
resp = es.search(index="products", query={"match": {"name": "widget"}})

from elasticsearch.helpers import bulk
actions = [{"_index": "products", "_id": i, "_source": {"name": f"Item {i}"}} for i in range(1000)]
success, errors = bulk(es, actions, chunk_size=500, raise_on_error=False)

import fs from 'node:fs';
import { Client } from '@elastic/elasticsearch';
const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: 'key' },
  tls: { ca: fs.readFileSync('/path/to/http_ca.crt') }});
const result = await client.search({ index: 'products', query: { match: { name: 'widget' } } });
// items: array of document objects to index (assumed to be defined by the caller)
const { errors } = await client.bulk({ operations: items.flatMap(d => [{ index: { _index: 'products' } }, d]) });

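// Assumes imports: "log", "strings", and "github.com/elastic/go-elasticsearch/v8" (package elasticsearch).
es, err := elasticsearch.NewClient(elasticsearch.Config{
  Addresses: []string{"https://localhost:9200"}, APIKey: "key",
})
if err != nil { log.Fatal(err) }
res, err := es.Search(es.Search.WithIndex("products"),
  es.Search.WithBody(strings.NewReader(`{"query":{"match":{"name":"widget"}}}`)))
if err != nil { log.Fatal(err) }
defer res.Body.Close() // always close the response body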

// Create API key with scoped permissions:
POST /_security/api_key
{ "name": "backend-key", "expiration": "90d", "role_descriptors": {
  "reader": { "cluster": ["monitor"], "index": [{ "names": ["products*"], "privileges": ["read"] }] }
}}
// Use: Authorization: ApiKey <encoded>
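The `<encoded>` value in the header above is the base64 encoding of `id:api_key`, both of which are returned by the create call (the response also includes it ready-made as `encoded`). A small sketch of building the header; `api_key_header` is a hypothetical helper and the credentials shown are placeholders.

```python
import base64


def api_key_header(key_id: str, api_key: str) -> dict:
    """Build the Authorization header from an API key's id and secret."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


headers = api_key_header("id", "secret")  # placeholder credentials
```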

// Role with field-level + document-level security.
// Only granted fields are readable, so ssn is already hidden; "except" may only list fields covered by "grant" (e.g. "grant": ["*"], "except": ["ssn"]).
POST /_security/role/pii_restricted
{ "indices": [{ "names": ["users*"], "privileges": ["read"],
  "field_security": { "grant": ["name", "email"] },
  "query": { "term": { "department": "engineering" }}
}]}

GET _cluster/health            // green/yellow/red status
GET _cat/indices?v&s=store.size:desc&h=index,health,pri,rep,docs.count,store.size
GET _cat/shards?v&s=store:desc
GET _cat/nodes?v&h=name,heap.percent,ram.percent,cpu,load_1m
GET _cat/thread_pool?v&h=node_name,name,active,queue,rejected
GET _nodes/hot_threads         // high CPU diagnosis
GET _tasks?actions=*search&detailed

PUT /my_index/_settings
{ "index.search.slowlog.threshold.query.warn": "5s",
  "index.search.slowlog.threshold.query.info": "2s",
  "index.indexing.slowlog.threshold.index.warn": "10s" }

// Mapping:
{ "properties": { "embedding": { "type": "dense_vector", "dims": 768, "index": true, "similarity": "cosine" }}}
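With `"similarity": "cosine"` as in the mapping above, the kNN `_score` is documented as (1 + cosine) / 2, which maps the cosine range of -1..1 onto 0..1. A plain-Python sketch of that arithmetic; the function names are made up for illustration and the vectors are toy two-dimensional examples.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_cosine_score(query_vector, doc_vector) -> float:
    """Rescale cosine similarity from [-1, 1] to a [0, 1] score."""
    return (1 + cosine_similarity(query_vector, doc_vector)) / 2


same = knn_cosine_score([1.0, 0.0], [1.0, 0.0])        # identical direction
orthogonal = knn_cosine_score([1.0, 0.0], [0.0, 1.0])  # unrelated direction
opposite = knn_cosine_score([1.0, 0.0], [-1.0, 0.0])   # opposite direction
```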

// kNN search (query_vector is shortened here; it needs one value per dimension, 768 for the mapping above):
POST /my_index/_search
{ "knn": { "field": "embedding", "query_vector": [0.1, 0.2], "k": 10, "num_candidates": 100 },
  "fields": ["title"] }

// Hybrid (kNN + text):
{ "query": { "match": { "content": "machine learning" }},
  "knn": { "field": "embedding", "query_vector": [0.1, 0.2], "k": 10, "num_candidates": 100, "boost": 0.5 }}

// Quantized index for scale (int8 reduces memory ~4x):
{ "properties": { "embedding": { "type": "dense_vector", "dims": 768, "index": true,
  "similarity": "cosine", "index_options": { "type": "int8_hnsw" }}}}

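The "~4x" figure for int8 quantization comes from storing one byte per dimension instead of a four-byte float. A back-of-the-envelope sketch, assuming one million vectors (an arbitrary number chosen for the example) and ignoring HNSW graph links and per-vector overhead; the original float vectors are also still kept on disk.

```python
def vector_bytes(num_vectors: int, dims: int, bytes_per_value: int) -> int:
    """Raw size of the vector values alone, excluding graph and metadata overhead."""
    return num_vectors * dims * bytes_per_value


num_vectors = 1_000_000  # assumed corpus size for illustration
float32_bytes = vector_bytes(num_vectors, 768, 4)  # 4 bytes per float32 value
int8_bytes = vector_bytes(num_vectors, 768, 1)     # 1 byte per quantized value
ratio = float32_bytes / int8_bytes
```

Under these assumptions the raw vector data shrinks from about 3.07 GB to about 0.77 GB.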
FROM logs-*
| WHERE @timestamp >= NOW() - 24 hours AND level == "error"
| STATS error_count = COUNT(*) BY service.name
| SORT error_count DESC
| LIMIT 10

POST /_query
{ "query": "FROM products | WHERE price > 100 | STATS avg_price = AVG(price) BY brand | SORT avg_price DESC" }
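The `_query` endpoint returns results in a columnar shape: a `columns` list describing each column and a `values` list of rows. A small sketch of turning that into dictionaries keyed by column name; `esql_rows` is a hypothetical helper and the sample response values are invented.

```python
def esql_rows(response: dict) -> list:
    """Convert an ES|QL columnar response into a list of {column_name: value} dicts."""
    names = [column["name"] for column in response["columns"]]
    return [dict(zip(names, row)) for row in response["values"]]


# Illustrative response shape; the numbers and brands are made up.
sample = {
    "columns": [
        {"name": "avg_price", "type": "double"},
        {"name": "brand", "type": "keyword"},
    ],
    "values": [[1299.5, "Apple"], [899.0, "Acme"]],
}
rows = esql_rows(sample)
```

Recent 8.x Python clients appear to expose this endpoint as `es.esql.query(query=...)`, whose response body could be passed to the helper.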

| File | Covers |
| --- | --- |
| advanced-patterns.md | ILM policies, data streams, CCS/CCR, snapshot/restore, searchable snapshots, runtime fields, field aliases, ingest pipelines (grok/dissect/enrich), transforms, rollups, async search, PIT API, ES\|QL deep dive, vector search (HNSW tuning, quantization), ELSER semantic search, relevance tuning (function_score, rescoring, LTR) |
| troubleshooting.md | Cluster yellow/red diagnosis, unassigned shards, allocation failures, disk watermarks, mapping explosion, field limits, circuit breaker errors, slow log analysis, indexing bottlenecks, GC pressure, node disconnections, split brain, upgrade issues, reindex failures, analyzer debugging |
| operations-guide.md | Cluster sizing (nodes/shards/heap), capacity planning, rolling upgrades, index template versioning, alias-based zero-downtime reindexing, backup strategies (SLM), monitoring (cluster/node/index stats, cat APIs), Watcher/Kibana alerting, security hardening, audit logging, hot-warm-cold-frozen architecture |

| Script | Purpose | Usage |
| --- | --- | --- |
| es-health-check.sh | Cluster diagnostics: health, nodes, shards, disk, thread pools | `./scripts/es-health-check.sh [ES_URL]` |
| index-management.sh | Create/delete/reindex indices, alias swaps | `./scripts/index-management.sh create myindex --shards 3` |
| es-local.sh | Local ES 8.x + Kibana via Docker, with sample data | `./scripts/es-local.sh start && ./scripts/es-local.sh seed` |

| File | Description |
| --- | --- |
| docker-compose.yml | ES 8.x + Kibana dev environment (optional Logstash) |
| index-template.json | Production index template with mappings, settings, ILM, analyzers |
| ingest-pipeline.json | Log processing pipeline: grok, dissect, date, GeoIP, user-agent |
| search-template.json | Parameterized search with highlighting, aggregations, facets |
| ilm-policy.json | Hot→warm→cold→frozen→delete ILM policy |

Elasticsearch 8.x Patterns

Architecture

Index Management

Create index with mappings and settings

Aliases for zero-downtime reindexing

Index templates and data streams

ILM (Index Lifecycle Management)

Query DSL

Match (full-text) and Term (exact)

Bool query

Multi-match

Nested query (required for nested type fields)

Function score

Full-Text Search and Analyzers

search_as_you_type (autocomplete)

Aggregations

Terms with sub-aggregation

Date histogram

Composite (paginated aggs)

Nested agg and pipeline agg

Pagination

Bulk Operations

Client Libraries

Python (elasticsearch-py)

Node.js (@elastic/elasticsearch)

Go (elastic/go-elasticsearch v8)

Security

Performance Tuning

Observability

Slow log

Vector Search / kNN

ES|QL

ELK Stack

Common Pitfalls

Additional Resources

Reference Guides (references/)

Scripts (scripts/)

Templates (assets/)
