Execute ES|QL (Elasticsearch Query Language) queries. Use when the user wants to query Elasticsearch data, analyze logs, aggregate metrics, explore data, or create charts and dashboards from ES|QL results.
Execute ES|QL queries against Elasticsearch.
ES|QL (Elasticsearch Query Language) is a piped query language for Elasticsearch. It is NOT the same as SQL, EQL (Event Query Language), KQL (Kibana Query Language), or the Query DSL.
ES|QL uses pipes (|) to chain commands:
FROM index | WHERE condition | STATS aggregation BY field | SORT field | LIMIT n
Prerequisite: ES|QL requires `_source` to be enabled on queried indices. Indices with `_source` disabled (e.g., `"_source": { "enabled": false }`) will cause ES|QL queries to fail.

Version Compatibility: ES|QL was introduced in 8.11 (tech preview) and became GA in 8.14. Features like `LOOKUP JOIN` (8.18+), `MATCH` (8.17+), and `INLINE STATS` (9.2+) were added in later versions. On pre-8.18 clusters, use `ENRICH` as a fallback for `LOOKUP JOIN` (see generation tips). `INLINE STATS` and counter-field `RATE()` are not available before 9.2. Check for feature availability by version.

Cluster Detection: Use the GET / response to determine the cluster type and version:
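On pre-8.18 clusters, an `ENRICH`-based fallback might look like the sketch below. The policy name `hosts-policy` and the enriched fields `owner` and `env` are hypothetical — an enrich policy with that name must already exist and be executed on the cluster:

```
// Fallback for LOOKUP JOIN on pre-8.18: enrich via a pre-built policy
FROM logs-*
| ENRICH hosts-policy ON host.name WITH owner, env
| STATS count = COUNT(*) BY owner
| SORT count DESC
```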
- `build_flavor: "serverless"` — Elastic Cloud Serverless. `version.number` tracks the stack line under active development (next minor from main), so clients that only semver-compare may treat Serverless as "latest." Do not use `version.number` to gate features: if `build_flavor` is `"serverless"`, assume all GA and preview ES|QL features are available.
- `build_flavor: "default"` — Self-managed or Elastic Cloud Hosted. Use `version.number` for feature availability.
- `version.number` like `9.4.0-SNAPSHOT` — Strip the `-SNAPSHOT` suffix and use the major.minor for version checks. Snapshot builds include all features from that version plus potentially unreleased features from development — if a query fails with an unknown function/command, it may simply not have landed yet. Elastic employees commonly use snapshot builds for testing.

See Environment Setup for full connection configuration options (Elastic Cloud, direct URL, basic auth, local development).
Run `node scripts/esql.js test` to verify the connection. If the test fails, refer the user to the environment setup guide, then stop. Do not explore further until the connection test succeeds.
node scripts/esql.js indices # List all indices
node scripts/esql.js indices "logs-*" # List matching indices
node scripts/esql.js schema "logs-2024.01.01" # Get field mappings for an index
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY host.name | SORT count DESC | LIMIT 5"
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY component | SORT count DESC" --tsv
TSV Output Options:
- `--tsv` or `-t`: Output as tab-separated values (clean, no decorations)
- `--no-header`: Omit the header row

node scripts/esql.js test
Detect deployment type: Always run node scripts/esql.js test first. This detects whether the deployment is a
Serverless project (all features available) or a versioned cluster (features depend on version). The build_flavor
field from GET / is the authoritative signal — if it equals "serverless", ignore the reported version number and
use all ES|QL features freely.
Discover schema (required — never guess index or field names):
node scripts/esql.js indices "pattern*"
node scripts/esql.js schema "index-name"
Always run schema discovery before generating queries. Index names and field names vary across deployments and cannot
be reliably guessed. Even common-sounding data (e.g., "logs") may live in indices named logs-test, logs-app-*, or
application_logs. Field names may use ECS dotted notation (source.ip, service.name) or flat custom names — the
only way to know is to check.
Prefer simplicity: Query a single index unless the user explicitly asks for data across multiple sources. Do not
combine indices with different schemas using COALESCE unless specifically requested — pick the single most relevant
index for the question. When multiple indices contain similar data, prefer the one with the most complete schema for
the task at hand.
The schema command reports the index mode. If it shows Index mode: time_series, the output includes the data
stream name and copy-pasteable TS syntax — use TS <data-stream> (not FROM), TBUCKET(interval) (not
DATE_TRUNC), and wrap counter fields with SUM(RATE(...)). Read the full TS section in
Generation Tips before writing any time series query. You can also check the index
mode directly via the Elasticsearch index settings API:
curl -s "$ELASTICSEARCH_URL/<index-name>/_settings/index.mode" -H "Authorization: ApiKey $ELASTICSEARCH_API_KEY"
Choose the right ES|QL feature for the task: Before writing queries, match the user's intent to the most appropriate ES|QL feature. Prefer a single advanced query over multiple basic ones.
- Pattern discovery in unstructured text: `CATEGORIZE(field)`
- Spike, dip, or trend-shift detection: `CHANGE_POINT value ON key`
- Time-bucketed aggregation: `STATS ... BY BUCKET(@timestamp, interval)`, or `TS` for TSDB
- Full-text search: `MATCH` (default), `QSTR` (advanced boolean), `KQL` (Kibana migration). For content/document relevance search, follow the ES|QL Search Strategy
- General aggregation: `STATS` with aggregation functions

Read the references before generating queries:
Generate the query following ES|QL syntax. Prefer the simplest query that answers the question — do not add
extra indices, fields, or transformations unless the user asks for them. Only include fields in KEEP that directly
answer the question. Do not add extra filter conditions beyond what the user specified (e.g., don't add
OR level == "ERROR" when the user just said "errors").
- `FROM index-pattern` (or `TS index-pattern` for time series indices)
- `WHERE` for filtering (use `TRANGE` for time ranges on 9.3+)
- `EVAL` for computed fields
- `STATS ... BY` for aggregations
- `TS` with `SUM(RATE(...))` for counters, `AVG(...)` for gauges, and `TBUCKET(interval)` for time bucketing — see the TS section in Generation Tips for the three critical syntax rules
- `CHANGE_POINT` after time-bucketed aggregation
- `SORT` and `LIMIT` as needed

Execute with TSV flag:
node scripts/esql.js raw "FROM index | STATS count = COUNT(*) BY field" --tsv
Version availability: This section omits version annotations for readability. Check ES|QL Version History for feature availability by Elasticsearch version.
FROM index-pattern
| WHERE condition
| EVAL new_field = expression
| STATS aggregation BY grouping
| SORT field DESC
| LIMIT n
Filter and limit:
FROM logs-*
| WHERE @timestamp > NOW() - 24 hours AND level == "error"
| SORT @timestamp DESC
| LIMIT 100
Aggregate by time:
FROM metrics-*
| WHERE @timestamp > NOW() - 7 days
| STATS avg_cpu = AVG(cpu.percent) BY bucket = DATE_TRUNC(1 hour, @timestamp)
| SORT bucket DESC
Top N with count:
FROM web-logs
| STATS count = COUNT(*) BY response.status_code
| SORT count DESC
| LIMIT 10
Text search (8.17+): Use MATCH as the default for full-text search instead of LIKE/RLIKE — it is significantly
faster and supports relevance scoring. MATCH on a text field is usually sufficient on its own — do not add redundant
keyword equality filters (e.g., category == "X") alongside MATCH unless the user explicitly requests filtering. Use
QSTR only when you need advanced boolean logic, wildcards, or multi-field searches in a single expression. The first
argument to MATCH must be one real field name — not a string listing several fields (e.g. "title,content") and
not multiple field arguments; combine fields with MATCH(a, "q") OR MATCH(b, "q"). KQL is available from 8.18/9.0+.
For content/document search use cases, follow the ES|QL Search Strategy. See
ES|QL Search Reference for the full function guide.
FROM documents METADATA _score
| WHERE MATCH(content, "search terms")
| SORT _score DESC
| LIMIT 20
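To search several fields, combine per-field MATCH calls with OR as described above. A sketch — the field names `title` and `content` are illustrative, not assumed to exist:

```
FROM documents METADATA _score
| WHERE MATCH(title, "search terms") OR MATCH(content, "search terms")
| SORT _score DESC
| LIMIT 20
```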
String extraction: Use DISSECT for structured delimiter-based patterns (preferred — produces named fields) and
GROK for regex-based extraction. For simple cases: SUBSTRING(s, start, len) for fixed-position extraction,
SPLIT(s, delim) to split into a multivalue, LOCATE(s, substr) to find a substring's position (the string comes
first, then the substring). SPLIT returns a multivalue — use MV_FIRST, MV_LAST, or MV_SLICE to pick elements.
INSTR and STRPOS do not exist — use LOCATE. REGEXP_EXTRACT does not exist — use GROK.
// Extract domain from email using DISSECT (preferred — produces named fields)
FROM customers
| DISSECT email "%{local}@%{domain}"
| STATS count = COUNT(*) BY domain
// Alternative: extract domain from email using SPLIT
FROM customers
| EVAL domain = MV_LAST(SPLIT(email, "@"))
| STATS count = COUNT(*) BY domain
// Parse HTTP log lines
FROM logs-*
| DISSECT message "%{method} %{path} %{status_text}"
| KEEP @timestamp, method, path, status_text
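For fixed-position extraction, LOCATE and SUBSTRING compose as in this sketch, which assumes messages shaped like `component: detail` (the `message` field layout is an assumption):

```
// Take everything before the first colon as the component name
FROM logs-*
| EVAL pos = LOCATE(message, ":")
| WHERE pos > 0
| EVAL component = SUBSTRING(message, 1, pos - 1)
| STATS count = COUNT(*) BY component
```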
Log categorization (Platinum license): Use CATEGORIZE to auto-cluster log messages into pattern groups. Prefer
this over running multiple STATS ... BY field queries when exploring or finding patterns in unstructured text.
FROM logs-*
| WHERE @timestamp > NOW() - 24 hours
| STATS count = COUNT(*) BY category = CATEGORIZE(message)
| SORT count DESC
| LIMIT 20
Change point detection (Platinum license): Use CHANGE_POINT to detect spikes, dips, and trend shifts in a metric
series. Prefer this over manual inspection of time-bucketed counts.
FROM logs-*
| STATS c = COUNT(*) BY t = BUCKET(@timestamp, 30 seconds)
| SORT t
| CHANGE_POINT c ON t
| WHERE type IS NOT NULL
Time series metrics: With TS, use TRANGE for time filtering (9.3+) or omit it entirely — do not add a
redundant WHERE @timestamp > NOW() - ... alongside TBUCKET. The TBUCKET duration defines the aggregation window.
// Counter metric: SUM(RATE(...)) with TBUCKET(duration)
TS metrics-tsds
| WHERE TRANGE(1 hour)
| STATS SUM(RATE(requests)) BY TBUCKET(1 hour), host
// Gauge metric: AVG(...) — no RATE needed
TS metrics-tsds
| STATS avg_cpu = AVG(cpu) BY service.name, bucket = TBUCKET(5 minutes)
| SORT bucket
Data enrichment with LOOKUP JOIN: The basic ON clause matches fields by name in both indices
(LOOKUP JOIN idx ON field_name). When the join key has a different name in the source, use RENAME first to align
names. 9.2+ tech preview also supports expression predicates (ON expr == expr); see
ES|QL Complete Reference for details. After LOOKUP JOIN, lookup columns are available
by their original field names — do not table-qualify them (e.g., write threat_level, not
threat_intel.threat_level). Ordering tip: when the question asks for top-N results, SORT and LIMIT before
LOOKUP JOIN to reduce enrichment cost. For general listings or full enrichment, place LOOKUP JOIN right after
FROM/WHERE.
// Field name mismatch — RENAME before joining
FROM support_tickets
| RENAME product AS product_name
| LOOKUP JOIN knowledge_base ON product_name
// Aggregate, limit, THEN enrich (top-N only)
FROM orders
| STATS total_spent = SUM(total) BY customer_id
| SORT total_spent DESC
| LIMIT 3
| LOOKUP JOIN customers_lookup ON customer_id
| KEEP name, customer_id, total_spent
// Multi-field join (9.2+)
FROM application_logs
| LOOKUP JOIN service_registry ON service_name, environment
| KEEP service_name, environment, owner_team
Multivalue field filtering: Use MV_CONTAINS to check if a multivalue field contains a specific value. Use
MV_COUNT to count values.
// Filter by multivalue membership
FROM employees
| WHERE MV_CONTAINS(languages, "Python")
// Find entries matching multiple values
FROM employees
| WHERE MV_CONTAINS(languages, "Java") AND MV_CONTAINS(languages, "Python")
// Count multivalue entries
FROM employees
| EVAL num_languages = MV_COUNT(languages)
| SORT num_languages DESC
Change point detection (alternate example): Use when the user asks about spikes, dips, or anomalies. Requires
time-bucketed aggregation, SORT, then CHANGE_POINT.
FROM logs-*
| STATS error_count = COUNT(*) BY bucket = DATE_TRUNC(1 hour, @timestamp)
| SORT bucket
| CHANGE_POINT error_count ON bucket AS type, pvalue
For complete ES|QL syntax including all commands, functions, and operators, read:
When query execution fails, the script returns:
Common issues:
- Unknown index or field names: run `get_schema` and `list_indices` before writing a query. Never guess field or index names — they vary across deployments.
- Function names: `STD_DEV()` not `STDDEV()`, `MEDIAN_ABSOLUTE_DEVIATION()` not `MAD()`. Use `CONCAT()` for strings, not `+`. Use `CASE(cond, val, ...)` not `CASE WHEN...THEN...END`.
- Date handling: `DATE_EXTRACT` uses ES|QL part names: `"hour_of_day"` not `"hour"`, `"day_of_month"` not `"day"`, `"month_of_year"` not `"month"`. Use `DATE_DIFF("day", start, end)` for date arithmetic, not subtraction.

# Schema discovery
node scripts/esql.js test
node scripts/esql.js indices "logs-*"
node scripts/esql.js schema "logs-2024.01.01"
# Execute queries
node scripts/esql.js raw "FROM logs-* | STATS count = COUNT(*) BY host.name | LIMIT 10"
node scripts/esql.js raw "FROM metrics-* | STATS avg = AVG(cpu.percent) BY hour = DATE_TRUNC(1 hour, @timestamp)" --tsv
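The function-name gotchas above (CASE instead of CASE WHEN, ES|QL date-part names) can be illustrated with a sketch — the `level` keyword field on `logs-*` is an assumption:

```
// CASE(cond, val, ..., default) and ES|QL date-part names — not SQL syntax
FROM logs-*
| EVAL hour = DATE_EXTRACT("hour_of_day", @timestamp)
| EVAL is_error = CASE(level == "error", 1, 0)
| STATS errors = SUM(is_error), total = COUNT(*) BY hour
| SORT hour
```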