Log, metric, and trace analysis methodology. Use when analyzing logs, investigating errors, querying metrics, or correlating signals across observability backends (Coralogix, Datadog, CloudWatch).
NEVER start by reading raw logs. Always begin with aggregated statistics: total event counts, error rates by service, and time-bucketed trends. Drill into individual raw events only after the aggregates have narrowed the search.
IMPORTANT: Credentials are injected automatically by a proxy layer. Do NOT check for API keys in environment variables - they won't be there. Just use the backend scripts directly; authentication is handled transparently.
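The aggregation-first approach can be sketched as follows. This is a minimal illustration over in-memory records; the field names `service` and `level` are assumptions for the example, not any specific backend's schema:

```python
from collections import Counter

def summarize(logs):
    """Aggregate log events before reading any raw messages."""
    total = len(logs)
    errors = [e for e in logs if e["level"] == "ERROR"]
    by_service = Counter(e["service"] for e in errors)
    return {
        "total": total,
        "error_count": len(errors),
        "error_rate": len(errors) / total if total else 0.0,
        "top_error_services": by_service.most_common(3),
    }

logs = [
    {"service": "api", "level": "ERROR"},
    {"service": "api", "level": "INFO"},
    {"service": "worker", "level": "ERROR"},
    {"service": "api", "level": "ERROR"},
]
print(summarize(logs))
# → {'total': 4, 'error_count': 3, 'error_rate': 0.75,
#    'top_error_services': [('api', 2), ('worker', 1)]}
```

In a real investigation the equivalent aggregation runs server-side in the backend's query language (DataPrime `countby`, SPL `stats count by`, etc.), so raw events never need to be pulled down at all.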
Available backends (invoke with /skill-name):
- /observability-coralogix - DataPrime syntax, log/trace analysis
- /observability-datadog - DQL syntax, metrics and APM
- /observability-honeycomb - High-cardinality analysis, distributed tracing
- /observability-splunk - SPL syntax, saved searches
- /observability-elasticsearch - Lucene/Query DSL
- /observability-jaeger - Distributed tracing, latency analysis

To check whether a backend is working, run a simple query rather than checking env vars.

When reporting observability findings, use this structure:
## Log Analysis Summary
### Time Window
- Start: [timestamp]
- End: [timestamp]
- Duration: X hours
### Statistics
- Total logs: X events
- Error count: Y events (Z%)
- Services affected: N services
- Error rate trend: [increasing/stable/decreasing]
### Top Error Services
1. [service1]: N errors
2. [service2]: M errors
### Error Patterns
- Primary error type: [description]
- First occurrence: [timestamp]
- Correlation: [deployment/traffic/external event]
### Sample Errors
[Quote 2-3 representative error messages with context]
### Root Cause Hypothesis
[Based on patterns observed]
### Confidence Level
[High/Medium/Low with explanation]
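The "Error rate trend" field in the Statistics section can be derived mechanically from time-bucketed error counts. A minimal sketch; splitting the window in half and using a 20% change threshold are arbitrary illustration choices, not a prescribed method:

```python
def classify_trend(bucket_counts, threshold=0.2):
    """Classify a trend by comparing the first and second half
    of time-bucketed error counts."""
    half = len(bucket_counts) // 2
    first = sum(bucket_counts[:half]) or 1  # guard against division by zero
    second = sum(bucket_counts[half:])
    change = (second - first) / first
    if change > threshold:
        return "increasing"
    if change < -threshold:
        return "decreasing"
    return "stable"

print(classify_trend([2, 3, 2, 8, 9, 10]))   # → increasing
print(classify_trend([10, 9, 8, 2, 3, 2]))   # → decreasing
print(classify_trend([5, 5, 5, 5]))          # → stable
```

Reporting the trend from bucketed counts rather than eyeballing raw logs keeps the conclusion reproducible and pairs naturally with the "First occurrence" and "Correlation" fields above.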