Name: Observability - Logging, Monitoring & Tracing
Author: duylinhdang1998

Purpose: Implement comprehensive observability for production systems with logs, metrics, and distributed tracing

Agent: Google SRE / Netflix Backend Architect

Use When: Setting up monitoring, debugging production issues, or ensuring system reliability

Three Pillars of Observability

1. Logging (What happened?)

2. Metrics (How much/how fast?)

3. Tracing (Where did it happen?)
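
To make the three questions concrete, here is a minimal sketch in which a single operation produces one signal per pillar. The in-memory `logs`, `counters`, and `spans` sinks, the `recordCheckout` function, and the `checkout_total` metric name are all hypothetical and exist only for illustration; a real system would send each signal to a log pipeline, a metrics backend, and a tracing backend.

```javascript
// Hypothetical in-memory sinks (illustration only, not a real backend).
const logs = []
const counters = new Map()
const spans = []

// Runs `work` and records one log line, one counter increment, and one span.
function recordCheckout(traceId, userId, work) {
  const start = Date.now()
  let outcome = 'ok'
  try {
    return work()
  } catch (err) {
    outcome = 'error'
    throw err
  } finally {
    const durationMs = Date.now() - start

    // 1. Logging: what happened?
    logs.push(JSON.stringify({
      level: outcome === 'ok' ? 'info' : 'error',
      msg: 'checkout finished',
      traceId,
      userId,
      outcome
    }))

    // 2. Metrics: how much / how fast?
    const key = `checkout_total{outcome="${outcome}"}`
    counters.set(key, (counters.get(key) || 0) + 1)

    // 3. Tracing: where did it happen?
    spans.push({ traceId, name: 'checkout', start, durationMs })
  }
}

recordCheckout('trace-1', 123, () => 'paid')
```

Sharing the same `traceId` across the log line and the span is what lets you move from a log entry to the matching trace later.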

1. Structured Logging

import crypto from 'node:crypto'
import pino from 'pino'

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level: (label) => ({ level: label })
  }
})

// Structured logs (JSON)
logger.info({ userId: 123, action: 'login' }, 'User logged in')
// Pass the caught Error under the `err` key so pino's standard error serializer applies
logger.error({ err, userId: 123 }, 'Failed to process payment')

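// Request logging middleware (assumes an Express `app` created elsewhere)
app.use((req, res, next) => {
  // Capture the start time up front so the duration below is a real number
  const startTime = Date.now()

  // Child logger: every line for this request carries the same context
  req.log = logger.child({
    requestId: crypto.randomUUID(),
    method: req.method,
    url: req.url,
    ip: req.ip
  })

  req.log.info('Request started')

  res.on('finish', () => {
    req.log.info({
      statusCode: res.statusCode,
      duration: Date.now() - startTime // milliseconds
    }, 'Request completed')
  })

  next()
})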
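
The middleware above relies on a child logger attaching request context to every line. The sketch below shows that idea with a dependency-free stand-in. `createLogger` is a hypothetical helper written for this example and is not pino's implementation; it only mimics the `info` / `error` / `child` call shapes used above.

```javascript
// Minimal stand-in for a structured logger (illustration only, NOT pino).
// `bindings` are fields merged into every line; `write` receives each JSON line.
function createLogger(bindings = {}, write = (line) => console.log(line)) {
  const emit = (level) => (fields, msg) => {
    // Support both logger.info('message') and logger.info({ ... }, 'message')
    if (typeof fields === 'string') {
      msg = fields
      fields = {}
    }
    write(JSON.stringify({ level, time: Date.now(), ...bindings, ...fields, msg }))
  }

  return {
    info: emit('info'),
    error: emit('error'),
    // A child inherits the parent's bindings and adds its own
    child: (extra) => createLogger({ ...bindings, ...extra }, write)
  }
}

// Collect lines in memory instead of printing them
const lines = []
const root = createLogger({ service: 'api' }, (line) => lines.push(line))
const reqLog = root.child({ requestId: 'req-1', method: 'GET', url: '/health' })

reqLog.info('Request started')
reqLog.info({ statusCode: 200, duration: 12 }, 'Request completed')
```

Both collected lines carry `service`, `requestId`, `method`, and `url`, so all log entries for one request can be found by filtering on `requestId`.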

1. Structured Logging

2. Metrics (Prometheus + Grafana)

3. Distributed Tracing (OpenTelemetry)

4. Application Performance Monitoring (APM)

5. Health Checks

6. Alerting