Logging and Observability

The Three Pillars

Logs — discrete events with context (what happened)
Metrics — numeric measurements over time (how much/how fast)
Traces — request flow across services (where time was spent)

All three should be correlated via a shared request/trace ID.
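One way to make a shared ID available to every log line, metric, and span in a Node.js service is to bind it to the async call chain. The following is a minimal sketch using Node's built-in `AsyncLocalStorage`; the names `RequestContext`, `withTraceId`, `currentTraceId`, and `logRecord` are hypothetical helpers for illustration, not part of any library mentioned here.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Hypothetical context shape: a single ID shared by logs, metrics, and traces.
interface RequestContext {
  traceId: string;
}

const contextStore = new AsyncLocalStorage<RequestContext>();

// Run `fn` with a trace ID bound to the current async call chain.
// An incoming ID (for example from an upstream request header) is reused;
// otherwise a fresh UUID is generated.
function withTraceId<T>(fn: () => T, incomingId?: string): T {
  return contextStore.run({ traceId: incomingId ?? randomUUID() }, fn);
}

// Read the ID bound to the current call chain, or undefined outside a request.
function currentTraceId(): string | undefined {
  return contextStore.getStore()?.traceId;
}

// Build a log record that always carries the current trace ID.
function logRecord(
  msg: string,
  fields: Record<string, unknown> = {}
): Record<string, unknown> {
  return { ...fields, traceId: currentTraceId(), msg };
}

// Usage: everything called inside the callback sees the same ID.
const record = withTraceId(
  () => logRecord("Order created", { orderId: "ord_789" }),
  "abc123"
);
console.log(JSON.stringify(record));
```

The same `currentTraceId()` value could be attached as a span attribute or a metric exemplar, so that all three signals can be joined on one field.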

Structured Logging

Always log in a structured format (JSON in production). Unstructured text logs have to be parsed with brittle regular expressions, which makes them hard to query reliably at scale.
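To illustrate why structure matters for querying, here is a small sketch that filters JSON log lines by field values with no text parsing. The sample lines and the helper names `parseJsonLines` and `queryLogs` are made up for this example; a real deployment would typically run the equivalent query in a log backend.

```typescript
// Shape assumed for this sketch: every record has a level and a message,
// plus arbitrary extra fields.
interface LogRecord {
  level: string;
  msg: string;
  [field: string]: unknown;
}

// Hypothetical sample input: two JSON log lines and one unstructured line.
const sampleLines: string[] = [
  '{"level":"info","userId":"u_456","orderId":"ord_789","msg":"Order created"}',
  '{"level":"error","userId":"u_456","orderId":"ord_790","msg":"Payment processing failed"}',
  "2026-01-15 10:30:05 ERROR something broke for some user",
];

// Parse newline-delimited JSON, skipping lines that are not valid JSON.
function parseJsonLines(lines: string[]): LogRecord[] {
  const records: LogRecord[] = [];
  for (const line of lines) {
    try {
      records.push(JSON.parse(line) as LogRecord);
    } catch {
      // Unstructured lines cannot be queried by field, so they are dropped here.
    }
  }
  return records;
}

// Return records whose fields exactly match every key/value pair in `where`.
function queryLogs(
  records: LogRecord[],
  where: Record<string, unknown>
): LogRecord[] {
  return records.filter((record) =>
    Object.keys(where).every((key) => record[key] === where[key])
  );
}

const userErrors = queryLogs(parseJsonLines(sampleLines), {
  level: "error",
  userId: "u_456",
});
console.log(userErrors.map((r) => r.msg));
```

The unstructured third line also describes an error, but nothing in it reliably identifies the user, so it cannot take part in the field-based query.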

TypeScript (pino)

import pino from "pino";

const logger = pino({
  level: process.env.LOG_LEVEL || "info",
  formatters: {
    level: (label) => ({ level: label }),
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  redact: ["req.headers.authorization", "body.password", "body.ssn"],
});

// Create child loggers with bound context.
// This runs inside a request handler, where `req`, `orderId`, and `items` are in scope.
const requestLogger = logger.child({
  requestId: req.id,
  userId: req.user?.id,
  service: "order-service",
});

requestLogger.info({ orderId, itemCount: items.length }, "Order created");
// Output (pino's default "pid" and "hostname" fields omitted for brevity):
// {"level":"info","time":"2026-01-15T10:30:00.000Z","requestId":"abc123","userId":"u_456","service":"order-service","orderId":"ord_789","itemCount":3,"msg":"Order created"}
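The `redact` option above masks the listed paths before a record is written. The following dependency-free sketch shows the general idea of path-based redaction. It is not pino's implementation: `redactPaths` is a hypothetical helper that assumes plain dot-separated paths and JSON-serializable records, and it borrows `"[Redacted]"` as the placeholder because that is pino's default censor value.

```typescript
const CENSOR = "[Redacted]";

// Return a copy of `record` with the value at each dot-separated path replaced
// by the censor string. Paths that do not exist in the record are ignored.
function redactPaths(
  record: Record<string, unknown>,
  paths: string[]
): Record<string, unknown> {
  // Deep copy so the caller's object is never mutated
  // (assumes the record is JSON-serializable).
  const copy = JSON.parse(JSON.stringify(record)) as Record<string, unknown>;

  for (const path of paths) {
    const keys = path.split(".");
    const last = keys.pop() as string;

    // Walk down to the parent object of the field to redact.
    let node: unknown = copy;
    for (const key of keys) {
      if (typeof node !== "object" || node === null) {
        node = undefined;
        break;
      }
      node = (node as Record<string, unknown>)[key];
    }

    if (typeof node === "object" && node !== null && last in node) {
      (node as Record<string, unknown>)[last] = CENSOR;
    }
  }
  return copy;
}

// Usage with the same paths as the logger configuration above.
const safe = redactPaths(
  {
    req: { headers: { authorization: "Bearer secret-token", accept: "*/*" } },
    body: { password: "hunter2", email: "a@example.com" },
  },
  ["req.headers.authorization", "body.password", "body.ssn"]
);
console.log(JSON.stringify(safe));
```

Redacting at the logger level means individual call sites do not have to remember which fields are sensitive.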

Log Levels: When to Use Each

Level	Purpose	Example
FATAL	Application cannot continue; process will exit	Database connection pool exhausted, unrecoverable state
ERROR	Operation failed; requires attention	Payment processing failed, external API returned 5xx
WARN	Something unexpected but handled; may indicate a problem	Retry succeeded after failure, deprecated API called, cache miss fallback
INFO	Significant business events; normal operation milestones	Order placed, user signed up, deployment started
DEBUG	Detailed diagnostic information for troubleshooting	SQL query with params, HTTP request/response bodies, cache hit/miss
TRACE	Very fine-grained; rarely enabled in production	Function entry/exit, loop iterations
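The levels in the table form an ordered scale, and a logger emits a message only when its level is at or above the configured threshold. Below is a minimal sketch of that check. The numeric values mirror pino's default levels; `parseLevel` and `isEnabled` are hypothetical helpers written for this example.

```typescript
// Numeric severities, matching pino's default level values.
const LEVELS = {
  trace: 10,
  debug: 20,
  info: 30,
  warn: 40,
  error: 50,
  fatal: 60,
} as const;

type Level = keyof typeof LEVELS;

// Turn an environment value such as LOG_LEVEL into a known level,
// falling back when it is missing or unrecognized.
function parseLevel(value: string | undefined, fallback: Level = "info"): Level {
  const normalized = (value ?? "").toLowerCase();
  return Object.prototype.hasOwnProperty.call(LEVELS, normalized)
    ? (normalized as Level)
    : fallback;
}

// A message is emitted only if it is at least as severe as the threshold.
function isEnabled(messageLevel: Level, threshold: Level): boolean {
  return LEVELS[messageLevel] >= LEVELS[threshold];
}

// Usage: with the threshold at "info", DEBUG and TRACE are suppressed.
const threshold = parseLevel("INFO");
console.log(isEnabled("debug", threshold)); // false
console.log(isEnabled("error", threshold)); // true
```

Keeping the production threshold at INFO and lowering it to DEBUG only while troubleshooting limits both log volume and the chance of writing request bodies to disk.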
