Define and implement observability for LLM serving: time to first token (TTFT), time per output token (TPOT), queue time, KV-cache utilization, and rejection-reason telemetry. Use when instrumenting gateway or scheduler code paths, creating Prometheus metrics, or generating Grafana dashboards and load-test analysis artifacts.
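
As a rough illustration of how the metrics named above relate to one another, here is a minimal sketch that derives them from per-request timestamps. The `RequestTiming` class, its field names, and `kv_cache_utilization` are hypothetical and not part of any existing library. It also assumes TTFT is measured from request arrival (so it includes queue time) and that KV-cache capacity is counted in blocks; a given serving stack may define either differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestTiming:
    """Hypothetical per-request timestamps, in seconds on one monotonic clock."""

    arrival: float       # request accepted by the gateway
    scheduled: float     # scheduler admits the request to a batch
    first_token: float   # first output token emitted
    last_token: float    # final output token emitted
    output_tokens: int   # number of output tokens generated

    def queue_time(self) -> float:
        # Time spent waiting before the scheduler picked the request up.
        return self.scheduled - self.arrival

    def ttft(self) -> float:
        # Assumption: TTFT is measured from arrival, so it includes queue time.
        return self.first_token - self.arrival

    def tpot(self) -> Optional[float]:
        # Mean gap between consecutive tokens after the first one.
        # Undefined when fewer than two tokens were generated.
        if self.output_tokens < 2:
            return None
        return (self.last_token - self.first_token) / (self.output_tokens - 1)


def kv_cache_utilization(used_blocks: int, total_blocks: int) -> float:
    # Fraction of KV-cache blocks currently allocated (assumes block-based accounting).
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return used_blocks / total_blocks
```

In practice these values would typically be observed into histograms (TTFT, TPOT, queue time) and a gauge (KV-cache utilization), with rejections counted per reason label. The sketch stays within the standard library so that the definitions remain separate from any particular metrics client.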