This skill should be used when the user asks to "write LLMObs tests", "add tests for LLM Observability", "test an LLMObs plugin", "llmobs test", "llmobs spec", "test llm observability", "assertLlmObsSpanEvent", "useLlmObs", "getEvents", "MOCK_STRING", "MOCK_NOT_NULLISH", "MOCK_NUMBER", "MOCK_OBJECT", "VCR cassette", "record cassette", "replay cassette", "vcr proxy", "llmobs cassette", "test chat completions", "test streaming", "test embeddings", "test agent runs", "test orchestration", "test workflow", "llmobs span event", "LLMObs test strategy", "LlmObsCategory test", "LLM_CLIENT test", "MULTI_PROVIDER test", "ORCHESTRATION test", "INFRASTRUCTURE test", "span kind llm test", "span kind workflow test", "inputMessages", "outputMessages", "token metrics", "llmobs span validation", "cassette not generated", "re-record cassette", "127.0.0.1:9126", or needs to write, modify, or debug tests for any LLMObs plugin in dd-trace-js.
BEFORE writing any test, you MUST determine the package category.
The category determines EVERYTHING about the test strategy: VCR, pure functions, or mock servers.
IF YOU USE THE WRONG CATEGORY STRATEGY, THE TEST WILL FAIL.
Categories are defined in the LlmObsCategory enum.
Quick check:
- LLM_CLIENT or MULTI_PROVIDER (use VCR)
- ORCHESTRATION (NO VCR, pure functions)
- INFRASTRUCTURE (mock servers)

See references/category-strategies.md for FORBIDDEN vs REQUIRED patterns per category.
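The quick check above can be read as a lookup from category to strategy. The following is a minimal sketch, not code from dd-trace-js: the `LlmObsCategory` object here is a local stand-in whose member names come from this document but whose string values are assumed, and `testStrategyFor` is a hypothetical helper name.

```javascript
// Local stand-in for the LlmObsCategory enum. Member names are taken from
// this document; the string values are an assumption for illustration.
const LlmObsCategory = Object.freeze({
  LLM_CLIENT: 'LLM_CLIENT',
  MULTI_PROVIDER: 'MULTI_PROVIDER',
  ORCHESTRATION: 'ORCHESTRATION',
  INFRASTRUCTURE: 'INFRASTRUCTURE'
})

// Category -> strategy table, mirroring the quick check.
const STRATEGY_BY_CATEGORY = new Map([
  [LlmObsCategory.LLM_CLIENT, { strategy: 'vcr', usesVcr: true }],
  [LlmObsCategory.MULTI_PROVIDER, { strategy: 'vcr', usesVcr: true }],
  [LlmObsCategory.ORCHESTRATION, { strategy: 'pure-functions', usesVcr: false }],
  [LlmObsCategory.INFRASTRUCTURE, { strategy: 'mock-server', usesVcr: false }]
])

// Hypothetical helper: fail loudly on an unknown category instead of
// silently falling back to the wrong strategy.
function testStrategyFor (category) {
  const entry = STRATEGY_BY_CATEGORY.get(category)
  if (!entry) {
    throw new Error(`Unknown LlmObsCategory: ${String(category)}`)
  }
  return entry
}
```

Throwing on an unknown category is a deliberate choice in this sketch: picking a default would hide exactly the mistake the quick check is meant to prevent.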
This skill helps you write comprehensive LLMObs tests that validate span events, messages, tokens, and metadata using category-appropriate strategies.
LLMObs tests use special helpers to validate span events.
Key components:
- useLlmObs() - Initializes LLMObs test environment
- getEvents() - Retrieves captured span events
- assertLlmObsSpanEvent() - Validates span structure with flexible matchers

Basic test flow:
1. useLlmObs({ plugin: 'name' }) to initialize the test environment
2. getEvents() to retrieve the captured span events
3. assertLlmObsSpanEvent() to validate each span

See references/test-structure.md for complete test file templates.
Purpose: VCR records real API calls and replays them in tests, so tests stay deterministic and need no external dependencies.
How it works:
- API calls go through the VCR proxy at http://127.0.0.1:9126/vcr/{provider}

Cassette location: test/llmobs/plugins/{integration}/cassettes/
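To make the two templates above concrete, here is a small sketch that fills in the `{provider}` and `{integration}` placeholders. The helper names `vcrBaseUrl` and `cassetteDir` are hypothetical and exist only for this example. How a given client library is pointed at the resulting URL depends on that library and is not shown.

```javascript
// Proxy origin as given in this document.
const VCR_PROXY_ORIGIN = 'http://127.0.0.1:9126'

// Hypothetical helper: base URL the client under test would use so that
// its requests pass through the VCR proxy.
function vcrBaseUrl (provider) {
  if (!provider) throw new Error('provider is required')
  return `${VCR_PROXY_ORIGIN}/vcr/${provider}`
}

// Hypothetical helper: directory holding the recorded cassettes for one
// integration, relative to the package root.
function cassetteDir (integration) {
  if (!integration) throw new Error('integration is required')
  return `test/llmobs/plugins/${integration}/cassettes/`
}

// 'example-provider' and 'example-integration' are placeholder names.
const exampleUrl = vcrBaseUrl('example-provider')
const exampleDir = cassetteDir('example-integration')
```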
When to use VCR:
- Use VCR: LlmObsCategory.LLM_CLIENT (Direct API wrappers)
- Use VCR: LlmObsCategory.MULTI_PROVIDER (Multi-provider frameworks)
- Do NOT use VCR: LlmObsCategory.ORCHESTRATION (Pure functions, no API calls)
- Do NOT use VCR: LlmObsCategory.INFRASTRUCTURE (Mock servers instead)

See references/vcr-cassettes.md for recording process and troubleshooting.
Test strategy is determined by the LlmObsCategory enum.
LLM_CLIENT and MULTI_PROVIDER strategy: VCR with real API calls via proxy
Characteristics:
Span kind: Usually 'llm' for chat completions
See references/category-strategies.md for detailed patterns.
ORCHESTRATION strategy: Pure function tests, NO VCR, NO real API calls
Characteristics:
Span kind: Usually 'workflow' or 'agent', NOT 'llm'
Example concept:
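As a rough illustration of the idea, the sketch below runs a deterministic chain of pure functions and records span-like events in memory. Nothing here is dd-trace-js code: `createRecorder` and `runChain` are hypothetical stand-ins, and a real test would use the helpers described elsewhere in this document. The point is only that no HTTP client and no cassette are involved, and that the recorded kinds are 'workflow' and 'tool' rather than 'llm'.

```javascript
// Hypothetical in-memory recorder standing in for LLMObs span capture.
function createRecorder () {
  const events = []
  return {
    events,
    // Runs fn synchronously and records one span-like event for it.
    record (spanKind, name, fn) {
      const output = fn()
      events.push({ spanKind, name, output })
      return output
    }
  }
}

// A pure "chain": deterministic steps, no network access anywhere.
function runChain (recorder, input) {
  return recorder.record('workflow', 'demo.chain', () => {
    const upper = recorder.record('tool', 'demo.uppercase', () => input.toUpperCase())
    return `${upper}!`
  })
}

const recorder = createRecorder()
const chainResult = runChain(recorder, 'hello')
const recordedKinds = recorder.events.map(event => event.spanKind)
```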
See references/category-strategies.md for orchestration test patterns.
INFRASTRUCTURE strategy: Mock server tests
Characteristics:
See references/category-strategies.md for infrastructure test patterns.
assertLlmObsSpanEvent(actual, expected)
Validates span structure with flexible matchers for non-deterministic values.
Available matchers:
- MOCK_STRING - Matches any non-empty string (use for output text)
- MOCK_NOT_NULLISH - Matches any value that is neither null nor undefined (use for token counts)
- MOCK_NUMBER - Matches any number
- MOCK_OBJECT - Matches any object (use for errors)

Assertable fields:
- spanKind (required) - Span type from LlmObsSpanKind enum
- name - Operation name
- modelName - Model identifier (for LLM spans)
- modelProvider - Provider name (for LLM spans)
- inputMessages - Input messages in [{content, role}] format
- outputMessages - Output messages in [{content, role}] format
- metrics - Token usage (input_tokens, output_tokens, total_tokens)
- metadata - Model parameters (temperature, max_tokens, etc.)
- error - Error object (if operation failed)

Partial validation: Only specified fields are checked, others ignored.
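To show how sentinel matchers and partial validation fit together, here is a simplified, self-contained sketch. It is not the implementation of assertLlmObsSpanEvent: the four `MOCK_*` constants are local stand-ins (plain symbols), `matches` is a hypothetical function, and MOCK_NOT_NULLISH is interpreted by its name as "neither null nor undefined". The real helper may differ in all of these details.

```javascript
// Local stand-ins for the matchers. These are NOT the real exports.
const MOCK_STRING = Symbol('MOCK_STRING')
const MOCK_NOT_NULLISH = Symbol('MOCK_NOT_NULLISH')
const MOCK_NUMBER = Symbol('MOCK_NUMBER')
const MOCK_OBJECT = Symbol('MOCK_OBJECT')

// Hypothetical matcher: returns true when `actual` satisfies `expected`.
function matches (actual, expected) {
  if (expected === MOCK_STRING) return typeof actual === 'string' && actual.length > 0
  if (expected === MOCK_NOT_NULLISH) return actual !== null && actual !== undefined
  if (expected === MOCK_NUMBER) return typeof actual === 'number'
  if (expected === MOCK_OBJECT) return typeof actual === 'object' && actual !== null
  if (Array.isArray(expected)) {
    return Array.isArray(actual) &&
      actual.length === expected.length &&
      expected.every((item, index) => matches(actual[index], item))
  }
  if (typeof expected === 'object' && expected !== null) {
    if (typeof actual !== 'object' || actual === null) return false
    // Partial validation: only keys present in `expected` are compared.
    return Object.keys(expected).every(key => matches(actual[key], expected[key]))
  }
  return Object.is(actual, expected)
}

// Made-up span event; the field names follow the list above.
const actualSpan = {
  spanKind: 'llm',
  name: 'demo.chat',
  modelName: 'demo-model',
  outputMessages: [{ content: 'Hi there', role: 'assistant' }],
  metrics: { input_tokens: 5, output_tokens: 3, total_tokens: 8 }
}

// name and modelName are left out on purpose, so they are not checked.
const expectedSpan = {
  spanKind: 'llm',
  outputMessages: [{ content: MOCK_STRING, role: 'assistant' }],
  metrics: {
    input_tokens: MOCK_NOT_NULLISH,
    output_tokens: MOCK_NOT_NULLISH,
    total_tokens: MOCK_NOT_NULLISH
  }
}

const spanMatches = matches(actualSpan, expectedSpan)
```

Using a matcher for non-deterministic values (generated text, token counts) and a literal for stable ones (span kind, role) keeps the assertion strict where it can be and tolerant where it has to be.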
See references/assertion-helpers.md for complete API and patterns.
Location: test/llmobs/plugins/{integration}/index.spec.js
Structure:
- Import helpers from '../../util'
- Use beforeEach() for fresh state
- Group tests by operation (e.g. describe('chat completions', ...))

Standard imports:
- useLlmObs, assertLlmObsSpanEvent, MOCK_STRING, MOCK_NOT_NULLISH, MOCK_NUMBER, MOCK_OBJECT
See references/test-structure.md for complete template.
Test all instrumented methods with:
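- Input and output messages ({content, role} structure)

Match span kind to operation type using LlmObsSpanKind enum: 'llm', 'workflow', 'agent', 'tool', 'embedding', 'retrieval'

On errors, validate:
- Empty output messages: [{content: '', role: ''}]
- error: MOCK_OBJECT

For detailed information, see:
- references/category-strategies.md
- references/test-structure.md
- references/vcr-cassettes.md
- references/assertion-helpers.md

In short: use LlmObsCategory to pick the test approach, and validate messages in the {content, role} structure.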
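For the error case mentioned above, the expected object can be assembled in one place. This is a sketch under assumptions: `expectedErrorSpan` is a hypothetical helper, the `[{content: '', role: ''}]` value is assumed to belong to outputMessages, and the `MOCK_OBJECT_STAND_IN` symbol only stands in for the real MOCK_OBJECT matcher so the example runs on its own.

```javascript
// Stand-in for the real MOCK_OBJECT matcher (assumption for this sketch).
const MOCK_OBJECT_STAND_IN = Symbol('MOCK_OBJECT')

// Hypothetical helper: builds the `expected` argument for a failed call.
// Output is a single empty message and the error is left to a matcher.
function expectedErrorSpan ({ spanKind, name, inputMessages, errorMatcher }) {
  return {
    spanKind,
    name,
    inputMessages,
    outputMessages: [{ content: '', role: '' }],
    error: errorMatcher
  }
}

// 'demo.chat' and the input message are made-up example values.
const expectedOnError = expectedErrorSpan({
  spanKind: 'llm',
  name: 'demo.chat',
  inputMessages: [{ content: 'Hello', role: 'user' }],
  errorMatcher: MOCK_OBJECT_STAND_IN
})
```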