Add OpenTelemetry tracing spans to Clojure code following Metabase tracing conventions. Use when instrumenting backend code with trace coverage.
This skill helps you add OpenTelemetry (OTel) tracing spans to the Metabase backend codebase using the custom tracing/with-span macro.
- src/metabase/tracing/core.clj - with-span macro, group registry, SDK lifecycle, best-effort-sanitize-sql, Pyroscope integration
- src/metabase/task/impl.clj - defjob macro that wraps Quartz jobs with root spans
- .clj-kondo/config/modules/config.edn - Module boundary configuration

The tracing module has a deliberately minimal API surface. Only 2 namespaces are public (listed in :api in the module config):
| Namespace | Role | Status |
|---|---|---|
| tracing.core | Primary API: with-span, groups, SDK lifecycle, Pyroscope, MDC, best-effort-sanitize-sql | Public API |
| tracing.init | Side-effect loader for quartz and settings | Public API (init convention) |
| tracing.attributes | best-effort-sanitize-sql implementation (re-exported via tracing.core) | Internal |
| tracing.settings | Setting definitions (MB_TRACING_* env vars) | Internal |
| tracing.quartz | Quartz JDBC proxy + JobListener | Internal |
Rules:
- Only require [metabase.tracing.core :as tracing] from outside the module. tracing/best-effort-sanitize-sql and all other public functions are available from this single namespace.
- Do not create new tracing namespaces. Add to tracing.core instead.
- Never require internal namespaces (tracing.attributes, tracing.settings, tracing.quartz) from outside the module.
- :uses :any on the core module does NOT bypass the target module's :api check; internal namespaces are still enforced.

tracing/core.clj is required by many modules across the codebase. It must NOT compile-time require tracing.settings, as this creates transitive cyclic load dependencies (e.g., settings/core -> tracing/settings -> tracing/core -> events/impl -> events/core).
Instead, tracing/core.clj uses requiring-resolve for settings access:
;; CORRECT — lazy runtime resolution, no compile-time dependency
((requiring-resolve 'metabase.tracing.settings/tracing-enabled))
;; WRONG — creates cyclic load dependency
(require '[metabase.tracing.settings :as settings])
(settings/tracing-enabled)
External library namespaces (clj-otel API, SDK, exporters) are safe to require normally — they don't participate in Metabase namespace cycles.
Important: requiring-resolve must use literal quoted symbols. Kondo hooks validate that required-namespaces are all simple symbols, so dynamic construction fails:
;; CORRECT — literal quoted symbol
(requiring-resolve 'metabase.tracing.settings/tracing-endpoint)
;; WRONG — kondo hook rejects this: "Assert failed: (every? simple-symbol? required-namespaces)"
(requiring-resolve (symbol "metabase.tracing.settings" "tracing-endpoint"))
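The lazy-resolution pattern can be tried in isolation. The sketch below is only an illustration: clojure.set/union stands in for a var in metabase.tracing.settings (an assumption, chosen because it ships with Clojure), and lazy-union is a hypothetical helper name. requiring-resolve needs Clojure 1.10 or later.

```clojure
;; Stand-in example: clojure.set plays the role of metabase.tracing.settings.
;; The target namespace is loaded on the first call, not when this file is
;; compiled, so no load-time dependency edge is created.
(defn lazy-union
  "Resolves clojure.set/union at call time and applies it to two sets."
  [a b]
  ((requiring-resolve 'clojure.set/union) a b))

(lazy-union #{1 2} #{2 3})
```

The symbol passed to requiring-resolve is a literal quoted symbol, which is the form the kondo hook accepts.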
When adding tracing spans:
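- Ensure the module has tracing in its :uses set in .clj-kondo/config/modules/config.edn
- Add [metabase.tracing.core :as tracing] to ns requires (alphabetically sorted)
- Choose a group (check src/metabase/tracing/core.clj for registered groups; add a new one if none fit)
- Name spans with dot-separated hierarchical names ("domain.subsystem.operation")
- Use namespaced keyword attributes (:search/query-length, :db/id)
- Sanitize any SQL in attributes (best-effort-sanitize-sql for HoneySQL, never raw SQL)
- Do not create new tracing namespaces (add to tracing.core instead)
- Check for DO_NOT_ADD_NEW_FILES_HERE.txt violations in the target directory
- Run clj-kondo --lint <files> to verify 0 errors, 0 warnings
- Add or update tests in the corresponding test/ path (see Testing section below)
- Run tests with clojure -X:dev:test :only <test-ns>

with-span Macro

(tracing/with-span group span-name attrs & body)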
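- group: keyword (e.g., :tasks, :sync)
- span-name: string (e.g., "search.execute")
- attrs: map of span attributes (e.g., {:db/id 42})

When disabled: near-zero overhead -- a single atom deref plus a boolean check, then the body runs directly.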
When enabled: creates OTel span AND injects trace_id/span_id into Log4j2 MDC for log-to-trace correlation.
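The enabled/disabled gating can be pictured with a small stand-alone macro. This is a simplified sketch, not the implementation in src/metabase/tracing/core.clj: every name ending in * is hypothetical, and a vector in an atom stands in for the OTel exporter and the MDC injection.

```clojure
;; Hypothetical stand-ins; the real macro lives in metabase.tracing.core.
(def enabled-groups* (atom #{}))   ; empty set: tracing is off for every group
(def recorded-spans* (atom []))    ; stands in for the OTel exporter

(defn record-span!*
  "Remembers that a span with this name and these attributes was opened."
  [span-name attrs]
  (swap! recorded-spans* conj {:name span-name :attrs attrs}))

(defmacro with-span*
  "Runs body directly when group is disabled; records a span first otherwise.
  The attrs expression is only evaluated when the group is enabled."
  [group span-name attrs & body]
  `(let [body-fn# (fn [] ~@body)]
     (if (contains? @enabled-groups* ~group)
       (do (record-span!* ~span-name ~attrs)
           (body-fn#))
       (body-fn#))))

;; Enable one group and trace a trivial body.
(reset! enabled-groups* #{:search})
(with-span* :search "search.execute" {:search/engine "demo"}
  (+ 1 2))
```

In this sketch the disabled path costs one deref and one set lookup, which mirrors the check described above; the body's return value is the same on both paths.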
Groups are registered in src/metabase/tracing/core.clj. Check that file for the current list. The general rule: match the group to the domain, not the call site. If code runs inside a Quartz job but is logically search work, use :search, not :tasks.
To add a new group:
;; In src/metabase/tracing/core.clj
(register-group! :my-domain "Description of what this covers")
Users enable groups via MB_TRACING_GROUPS=tasks,search,sync (comma-separated, or "all").
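To make the value format concrete, here is a minimal sketch of how such a string could be turned into a set of group keywords. parse-groups is a hypothetical helper, not the parser Metabase uses; trimming whitespace and ignoring case are assumptions of this sketch.

```clojure
(require '[clojure.string :as str])

(defn parse-groups
  "Turns \"tasks,search\" into #{:tasks :search}. Returns :all when any
  token is \"all\". Blank tokens are dropped."
  [s]
  (let [tokens (->> (str/split (or s "") #",")
                    (map str/trim)
                    (remove str/blank?)
                    (map str/lower-case))]
    (if (some #{"all"} tokens)
      :all
      (into #{} (map keyword) tokens))))

(parse-groups "tasks, search,sync")
```

A caller would then check a span's group against the returned set, or skip the check entirely when the result is :all.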
Use dot-separated hierarchical names: "domain.subsystem.operation". The domain prefix should match the group name:
search.execute -- `:search` group
sync.fingerprint.table -- `:sync` group
task.session-cleanup.delete -- `:tasks` group
db-app.collection-items -- `:db-app` group
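A naming check like the one below could catch malformed span names in review or in a test. It is a sketch under assumptions: valid-span-name? is a hypothetical helper, and the accepted character set (letters, digits, hyphens) is inferred from the examples in this document rather than from any enforced rule.

```clojure
(defn valid-span-name?
  "True when s is two or more dot-separated segments made of letters,
  digits, or hyphens, e.g. \"sync.fingerprint.table\"."
  [s]
  (boolean (and (string? s)
                (re-matches #"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+" s))))

(valid-span-name? "sync.fingerprint.table")
```

The check is purely about shape; whether the domain prefix fits the chosen group is still a judgment call.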
Use namespaced keywords. The namespace groups related attributes:
:db/id -- Database ID (integer)
:db/engine -- Database engine name (string)
:db/statement -- Sanitized SQL (string, via best-effort-sanitize-sql)
:search/engine -- Search engine name (string)
:search/query-length -- Query string length (integer)
:sync/table -- Table name (string)
:sync/step -- Sync step name (string)
:task/name -- Task name (string)
:http/method -- HTTP method (string)
:http/url -- Request URL (string)
Invent new namespaced attributes as needed (e.g., :pulse/id, :transform/count). Keep values as primitives (strings, numbers, booleans) -- no maps or collections.
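The two attribute conventions (namespaced keys, primitive values) are easy to check mechanically. The helper below is a hypothetical sketch, not part of the tracing module; the reason keywords it returns are made up for illustration.

```clojure
(defn attr-problems
  "Returns [key reason] pairs for entries that break the conventions.
  An empty result means the attribute map is fine."
  [attrs]
  (for [[k v] attrs
        :let [reason (cond
                       (not (qualified-keyword? k))
                       :key-not-namespaced

                       (not (or (string? v) (number? v) (boolean? v)))
                       :value-not-primitive)]
        :when reason]
    [k reason]))

(attr-problems {:db/id 42, :sync/tables ["orders" "people"]})
```

A vector value such as the one under :sync/tables is flagged; passing a count or a joined string instead keeps the attribute primitive.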
Look up the module for your namespace in .clj-kondo/config/modules/config.edn. If tracing is not in the module's :uses set, add it (keep alphabetically sorted):
my-module
{:team "MyTeam"
 :uses #{analytics config tracing util}}

Then add the require to the namespace (keep requires alphabetically sorted):

(ns metabase.my-module.thing
  (:require
   [metabase.tracing.core :as tracing]
   [metabase.util :as u]))
best-effort-sanitize-sql is available from tracing.core — no additional require needed.
Only wrap code at meaningful I/O boundaries:
DO trace:
DO NOT trace:
- Trivial single-row lookups (e.g., t2/select-one :model/Setting :key k)

Wrap the code with with-span:

;; Simple span (no attributes needed)
(tracing/with-span :search "search.init-index" {}
  (do-expensive-thing))

;; Span with static attributes
(tracing/with-span :sync "sync.fingerprint.table"
  {:db/id      (:db_id table)
   :sync/table (:name table)}
  (fingerprint-fields! table fields))

;; Span with computed attributes
(tracing/with-span :search "search.execute"
  {:search/engine       (name (:search-engine ctx))
   :search/query-length (count (:search-string ctx))}
  (search.engine/results ctx))

;; Span with sanitized SQL (for dynamic HoneySQL queries)
(let [hsql {:delete-from [(t2/table-name :model/Session)]
            :where       [:< :created_at oldest-allowed]}]
  (tracing/with-span :tasks "task.session-cleanup.delete"
    {:db/statement (tracing/best-effort-sanitize-sql hsql)}
    (t2/query-one hsql)))

;; Sub-spans breaking a function into I/O phases
(let [embedding (tracing/with-span :search "search.semantic.embedding"
                  {:search.semantic/provider (:provider model)}
                  (get-embedding model search-string))
      results   (tracing/with-span :search "search.semantic.db-query" {}
                  (into [] xform reducible))]
  (process results))

;; Per-item iteration spans
(doseq [e (search.engine/active-engines)]
  (tracing/with-span :search "search.ingestion.update" {:search/engine (name e)}
    (search.engine/update! e batch)))
Create or update tests in the corresponding test/ path. Follow the patterns in existing tracing tests:
- Reference tests: test/metabase/tracing/quartz_test.clj, test/metabase/server/middleware/trace_test.clj
- Use tracing/init-enabled-groups! / tracing/shutdown-groups! with try/finally to manage group lifecycle
- Use reify mocks for Java interfaces (Connection, PreparedStatement, JobListener, etc.)
- Use (set! *warn-on-reflection* true) and type-hint proxy/reify calls to avoid reflection warnings

(deftest my-span-enabled-test
  (testing "when group is enabled, span is created"
    (try
      (tracing/init-enabled-groups! "my-group" "INFO")
      ;; ... test that span behavior occurs ...
      (finally
        (tracing/shutdown-groups!)))))

(deftest my-span-disabled-test
  (testing "when group is disabled, code runs without tracing"
    (tracing/shutdown-groups!)
    ;; ... test that code still works, no wrapping applied ...
    ))
# Lint modified source and test files — expect 0 errors, 0 warnings
clj-kondo --lint path/to/modified/file.clj path/to/test/file.clj
# Run tests (requires Java 21+)
clojure -X:dev:test :only my-ns.test-ns
Expect: all tests pass, 0 failures, 0 errors, no reflection warnings from your files.
When including SQL in span attributes, always use tracing/best-effort-sanitize-sql. This converts HoneySQL maps to parameterized SQL strings in which values become ? placeholders, so literal values stay out of the trace.
(let [hsql {:delete-from [:core_session]
            :where       [:< :created_at some-timestamp]}]
  (tracing/with-span :tasks "task.cleanup.delete"
    {:db/statement (tracing/best-effort-sanitize-sql hsql)}
    (t2/query-one hsql)))
;; Trace attribute: db/statement = "DELETE FROM core_session WHERE created_at < ?"
Rules:
- Use best-effort-sanitize-sql only for app DB (HoneySQL) queries

The defjob macro in metabase.task.impl automatically wraps every Quartz job with a :tasks root span:
(task/defjob ^{DisallowConcurrentExecution true} SessionCleanup [_]
  (cleanup-sessions!))
;; Automatically creates span: "task.SessionCleanup" {:task/name "SessionCleanup"}
You do NOT need a root span inside defjob bodies. Add child spans for I/O inside the job.
For code on plain Threads (not Quartz), add the root span manually:
(defn init! []
  (tracing/with-span :search "search.task.init" {}
    (search/init-index!)))
;; WRONG - pure computation, no I/O
(tracing/with-span :search "search.format-results" {}
  (map format-result results))

;; WRONG - trivial single-row lookup
(tracing/with-span :db-app "db-app.get-setting" {}
  (t2/select-one :model/Setting :key "my-setting"))

;; WRONG - raw SQL in attributes (data leak)
(tracing/with-span :tasks "task.cleanup" {:db/statement raw-sql-string}
  (execute! raw-sql-string))

;; WRONG - wrong group (search work should use :search, not :tasks)
(tracing/with-span :tasks "search.execute" {} ...)

;; WRONG - redundant nesting (do-search already has a span)
(tracing/with-span :search "search.process" {}
  (let [results (do-search ctx)]
    (tracing/with-span :search "search.format" {}
      (format-results results))))

;; WRONG - creating a new tracing namespace
(ns metabase.tracing.my-feature ...)

;; WRONG - requiring internal tracing namespaces from outside the module
(ns metabase.my-module.thing
  (:require [metabase.tracing.attributes :as trace-attrs]     ;; internal!
            [metabase.tracing.settings :as tracing.settings]  ;; internal!
            [metabase.tracing.quartz :as tracing.quartz]))    ;; internal!

;; WRONG - adding compile-time requires to tracing/core.clj for settings or SDK
;; This creates cyclic load dependencies because tracing/core is widely required
(ns metabase.tracing.core
  (:require [metabase.tracing.settings :as settings])) ;; causes cycle!

;; WRONG - dynamic symbol construction with requiring-resolve (kondo rejects it)
(requiring-resolve (symbol "metabase.tracing.settings" "tracing-enabled"))
All settings are env-var-only (defined in src/metabase/tracing/settings.clj):
# Core
MB_TRACING_ENABLED=true # Enable tracing (default: false)
MB_TRACING_ENDPOINT=host:4317 # OTLP collector endpoint (default: http://localhost:4317)
MB_TRACING_GROUPS=tasks,search,sync # Comma-separated groups or "all" (default: all)
MB_TRACING_SERVICE_NAME=metabase # Service name in traces (default: hostname)
MB_TRACING_LOG_LEVEL=DEBUG # Log threshold for traced threads: TRACE/DEBUG/INFO (default: INFO)
# Batch span processor tuning
MB_TRACING_MAX_QUEUE_SIZE=2048 # Max spans queued for export; drops when full (default: 2048)
MB_TRACING_EXPORT_TIMEOUT_MS=10000 # Max wait for batch export to complete (default: 10000)
MB_TRACING_SCHEDULE_DELAY_MS=5000 # Delay between consecutive batch exports (default: 5000)
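
The defaults listed above can be summarized as data. The sketch below is illustrative only and is not how src/metabase/tracing/settings.clj reads its settings: documented-defaults and effective-settings are hypothetical names, the env argument is a plain map of strings, and MB_TRACING_SERVICE_NAME is left out because its default depends on the hostname.

```clojure
(def documented-defaults
  "Defaults as listed in this document, kept as strings like real env vars."
  {"MB_TRACING_ENABLED"           "false"
   "MB_TRACING_ENDPOINT"          "http://localhost:4317"
   "MB_TRACING_GROUPS"            "all"
   "MB_TRACING_LOG_LEVEL"         "INFO"
   "MB_TRACING_MAX_QUEUE_SIZE"    "2048"
   "MB_TRACING_EXPORT_TIMEOUT_MS" "10000"
   "MB_TRACING_SCHEDULE_DELAY_MS" "5000"})

(defn effective-settings
  "Merges an env map, e.g. (into {} (System/getenv)), over the defaults
  and parses the numeric and boolean values."
  [env]
  (let [raw (merge documented-defaults
                   (select-keys env (keys documented-defaults)))]
    {:enabled?          (= "true" (raw "MB_TRACING_ENABLED"))
     :endpoint          (raw "MB_TRACING_ENDPOINT")
     :groups            (raw "MB_TRACING_GROUPS")
     :log-level         (raw "MB_TRACING_LOG_LEVEL")
     :max-queue-size    (Long/parseLong (raw "MB_TRACING_MAX_QUEUE_SIZE"))
     :export-timeout-ms (Long/parseLong (raw "MB_TRACING_EXPORT_TIMEOUT_MS"))
     :schedule-delay-ms (Long/parseLong (raw "MB_TRACING_SCHEDULE_DELAY_MS"))}))

(effective-settings {"MB_TRACING_ENABLED" "true"
                     "MB_TRACING_GROUPS"  "tasks,search"})
```

Taking the environment as an argument keeps the function pure, which makes the default handling easy to exercise in a test without touching real environment variables.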