Diagnose and remediate metadata extraction (SDC / Smart Data Capture / Key Pointers) async tasks that are stuck perpetually in IN_PROGRESS state, showing 0% or never advancing. Use when: a customer reports metadata extraction has been running for hours or days without completing; an AsyncTask of type EXTRACT_SUGGESTIONS is stuck with started=NULL; a contract shows extraction 'in progress' and the customer cannot re-trigger it; bulk metadata extraction was triggered and some contracts are stuck. Trigger phrases: 'extraction running forever', 'metadata extraction stuck', 'SDC stuck', 'key pointers not extracting', 'extraction at 0%', 'cannot re-run extraction', 'bulk extraction stuck', 'extraction task perpetually running'. Recurring pattern: hit Prompt Therapy (Feb 2026) and Cedar (March 2026) with identical root cause — bulk extraction on unsupported contract kind/status. Jira: SPD-42477.
Bulk metadata extraction (Key Pointers / SDC) creates AsyncTask records of type EXTRACT_SUGGESTIONS before checking if the contract's contract_kind + display_status combination is supported. When an unsupported contract is included (e.g., DRAFT, VOIDED), the pipeline raises KPExtractionNotAllowedException — but this is uncaught in the outer use case. The started field is never set. The timeout cronjob requires started IS NOT NULL to mark the task TIMED_OUT, so the task stays IN_PROGRESS forever.
The customer cannot re-trigger extraction because there is a one-extraction-per-contract limit that checks for existing IN_PROGRESS tasks.
The stuck cycle:
Bulk extraction → AsyncTask created (started=NULL) → pipeline raises KPExtractionNotAllowedException
→ exception uncaught → started never set → timeout cronjob never fires → stuck forever
→ customer tries re-run → blocked by IN_PROGRESS check
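
The cycle above can be reproduced with a small model. This is a minimal Python sketch, not the production code: the `AsyncTask` dataclass, `timeout_sweep`, `can_trigger_extraction`, and the one-hour `TIMEOUT` are hypothetical stand-ins chosen only to mirror the behavior described here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical timeout window; the real cronjob threshold is not stated in this runbook.
TIMEOUT = timedelta(hours=1)


@dataclass
class AsyncTask:
    contract_id: int
    task_status: str = "IN_PROGRESS"
    started: Optional[datetime] = None  # stays None when the pipeline raises early


def timeout_sweep(tasks, now):
    """Mimic the cronjob: only tasks with a non-NULL `started` can time out."""
    for task in tasks:
        if (
            task.task_status == "IN_PROGRESS"
            and task.started is not None
            and now - task.started > TIMEOUT
        ):
            task.task_status = "TIMED_OUT"


def can_trigger_extraction(tasks, contract_id):
    """Mimic the one-extraction-per-contract check on IN_PROGRESS tasks."""
    return not any(
        t.contract_id == contract_id and t.task_status == "IN_PROGRESS"
        for t in tasks
    )


now = datetime(2026, 3, 10, tzinfo=timezone.utc)
stuck = AsyncTask(contract_id=1)  # started was never set
healthy = AsyncTask(contract_id=2, started=now - timedelta(hours=5))
tasks = [stuck, healthy]

timeout_sweep(tasks, now)
print(stuck.task_status)                 # IN_PROGRESS: skipped because started is None
print(healthy.task_status)               # TIMED_OUT
print(can_trigger_extraction(tasks, 1))  # False: the re-run is blocked
```

In this model the sweep can never reach the first task, which is why the remediation has to change the task record itself rather than wait for the timeout.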
Unsupported contract combinations (from CONTRACT_KIND_DISPLAY_STATUS_MAP_SUPPORTED):
- CLICKWRAP: no statuses supported
- HISTORICAL_CLICKWRAP: no statuses supported
- TEMPLATE with DRAFT, REDLINING, SIGN: only EXECUTED is supported

This is a recurring pattern. Two confirmed incidents in Q1 2026: Prompt Therapy (Feb 2026) and Cedar (March 2026).
⚠️ Run BQ queries one at a time: parallel execute_sql calls return 500 errors.
Find all stuck EXTRACT_SUGGESTIONS tasks for a workspace (QA India example):
-- AT is a reserved keyword in BigQuery, so the task table is aliased as t.
SELECT t.id, t.object_id AS contract_id, t.task_status, t.started, t.created,
c.contract_kind, c.status
FROM `spotdraft-qa.qa_india_public.core_asynctask` t
LEFT JOIN `spotdraft-qa.qa_india_public.contracts_v3_contractv3` c ON c.id = t.object_id
WHERE t.task_type = 'EXTRACT_SUGGESTIONS'
AND t.task_status = 'IN_PROGRESS'
AND t.started IS NULL
AND t.workspace_id = {workspace_id}
ORDER BY t.created DESC
LIMIT 50
Scan for stuck tasks across all workspaces (QA):
-- AT is a reserved keyword in BigQuery, so the task table is aliased as t.
SELECT t.id, t.object_id AS contract_id, t.workspace_id, t.task_status,
t.started, t.created, c.contract_kind, c.status
FROM `spotdraft-qa.qa_india_public.core_asynctask` t
LEFT JOIN `spotdraft-qa.qa_india_public.contracts_v3_contractv3` c ON c.id = t.object_id
WHERE t.task_type = 'EXTRACT_SUGGESTIONS'
AND t.task_status = 'IN_PROGRESS'
AND t.started IS NULL
AND t.created < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY t.created ASC
LIMIT 100
Check if any metadata values were actually extracted onto the stuck contracts:
SELECT kp.id, kp.contract_id, kp.key, kp.value, kp.created
FROM `spotdraft-qa.qa_india_public.key_pointers_keypointsuggestion` kp
WHERE kp.contract_id IN ({contract_id_1}, {contract_id_2})
ORDER BY kp.created DESC
LIMIT 50
If this returns 0 rows, no data was extracted — no cleanup needed. (This was confirmed for Cedar: VOIDED at 0%, DRAFT never started.)
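
When several contracts are checked in one query, the zero-rows rule has to be applied per contract. The sketch below shows one way to do that in Python. The function name and the list-of-dicts row shape are assumptions of this example, and the contract IDs are placeholders.

```python
from collections import Counter


def split_by_extracted_data(rows, stuck_contract_ids):
    """Split stuck contracts by whether any suggestion rows exist for them.

    `rows` is assumed to be a list of dicts with at least a `contract_id`
    key, one dict per row returned by the suggestion query.
    """
    counts = Counter(row["contract_id"] for row in rows)
    with_data = sorted(cid for cid in stuck_contract_ids if counts[cid] > 0)
    without_data = sorted(cid for cid in stuck_contract_ids if counts[cid] == 0)
    return with_data, without_data


# Placeholder data: contract 101 has one suggestion row, contract 102 has none.
rows = [{"contract_id": 101, "key": "example_key", "value": "example_value"}]
with_data, without_data = split_by_extracted_data(rows, [101, 102])
print(with_data)     # [101]: review before closing out
print(without_data)  # [102]: nothing was extracted, no cleanup needed
```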
| Environment | BQ Project | Dataset | Table Prefix |
|---|---|---|---|
| QA India | spotdraft-qa | qa_india_public | (none) |
| QA EU | spotdraft-qa | qa_eu_public | (none) |
| QA USA | spotdraft-qa | qa_usa_public | (none) |
| Prod India | spotdraft-prod | prod_india_db | public_ |
| Prod EU | spotdraft-prod | prod_eu_db | public_ |
| Prod USA | spotdraft-prod | prod_usa_db | public_ |
| Prod MEA | spotdraft-prod | prod_mea_db | (none) |
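
To adapt the QA India queries to another environment, the table reference has to be rebuilt from this table. The following Python sketch does that lookup. It assumes the table prefix is prepended directly to the table name (for example `public_core_asynctask`), which this runbook does not spell out, so verify the resulting name in the BQ console before relying on it. `qualified_table` is a hypothetical helper name.

```python
# Values copied from the environment table above: (project, dataset, table prefix).
ENVIRONMENTS = {
    "QA India": ("spotdraft-qa", "qa_india_public", ""),
    "QA EU": ("spotdraft-qa", "qa_eu_public", ""),
    "QA USA": ("spotdraft-qa", "qa_usa_public", ""),
    "Prod India": ("spotdraft-prod", "prod_india_db", "public_"),
    "Prod EU": ("spotdraft-prod", "prod_eu_db", "public_"),
    "Prod USA": ("spotdraft-prod", "prod_usa_db", "public_"),
    "Prod MEA": ("spotdraft-prod", "prod_mea_db", ""),
}


def qualified_table(environment, table):
    """Return a backtick-quoted BigQuery table reference for an environment.

    Assumption: the prefix is concatenated directly in front of the table name.
    """
    project, dataset, prefix = ENVIRONMENTS[environment]
    return f"`{project}.{dataset}.{prefix}{table}`"


print(qualified_table("QA India", "core_asynctask"))
print(qualified_table("Prod India", "core_asynctask"))
```

The first call reproduces the reference used in the QA India queries. The second shows how the `public_` prefix would change the table name under the stated assumption.
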
| Purpose | URL |
|---|---|
| AsyncTask detail | https://api.{cluster}.spotdraft.com/admin/core/asynctask/{task_id}/change/ |
| AsyncTask list by contract | https://api.{cluster}.spotdraft.com/admin/core/asynctask/?q={contract_id} |
| AsyncTask list by workspace | https://api.{cluster}.spotdraft.com/admin/core/asynctask/?created_by_workspace_id={workspace_id} |
| Contract detail | https://api.{cluster}.spotdraft.com/admin/contracts_v3/contractv3/{contract_id}/change/ |
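
The URL patterns above can be filled in with a small helper. This is an illustrative Python sketch: the function names are hypothetical, and `example-cluster` and the numeric IDs are placeholders, not real values.

```python
from urllib.parse import urlencode

# Base pattern taken from the URL table above.
ADMIN_BASE = "https://api.{cluster}.spotdraft.com/admin"


def task_detail_url(cluster, task_id):
    """Admin change page for a single AsyncTask."""
    return f"{ADMIN_BASE.format(cluster=cluster)}/core/asynctask/{task_id}/change/"


def task_list_url(cluster, **filters):
    """AsyncTask list page with query-string filters such as q or created_by_workspace_id."""
    return f"{ADMIN_BASE.format(cluster=cluster)}/core/asynctask/?{urlencode(filters)}"


def contract_detail_url(cluster, contract_id):
    """Admin change page for a single contract."""
    return f"{ADMIN_BASE.format(cluster=cluster)}/contracts_v3/contractv3/{contract_id}/change/"


# Placeholder cluster and IDs for illustration only.
print(task_detail_url("example-cluster", 123))
print(task_list_url("example-cluster", q=456))
print(task_list_url("example-cluster", created_by_workspace_id=789))
print(contract_detail_url("example-cluster", 456))
```
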
Search Sentry for KPExtractionNotAllowedException to find affected contracts: https://spotdraft.sentry.io/issues/?query=KPExtractionNotAllowedException
Use when:
- EXTRACT_SUGGESTIONS, status IN_PROGRESS, started is NULL

Do not use for:
- started IS NOT NULL → check pipeline/beam logs

Get the workspace ID and cluster from support. Check the admin panel: