Triage and fix failing Buildkite CI checks for RisingWave pull requests by extracting concrete failing tests, diffs, and error messages from Buildkite build/job logs and artifacts, then applying focused code or test fixes. Use when users ask to diagnose red Buildkite checks, investigate flaky/failing CI jobs, understand why a Buildkite build failed, or provide Buildkite/GitHub check URLs and want a concrete fix.
Diagnose Buildkite failures programmatically and avoid guessing from UI screenshots. Prefer structured build/job JSON plus artifact inspection to find the exact failing test case and mismatch, then implement the smallest correct fix.
Resolve triage target with this precedence:
- pull-request, main-cron), use the specified scope.

Use bk CLI first. If auth is unavailable, use public Buildkite JSON/log/artifact endpoints via curl.
For exact commands and endpoint patterns, read references/buildkite-ci-triage.md.
When no explicit target is given, find the PR for the current branch first, then run gh pr checks <PR_NUMBER> to find failing checks and capture Buildkite URLs (.../builds/<N>).
If user specifies a branch/pipeline, list and filter builds with bk build list using those parameters.
If user provides a Buildkite build URL, skip discovery and start from that build number.
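When a build URL is supplied, the build number can be pulled out of the .../builds/<N> segment before any fetching happens. The following is a minimal sketch, not part of the referenced command set; the function name and the example organization and pipeline in the usage note are hypothetical.

```python
import re
from typing import Optional

# Matches the ".../builds/<N>" segment of a Buildkite build URL.
_BUILD_RE = re.compile(r"/builds/(\d+)")


def build_number_from_url(url: str) -> Optional[int]:
    """Return the build number found in a Buildkite URL, or None if absent."""
    match = _BUILD_RE.search(url)
    return int(match.group(1)) if match else None
```

For instance, build_number_from_url("https://buildkite.com/example-org/example-pipeline/builds/12345#some-job") yields 12345, and a URL without a builds segment yields None so the caller can fall back to PR-based discovery.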
Fetch builds/<N>.json, then list failed jobs by non-zero exit_status.
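Filtering by non-zero exit_status can be sketched as below, once the build JSON has been downloaded and parsed. This assumes the JSON exposes a top-level "jobs" list whose entries carry "id", "name", and "exit_status"; only exit_status is named in this document, so treat the other keys as assumptions and adjust them to the payload you actually receive.

```python
from typing import Any, Dict, List


def failed_jobs(build: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return a summary of every job whose exit_status is present and non-zero."""
    failed = []
    for job in build.get("jobs", []):  # "jobs" key is an assumption
        raw_status = job.get("exit_status")
        if raw_status is None:
            # Jobs that never ran (waits, blocked steps) have no exit status.
            continue
        try:
            status = int(raw_status)
        except (TypeError, ValueError):
            continue  # Skip values that are not numeric.
        if status != 0:
            failed.append(
                {"id": job.get("id"), "name": job.get("name"), "exit_status": status}
            )
    return failed
```

Passing the result of json.load on the saved build file to failed_jobs gives a short list to iterate over when fetching per-job logs.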
Capture at least:
Fetch each failed job log and search for high-signal patterns:
- query result mismatch
- [Diff] (-expected|+actual)
- query is expected to fail with error:

Stop once you have one concrete failing file/case and mismatch.
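The pattern search above can be sketched as a small scanner over a downloaded job log. The three patterns come from the list above; the function name and the context parameter are illustrative choices, not part of any existing tool.

```python
import re
from typing import List, Tuple

# High-signal patterns from the list above, escaped for use as regexes.
HIGH_SIGNAL_PATTERNS = [
    r"query result mismatch",
    r"\[Diff\] \(-expected\|\+actual\)",
    r"query is expected to fail with error:",
]
_SIGNAL_RE = re.compile("|".join(HIGH_SIGNAL_PATTERNS))


def find_signals(log_text: str, context: int = 2) -> List[Tuple[int, str]]:
    """Return (1-based line number, snippet) for each line matching a pattern.

    The snippet holds the matching line plus up to `context` following lines.
    """
    lines = log_text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if _SIGNAL_RE.search(line):
            snippet = "\n".join(lines[index : index + context + 1])
            hits.append((index + 1, snippet))
    return hits
```

In line with the stop condition above, the first hit is usually enough: it names the line to read around for the failing file and the expected/actual diff.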
If logs only show wrapper errors (for example, command exited with status), inspect artifacts from the same job, especially:
- risedev-logs.zip
- risedev-logs/nodetype-*.log

Extract and search artifact logs for the exact mismatch.
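Searching the artifact does not require unpacking it to disk. The sketch below reads log members straight out of a downloaded zip and keeps the lines containing any of the given substrings. It assumes the archive members are UTF-8-ish text and end in .log; the function name and parameters are hypothetical.

```python
import fnmatch
import zipfile
from typing import Dict, Iterable, List


def grep_artifact_zip(
    zip_path, needles: Iterable[str], member_glob: str = "*.log"
) -> Dict[str, List[str]]:
    """Map each matching archive member to its lines containing any needle.

    zip_path may be a filesystem path or a binary file-like object.
    """
    needles = list(needles)
    results: Dict[str, List[str]] = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # fnmatchcase lets "*" span "/" so nested log paths match.
            if not fnmatch.fnmatchcase(name, member_glob):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            matched = [
                line for line in text.splitlines()
                if any(needle in line for needle in needles)
            ]
            if matched:
                results[name] = matched
    return results
```

Calling grep_artifact_zip("risedev-logs.zip", ["query result mismatch"]) returns only the log files that mention the mismatch, which narrows down which node log to read in full.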
Prefer minimal fixes tied to evidence:
- .slt/.slt.part file only when query output change is intentional.
- 143): treat as infra/cancel unless corroborated by product errors.

Avoid broad "retry and hope" actions without root-cause evidence.
Run the narrowest local check that validates the fix when possible. If full validation is not feasible, state it explicitly.
Always report:
- 105: often wrapper failure from docker-compose/plugin; inspect SLT/e2e logs for true mismatch.
- 4: common in simulation/recovery steps; inspect uploaded simulation logs.
- 143: usually cancellation/termination, not a deterministic product regression.
- raw_log_url may be null in JSON; use explicit job log endpoints by job id.
- jq; avoid scraping large HTML pages.
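The numeric hints above can be kept next to the triage code as a lookup table, so each failed job gets a first-pass note alongside its exit status. This sketch assumes the numbers are job exit codes as reported in exit_status; the wording of each hint restates the list above, and the fallback message is an illustrative default.

```python
# First-pass heuristics keyed by job exit status; these are hints, not verdicts.
EXIT_CODE_HINTS = {
    105: "often a docker-compose/plugin wrapper failure; inspect SLT/e2e logs",
    4: "common in simulation/recovery steps; inspect uploaded simulation logs",
    143: "usually cancellation/termination, not a deterministic regression",
}


def hint_for_exit_status(exit_status: int) -> str:
    """Return the heuristic for a known exit status, or a generic fallback."""
    return EXIT_CODE_HINTS.get(
        exit_status, "no heuristic recorded; read the job log and artifacts"
    )
```

A hint only suggests where to look next. The concrete mismatch from the job log or artifacts remains the evidence a fix should be based on.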