Triages GitHub issues by routing to oncall teams, applying labels, and closing questions. Use when processing new PyTorch issues or when asked to triage an issue.
This skill helps triage GitHub issues by routing issues, applying labels, and leaving first-line responses.
Labels reference: See labels.json for the full catalog of 305 labels suitable for triage. ONLY apply labels that exist in this file. Do not invent or guess label names. This file excludes CI triggers, test configs, release notes, and deprecated labels.
PT2 triage guide: See pt2-triage-rubric.md for detailed labeling guidance when triaging PT2/torch.compile issues.
Response templates: See templates.json for standard response messages.
Use these GitHub MCP tools for triage:
| Tool | Purpose |
|---|---|
| `mcp__github__issue_read` | Get issue details, comments, and existing labels |
| `mcp__github__issue_write` | Apply labels or close issues |
| `mcp__github__add_issue_comment` | Add comment (only for redirecting questions) |
| `mcp__github__search_issues` | Find similar issues for context |
| Prefix/Category | Reason |
|---|---|
| Labels not in labels.json | Only apply labels that exist in the allowlist |
| `ciflow/*` | CI job triggers for PRs only |
| `test-config/*` | Test suite selectors for PRs only |
| `release notes: *` | Auto-assigned for release notes |
| `ci-*`, `ci:*` | CI infrastructure controls |
| `sev*` | Severity labels require human decision |
| `merge blocking` | Requires human decision |
| Any label containing "deprecated" | Obsolete |
| `oncall: releng` | Not a triage redirect target. Use `module: ci` instead |
If blocked: When a label is blocked by the hook, add ONLY `triage review` and stop. A human will handle it.
These rules are enforced by a PreToolUse hook that validates all labels against labels.json.
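The blocked-label rules above could be sketched roughly as follows. This is a hypothetical illustration, not the actual PreToolUse hook: the names `is_blocked`, `labels_to_apply`, and `EXAMPLE_ALLOWLIST` are assumptions, and the real allowlist would be loaded from labels.json, whose schema is not shown here.

```python
# Hypothetical sketch of the blocked-label rules; the real hook may differ.
BLOCKED_PREFIXES = ("ciflow/", "test-config/", "release notes:", "ci-", "ci:", "sev")
BLOCKED_EXACT = {"merge blocking", "oncall: releng"}
FALLBACK_LABEL = "triage review"

# Tiny stand-in for the allowlist (assumption: the real one comes from labels.json).
EXAMPLE_ALLOWLIST = {"module: ci", "triaged", "triage review"}


def is_blocked(label: str, allowlist: set[str]) -> bool:
    """Return True if the label must not be applied during triage."""
    if label not in allowlist:
        return True  # never invent or guess label names
    lowered = label.lower()
    if lowered.startswith(BLOCKED_PREFIXES):
        return True
    if lowered in BLOCKED_EXACT:
        return True
    return "deprecated" in lowered


def labels_to_apply(proposed: list[str], allowlist: set[str]) -> list[str]:
    """If any proposed label is blocked, fall back to only `triage review`."""
    if any(is_blocked(label, allowlist) for label in proposed):
        return [FALLBACK_LABEL]
    return list(proposed)


print(labels_to_apply(["module: ci", "triaged"], EXAMPLE_ALLOWLIST))
# ['module: ci', 'triaged']
print(labels_to_apply(["ciflow/trunk"], EXAMPLE_ALLOWLIST))
# ['triage review']
```

Checking the allowlist first means an unknown label is rejected before any prefix rule is consulted, which mirrors the instruction to apply only labels that exist in labels.json.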
If a human has already applied labels (especially `ci: sev`, severity labels, or priority labels), do NOT remove or replace them. Your job is to supplement, not override.
If an issue already has ANY `oncall:` label, SKIP IT entirely. Do not:

- Add `triaged`

That issue belongs to the sub-oncall team. They own their queue.
- Use the `redirect_to_forum` template from templates.json.
- Use the `request_more_info` template and stop.

Check if the issue body contains links to external files that users would need to download to reproduce.
Patterns to detect:
- Links to `.zip`, `.pt`, `.pth`, `.pkl`, `.safetensors`, `.onnx`, `.bin` files

Action:

- Replace the link with `[Link removed - external file downloads are not permitted for security reasons]`
- Add the `needs reproduction` label
- Use the `needs_reproduction` template from templates.json to request a self-contained reproduction
- Do not add `triaged`; wait for the user to provide a reproducible example

Also add `needs reproduction` when:
If the issue involves extremal values or numerical precision differences:
Patterns to detect:
- `torch.finfo(dtype).max` or `torch.finfo(dtype).min`

IMPORTANT: avoid keyword-triggered mislabeling.
Label based on the root cause, not keywords that appear in the error or title. A keyword tells you what failed, not why.
- An `undefined symbol: ncclAlltoAll` error at `import torch` is a packaging issue (`module: binaries`), not a distributed training bug: the user never ran distributed code.
- `nan` in a parameter name or tolerance check is not `module: NaNs and Infs` unless the bug is actually about NaN propagation.
- A mention of `autograd` does not mean `module: autograd`; check whether the bug is in autograd itself or just on the call path.
- An issue about a failing test (not the underlying functionality) is `module: tests`, not `module: numerical-stability`.

Ask: "Where would the fix need to be made?" That determines the label.
Action:
- Add the `module: edge cases` label
- Add `topic: fuzzer` when applicable
- Use the `numerical_accuracy` template from templates.json to link to the docs

If the issue belongs in another repo (vision/text/audio/RL/ExecuTorch/etc.), transfer the issue and STOP.
PT2 is NOT a redirect. `oncall: pt2` is not like the other `oncall:` labels in Step 3. PT2 issues continue through Steps 4–7 for full triage: add `oncall: pt2`, then proceed to label with `module:` labels, mark `triaged`, etc.
Every `oncall: pt2` issue MUST have at least one `module:` label. The PT2 oncall queue is too broad without a module label; the team needs to know which component is affected (e.g., `module: dynamo`, `module: inductor`, `module: helion`, `module: dynamic shapes`). If you cannot determine the specific module, use `module: compile ux` as a fallback, but always try to be specific first. See pt2-triage-rubric.md for detailed guidance.
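As a minimal sketch of the PT2 module-label rule above, assuming labels are handled as plain strings: the helper name `ensure_pt2_module` is hypothetical and only illustrates the fallback, not how a specific module should be chosen.

```python
# Hypothetical helper illustrating the "every oncall: pt2 issue needs a module: label" rule.
PT2_LABEL = "oncall: pt2"
FALLBACK_MODULE = "module: compile ux"


def ensure_pt2_module(labels: list[str]) -> list[str]:
    """Return labels, adding the fallback module label to PT2 issues that lack one."""
    if PT2_LABEL not in labels:
        return list(labels)  # rule only applies to PT2 issues
    if any(label.startswith("module:") for label in labels):
        return list(labels)  # a specific module label is already present
    return list(labels) + [FALLBACK_MODULE]


print(ensure_pt2_module(["oncall: pt2"]))
# ['oncall: pt2', 'module: compile ux']
print(ensure_pt2_module(["oncall: pt2", "module: dynamo"]))
# ['oncall: pt2', 'module: dynamo']
```

The fallback is a last resort; the rubric in pt2-triage-rubric.md remains the place to look for a more specific module first.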
CRITICAL: When redirecting issues to a non-PT2 oncall queue, apply exactly one `oncall: ...` label and STOP. Do NOT:

- Add `module:` labels
- Add `triaged`

The sub-oncall team will handle their own triage. Your job is only to route it to them.
| Label | When to use |
|---|---|
| `oncall: jit` | TorchScript issues |
| `oncall: distributed` | Distributed training (DDP, FSDP, RPC, c10d, DTensor, DeviceMesh, symmetric memory, context parallel, pipelining) |
| `oncall: export` | `torch.export` issues |
| `oncall: quantization` | Quantization issues |
| `oncall: mobile` | Mobile (iOS/Android), excludes ExecuTorch |
| `oncall: profiler` | Profiler issues (CPU, GPU, Kineto) |
| `oncall: visualization` | TensorBoard integration |
Common routing mistakes to avoid:
- MPS is not `oncall: mobile`. MPS issues stay in the general queue with `module: mps`.
- DTensor belongs to `oncall: distributed`. DTensor issues should always be routed to `oncall: distributed`, even if they don't mention DDP/FSDP.
- ONNX uses `module: onnx`. There is no `oncall: onnx`. Use `module: onnx` and keep in the general queue.
- CI issues use `module: ci`. Do not use `oncall: releng`. Use `module: ci` for CI infrastructure issues.
- When `torch.compile` mishandles a distributed op (e.g., `dist.all_reduce`), the issue typically needs BOTH `oncall: pt2` and `oncall: distributed` since the fix may span both codebases.

Note: `oncall: cpu inductor` is a sub-queue of PT2. For general triage, just use `oncall: pt2`.
Only if the issue stays in the general queue:
- Add `module: ...` labels based on the affected area
- Check labels.json descriptions for guidance on when a specific label supersedes a general one (e.g., `module: sdpa` instead of `module: nn` for SDPA issues, `module: flex attention` instead of `module: nn` for flex attention).
- `feature`: wholly new functionality that does not exist today in any form
- `enhancement`: improvement to something that already works (e.g., adding a native backend kernel for an op that already runs via fallback/composite, performance optimization, better error messages). If the enhancement is about performance, also add `module: performance`.
- `function request`: a new function or new arguments/modes for an existing function
- An improvement to functionality that already exists in some form is `enhancement`, not `feature`

Commonly missed labels (always check for these):
| Condition | Label |
|---|---|
| Segfault, illegal memory access, SIGSEGV | `module: crash` |
| Performance issue: regression, slowdown, or optimization request | `module: performance` |
| Issue on Windows | `module: windows` |
| Previously working feature now broken | `module: regression` |
| Broken docs/links that previously worked | `module: docs` + `module: regression` (NOT `enhancement`) |
| Issue about a test failing (not the underlying functionality) | `module: tests` |
| Backward pass / gradient computation bug | `module: autograd` (in addition to the op's module label) |
| `torch.linalg` ops or linear algebra ops (solve, svd, eig, inv, etc.) | `module: linear algebra` |

`has workaround`: Only add when the workaround is non-trivial and non-obvious. If the issue is "X doesn't work for non-contiguous tensors," calling `.contiguous()` is the tautological inverse of the bug, not a workaround. A real workaround is something like installing a specific package version, adding a synchronization point, inserting `gc.collect()`, or using a different API that isn't obviously implied by the bug description.
Label based on the actual bug, not keywords. Read the issue to understand what is actually broken. A bug about broadcasting that happens to mention "nan" in a parameter name is a frontend bug, not a NaN/Inf bug.
CRITICAL: If you believe an issue is high priority, you MUST:

- Add the `triage review` label and do not add `triaged`

Do NOT directly add `high priority` without human confirmation.
High priority criteria:
The `bot-triaged` label is automatically applied by a post-hook after any issue mutation. You do not need to add it manually.
If not transferred/redirected and not flagged for review, add `triaged`.
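The final check above can be sketched as a small predicate. This is a hedged illustration only: `should_add_triaged` is a hypothetical name, and treating every `oncall:` label other than `oncall: pt2` as a redirect is an assumption drawn from the routing rules rather than a stated implementation.

```python
# Hypothetical sketch of the final "add triaged?" decision.
def should_add_triaged(labels: list[str], transferred: bool) -> bool:
    """Return True only if the issue stayed in the queue and needs no human review."""
    if transferred:
        return False  # issue was moved to another repo
    # Assumption: any oncall label other than PT2 counts as a redirect.
    redirected = any(
        label.startswith("oncall:") and label != "oncall: pt2" for label in labels
    )
    flagged_for_review = "triage review" in labels
    return not (redirected or flagged_for_review)


print(should_add_triaged(["oncall: pt2", "module: dynamo"], transferred=False))
# True
print(should_add_triaged(["oncall: distributed"], transferred=False))
# False
print(should_add_triaged(["module: nn", "triage review"], transferred=False))
# False
```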
DO NOT:
- Add `high priority` directly without human confirmation

DO:

- Add `triage review` for human attention
- Add type labels (`feature`, `enhancement`, `function request`) when confident
- Add the `triaged` label when classification is complete

Note: `bot-triaged` is automatically applied by a post-hook after any issue mutation.