Generate, refactor, or review bioinformatics code -- R scripts, Python modules, Nextflow/Snakemake steps, shell wrappers, SLURM jobs for single-cell and spatial multi-omics. Trigger on: "write a script", "build a module", "implement this step", "create a Nextflow process", "write the R code for", "add QC to this pipeline", "refactor this function", "scaffold this workflow", "write a SLURM script", "create an ArchR analysis", "build the Seurat/SnapATAC2 step", "write the chromVAR analysis", "implement peak calling", "code the motif enrichment", "write the integration step", or any request to produce bioinformatics code. Also trigger when the user pastes bioinformatics code asking for improvements or review. Handles implementation only; use architecture skill for design decisions, debug skill for errors.
Generate correct, modular, reproducible bioinformatics code. This is the
implementation arm -- it writes code. Design -> architecture skill;
errors -> debug skill.
All code style, anti-hallucination, output template, and Definition of Done rules from CLAUDE.md apply without exception. This skill does not restate them. The only addition: before writing framework code, read the reference file first.
| Type | Signal | Behavior |
|---|---|---|
| New module | "write", "build", "create", "implement", "scaffold" | Full output: header, code, QC checkpoint, Architect Notes |
| Extend existing | "add", "integrate", "hook up", "connect" | New code + integration instructions only |
| Refactor | "clean up", "modularize", "refactor", "improve" | Before/after with rationale per change |
| Review | "review", "check", "audit this code" | Structured: correctness -> style -> performance -> reproducibility |
| Quick snippet | "how do I...", "what's the syntax for..." | Minimal code block, no ceremony -- still follow style rules |
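The routing in the table above can be sketched as a keyword lookup. This is only an illustration, assuming whole-word matching on the signal phrases and a most-specific-first rule order; the name `classify_request`, the ordering, and the `"unknown"` fallback are hypothetical and not defined by this skill.

```python
import re

# Signal phrases copied from the table. The order (most specific first) is an
# assumption made for this sketch, not a rule stated by the skill.
SIGNALS = [
    ("Quick snippet", ("how do i", "what's the syntax for")),
    ("Review", ("review", "check", "audit this code")),
    ("Refactor", ("clean up", "modularize", "refactor", "improve")),
    ("Extend existing", ("add", "integrate", "hook up", "connect")),
    ("New module", ("write", "build", "create", "implement", "scaffold")),
]


def classify_request(text: str) -> str:
    """Return the first request type whose signal phrase appears as whole words."""
    lowered = text.lower()
    for request_type, phrases in SIGNALS:
        for phrase in phrases:
            # \b keeps "add" from matching inside words such as "address".
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                return request_type
    return "unknown"  # hypothetical fallback: ask the user instead of guessing
```

A request that carries signals for several types (for example "write" and "review" together) resolves to whichever rule comes first, so real routing should still read the full request rather than rely on keywords alone.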
Before non-trivial code, confirm or infer:
If the user provided enough context, extract rather than re-ask.
When generating non-trivial code, always think one scale tier ahead even if the user did not explicitly ask for optimization. You do not need to implement the optimization directly unless requested, but you must add a short scale-up note.
That note must state current vs next-tier scale, identify the likely OOM and CPU bottleneck steps, and recommend 2-4 concrete mitigations (chunking, sparse/on-disk, pseudobulk, per-sample fanout, stricter filtering).
If no numeric threshold can be defended, say so explicitly and still name the likely failure point and mitigation path.
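One way to ground the scale-up note is a back-of-envelope dense-matrix estimate. The sketch below assumes 8-byte values and a 10x jump between tiers; the helper names `dense_gib` and `scale_note` are hypothetical. Real objects carry extra overhead, so treat the figure as a lower bound, not a defended threshold.

```python
def dense_gib(n_cells: int, n_features: int, bytes_per_value: int = 8) -> float:
    """Memory of a dense cells x features matrix, in GiB."""
    return n_cells * n_features * bytes_per_value / 2**30


def scale_note(n_cells: int, n_features: int, step: str, fix: str,
               factor: int = 10) -> str:
    """Format a one-line scale-up note for the current and next tier.

    `factor` (the jump to the next tier) is an assumption of this sketch.
    """
    next_cells = n_cells * factor
    now = dense_gib(n_cells, n_features)
    nxt = dense_gib(next_cells, n_features)
    return (
        f"# Scale: {n_cells:,} -> {next_cells:,} cells "
        f"(dense: {now:.1f} -> {nxt:.1f} GiB) | OOM risk: {step} | Fix: {fix}"
    )


if __name__ == "__main__":
    # Illustrative numbers only; substitute the sizes of the actual dataset.
    print(scale_note(50_000, 2_000, "dense conversion", "keep matrix sparse"))
```

The estimate covers memory only; the CPU bottleneck step still has to be named from knowledge of the workflow.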
Read the reference file before generating framework-specific code:
| Framework | Reference file |
|---|---|
| ArchR | references/ArchR |
| Seurat | references/seurat-patterns.md (if available) |
| SnapATAC2 | references/snapatac2-patterns.md (if available) |
| Nextflow DSL2 | references/nextflow-dsl2.md (if available) |
| Snakemake | references/snakemake.md (if available) |
| SLURM / HPC | references/slurm-hpc.md (if available) |
If the reference file does not exist, proceed but flag higher uncertainty. The reference file is the source of truth. If memory disagrees, the reference wins.
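The fallback rule above can be sketched as a lookup that returns the reference text together with an uncertainty flag. The function name `load_reference` and the `(text, flag)` return shape are hypothetical; the caller passes whichever path from the table applies.

```python
from pathlib import Path
from typing import Optional, Tuple


def load_reference(path: str, root: Path = Path(".")) -> Tuple[Optional[str], bool]:
    """Return (reference_text, high_uncertainty).

    If the file exists, its text is the source of truth and the flag is False.
    If it is missing, return None and set the flag so the generated output can
    state that it was written without a reference.
    """
    candidate = root / path
    if candidate.is_file():
        return candidate.read_text(encoding="utf-8"), False
    return None, True
```

Returning the flag alongside the text keeps the "proceed but flag higher uncertainty" decision explicit at the call site instead of hiding it in an exception handler.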
Apply the template below to New module and Extend existing requests. Skip it for Review and Quick snippet.
# Module: <name>
# Purpose: <one line>
# Input: <path | format>
# Output: <path | format>
# Deps: <key packages>
# Date: <YYYY-MM-DD>
<implementation>
# QC: <what to inspect> | Expected: <count/range> | Risk: <edge case>
# Scale: <current N -> next tier> | OOM risk: <step> | Fix: <mitigation>
# Notes: <portability or limitation one-liners>
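A filled-in instance of the template might look like the sketch below. Everything in it is hypothetical: the module name, the two-column TSV layout, and the default threshold of 500 are placeholders chosen for illustration, not recommended values.

```python
# Module: filter_cells (hypothetical example)
# Purpose: Keep barcodes whose total counts meet a minimum threshold
# Input:  lines of a TSV with columns barcode, n_counts (assumed layout)
# Output: list of passing barcodes, in input order
# Deps:   Python standard library (csv)
# Date:   <YYYY-MM-DD>
import csv
from typing import Iterable, List


def filter_cells(lines: Iterable[str], min_counts: int = 500) -> List[str]:
    """Return barcodes with n_counts >= min_counts.

    `lines` can be an open file handle or any iterable of TSV lines.
    The default threshold is a placeholder, not a recommendation.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    return [row["barcode"] for row in reader if int(row["n_counts"]) >= min_counts]

# QC: fraction of barcodes retained | Expected: dataset-dependent | Risk: empty result if threshold is too strict
# Scale: single sample -> many samples | OOM risk: none (streams rows) | Fix: per-sample fanout
# Notes: assumes a header row and integer n_counts; malformed rows raise an error
```

Because the function accepts any iterable of lines, it can be exercised on a small in-memory list before pointing it at a real file.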
These patterns are NOT in CLAUDE.md and are specific to this skill:
- proj@projectMetadata$GroupCoverages <- NULL
- FindTransferAnchors() under the hood
- groupList mapping ATAC -> RNA clusters
- addTrajectory() -- requires ordered group specification
- getMonocleTrajectories() -> addMonocleTrajectory() (two-step)
- addSlingShotTrajectories() (single-step)

If Notion/Drive/PubMed/bioRxiv MCP tools are available, search for existing pipeline docs, scripts, or publications before writing new code or implementing a paper's method.