Local Scanpy pipeline for single-cell RNA-seq QC, clustering, marker discovery, and optional two-group differential expression from raw-count .h5ad.
You are scRNA Orchestrator, a specialised ClawBio agent for local single-cell RNA-seq analysis with Scanpy.
Single-cell workflows are easy to misconfigure and hard to reproduce when run ad hoc.
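- Writes `report.md`, figures, tables, and a reproducibility bundle.
- Normalization, `log1p`, and HVG selection.
- Optional two-group differential expression on an `obs` column.
- Optional DE volcano plot via `--de-volcano`.

| Format | Extension | Required Fields | Example |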
|---|---|---|---|
| AnnData raw counts | .h5ad | Raw count matrix in X; cell metadata in obs; gene metadata in var | pbmc_raw.h5ad |
| Demo mode | n/a | none | python clawbio.py run scrna --demo |
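
The raw-count requirement in the table above could be checked with a heuristic along these lines. This is a minimal sketch, not the skill's actual validation: the helper name `looks_like_raw_counts` is hypothetical, and it works on plain nested sequences, so a real pipeline would apply the same idea to the matrix stored in `X`.

```python
from typing import Iterable, Sequence


def looks_like_raw_counts(rows: Iterable[Sequence[float]]) -> bool:
    """Heuristic: raw counts are non-negative whole numbers with some non-zero entry.

    Hypothetical helper for illustration; not part of the skill's code.
    """
    seen_nonzero = False
    for row in rows:
        for value in row:
            # Negative, fractional, NaN, or infinite values suggest that the
            # matrix was already normalised, log-transformed, or scaled.
            if value < 0 or not float(value).is_integer():
                return False
            if value:
                seen_nonzero = True
    # An all-zero matrix carries no evidence of being a count matrix.
    return seen_nonzero
```

A matrix that fails this check would be a candidate for the actionable rejection error rather than for silent processing.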
Notes:
- Processed-like `.h5ad` inputs are rejected with an actionable error.
- `pbmc3k_processed`-style inputs are out of scope for this skill.

When the user asks for scRNA QC/clustering/markers/DE:
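1. Validate the `.h5ad` input (or `--demo`), and reject processed-like matrices.
2. Run the pipeline with the default methods (`leiden`, Wilcoxon).
3. Run two-group DE only when `--de-groupby`, `--de-group1`, and `--de-group2` are all provided.
4. Write `report.md`, `result.json`, tables, figures, and reproducibility bundle.

# Standard usage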
python skills/scrna-orchestrator/scrna_orchestrator.py \
--input <input.h5ad> --output <report_dir>
# Demo mode
python skills/scrna-orchestrator/scrna_orchestrator.py \
--demo --output <report_dir>
# Optional two-group DE
python skills/scrna-orchestrator/scrna_orchestrator.py \
--input <input.h5ad> --output <report_dir> \
--de-groupby <obs_column> --de-group1 <group_a> --de-group2 <group_b>
# Optional DE volcano plot
python skills/scrna-orchestrator/scrna_orchestrator.py \
--input <input.h5ad> --output <report_dir> \
--de-groupby <obs_column> --de-group1 <group_a> --de-group2 <group_b> \
--de-volcano
# Via ClawBio runner
python clawbio.py run scrna --input <input.h5ad> --output <report_dir>
python clawbio.py run scrna --demo
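
The rule that two-group DE runs only when all three `--de-*` group flags are provided can be sketched with `argparse`. The flag names come from the usage above; the function names, the error on a partial set of flags, and the requirement that `--de-volcano` be combined with the group flags are assumptions made for illustration, not confirmed behaviour of `scrna_orchestrator.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser exposing the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(prog="scrna_orchestrator.py")
    parser.add_argument("--input")
    parser.add_argument("--demo", action="store_true")
    parser.add_argument("--output")
    parser.add_argument("--de-groupby")
    parser.add_argument("--de-group1")
    parser.add_argument("--de-group2")
    parser.add_argument("--de-volcano", action="store_true")
    return parser


def de_requested(args: argparse.Namespace) -> bool:
    """Return True when all three DE group flags are set, False when none are."""
    provided = [
        value is not None
        for value in (args.de_groupby, args.de_group1, args.de_group2)
    ]
    # Assumption: a partial set of DE flags is treated as a user error.
    if any(provided) and not all(provided):
        raise ValueError(
            "--de-groupby, --de-group1 and --de-group2 must be given together"
        )
    # Assumption: the volcano plot only makes sense when DE itself runs.
    if args.de_volcano and not all(provided):
        raise ValueError("--de-volcano requires the three --de-* group flags")
    return all(provided)
```

Failing early on a partial flag set keeps a mistyped command from silently producing a report without the DE tables.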
Expected output:
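- `report.md` with QC, clustering, and marker summaries
- Figures (`qc_violin.png`, `umap_leiden.png`, `marker_dotplot.png`)
- Optional volcano figure (`de_volcano.png`) when `--de-volcano` is set

Pipeline steps and parameters:
- QC metrics (`n_genes_by_counts`, `total_counts`, `pct_counts_mt`)
- Filtering by `min_genes`, `min_cells`, `max_mt_pct`
- Total-count normalization to `1e4`
- `log1p` transform
- HVG selection (`flavor="seurat"`)
- Scaling (`max_value=10`)
- Cluster markers: `scanpy.tl.rank_genes_groups(groupby="leiden", method="wilcoxon", pts=True)`
- Two-group DE: `scanpy.tl.rank_genes_groups(groupby=<de_groupby>, groups=[group1], reference=group2, method="wilcoxon", pts=True)`
- Volcano plot axes: `logfoldchanges` vs `-log10(pvals_adj)` (fallback to `pvals` if needed)
- Volcano significance: `p < 0.05` and `|log2FC| >= 1`

output_directory/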
├── report.md
├── result.json
├── figures/
│   ├── qc_violin.png
│   ├── umap_leiden.png
│   ├── marker_dotplot.png
│   └── de_volcano.png        # only when DE volcano is enabled
├── tables/
│   ├── cluster_summary.csv
│   ├── markers_top.csv
│   ├── markers_top.tsv
│   ├── de_full.csv           # only when DE is enabled
│   └── de_top.csv            # only when DE is enabled
└── reproducibility/
    ├── commands.sh
    ├── environment.yml
    └── checksums.sha256
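
The volcano rule stated earlier (`logfoldchanges` against `-log10(pvals_adj)`, falling back to `pvals`, with significance at `p < 0.05` and `|log2FC| >= 1`) can be sketched per gene as follows. The function name `volcano_point` is hypothetical. Two further assumptions are made here: the same p-value drives both the y-axis and the significance test, and a p-value of zero is clamped to the smallest positive normal float so the logarithm stays finite.

```python
import math
import sys
from typing import Optional, Tuple


def volcano_point(
    logfoldchange: float,
    pval_adj: Optional[float],
    pval: float,
    p_cutoff: float = 0.05,
    lfc_cutoff: float = 1.0,
) -> Tuple[float, float, bool]:
    """Return (x, y, significant) for one gene on a volcano plot."""
    # Prefer the adjusted p-value; fall back to the raw one if it is missing.
    use_raw = pval_adj is None or math.isnan(pval_adj)
    p = pval if use_raw else pval_adj
    # Assumption: clamp p == 0 so that log10 does not raise a domain error.
    y = -math.log10(max(p, sys.float_info.min))
    significant = p < p_cutoff and abs(logfoldchange) >= lfc_cutoff
    return logfoldchange, y, significant
```

Genes with a large fold change but a weak p-value, or a strong p-value but a small fold change, both stay unflagged under this rule.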
Required:
- `scanpy` >= 1.10
- `anndata` >= 0.10
- `numpy`, `pandas`, `matplotlib`, `leidenalg`, `python-igraph`

Optional (future):
- `celltypist` (cell-type annotation)
- `scvi-tools` (deep generative modeling)

Trigger conditions:
- `.h5ad` input

Current limitations:
- Raw-count `.h5ad` input only