Default scRNA counting route. Turn FASTQ or existing Cell Ranger, STARsolo, SimpleAF / Alevin-fry, or kb-python outputs into a downstream-ready standardized AnnData.
You are SC Count, a specialized OmicsClaw agent for converting scRNA-seq FASTQ input into standardized count matrices and a downstream-ready AnnData.
h5ad, basic count QC figures, and preserves backend artifacts.

- processed.h5ad
- layers['counts']
- README.md, report.md, result.json, and rerun commands.

| Format | Extension | Required Fields / Structure | Example |
|---|---|---|---|
| FASTQ directory | directory | one or more 10x-style FASTQs, optionally multiple samples | fastqs/ |
| FASTQ file | .fastq.gz | one mate file plus --read2 for paired input | PBMC_R1.fastq.gz |
| Cell Ranger output | directory | contains outs/filtered_feature_bc_matrix* | sample_count/ |
| STARsolo output | directory | contains Solo.out/Gene/filtered/ | starsolo_pbmc/ |
| Demo | n/a | --demo flag | built-in PBMC example |
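
The structural hints in the table above can be turned into a small path classifier. The following is a minimal sketch, not the wrapper's actual detection logic: the function name `classify_input` and its return labels are hypothetical, and it assumes FASTQs sit at the top level of the directory.

```python
from pathlib import Path


def classify_input(path: str) -> str:
    """Guess which input format from the table a path matches (hypothetical helper)."""
    p = Path(path)
    if p.is_file():
        # A single mate file; the other mate would be passed via --read2.
        return "fastq_file" if p.name.endswith(".fastq.gz") else "unknown"
    if p.is_dir():
        # Cell Ranger output: contains outs/filtered_feature_bc_matrix*
        if any(p.glob("outs/filtered_feature_bc_matrix*")):
            return "cellranger_output"
        # STARsolo output: contains Solo.out/Gene/filtered/
        if (p / "Solo.out" / "Gene" / "filtered").is_dir():
            return "starsolo_output"
        # FASTQ directory: assumed here to hold FASTQs at the top level.
        if any(p.glob("*.fastq.gz")):
            return "fastq_dir"
    return "unknown"
```

Checking the backend layouts before the FASTQ glob means a finished run directory that also happens to contain FASTQs is still imported rather than recounted.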

| Requirement | Where it should exist | Why it matters |
|---|---|---|
| Raw FASTQs or existing count output | input path | counting requires raw reads or a backend output directory to import |
| Backend reference | --reference | required for real backend runs |
| 10x chemistry contract | --chemistry, --whitelist | required for STARsolo barcode/UMI parsing |
| Transcript-to-gene map | --t2g | required for kb-python runs |

If the user is missing reference assets, guide them to either:

- pass explicit paths with --reference, --whitelist, or --t2g, or
- place the assets under the matching directory:
  - resources/singlecell/references/cellranger/
  - resources/singlecell/references/starsolo/
  - resources/singlecell/references/simpleaf/
  - resources/singlecell/references/kb/
  - resources/singlecell/references/whitelists/

Download guidance:

- 3M-february-2018.txt
- t2g

README.md, report.md, result.json, and the reproducibility bundle.

oc run sc-count --input fastqs/ --method cellranger --reference /path/to/refdata-gex-GRCh38-2020-A --output results/
oc run sc-count --input fastqs/ --method starsolo --reference /path/to/star_index --chemistry 10xv3 --whitelist /path/to/3M-february-2018.txt --output results/
oc run sc-count --input fastqs/ --method simpleaf --reference /path/to/simpleaf_index --chemistry 10xv3 --output results/
oc run sc-count --input fastqs/ --method kb_python --reference /path/to/kallisto.idx --t2g /path/to/t2g.txt --chemistry 10xv3 --output results/
oc run sc-count --input sample_count/ --method cellranger --output results/
python skills/singlecell/scrna/sc-count/sc_count.py --demo --output /tmp/sc_count_demo

- cellranger count with --nosecondary and BAM output kept on.
- layers['counts'], stabilize names, and record OmicsClaw input-contract metadata.
- CB_UMI_Simple wrapper with EmptyDrops-style cell calling.
- Solo.out/Gene/filtered.
- layers['counts'] and record the same OmicsClaw contract as the Cell Ranger path.
- simpleaf quant when needed.
- kb count when needed.

Key parameters:

| Parameter | Default | Description |
|---|---|---|
| --method | cellranger | choose Cell Ranger, STARsolo, SimpleAF / Alevin-fry, or kb-python |
| --reference | none | backend reference directory or index path |
| --t2g | none | transcript-to-gene map for kb-python |
| --sample | none | choose one sample from a multi-sample FASTQ directory |
| --threads | 8 | thread count for backend execution |
| --chemistry | auto | Cell Ranger auto-detects; STARsolo requires explicit supported chemistry |
| --whitelist | none | STARsolo barcode whitelist file |

Current OmicsClaw behavior: all non-Cell Ranger methods are intentionally scoped to mainstream 10x-style droplet geometry first. Complex custom chemistries remain deferred.
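
To make the STARsolo step concrete (a CB_UMI_Simple run with EmptyDrops-style cell calling, an explicit chemistry, and a whitelist), here is a minimal sketch of how such a command line could be assembled. The function `build_starsolo_command` and the `CHEMISTRY_GEOMETRY` table are hypothetical and not taken from the OmicsClaw source; the STAR options are standard STARsolo flags, and the barcode/UMI lengths assume the usual 10x layouts (16 bp barcode, 10 bp UMI for v2, 12 bp for v3 and v4). The real wrapper may pass different or additional options.

```python
from typing import List

# Assumed 10x read-1 layouts; not read from the OmicsClaw source.
CHEMISTRY_GEOMETRY = {
    "10xv2": {"cb_len": 16, "umi_len": 10},
    "10xv3": {"cb_len": 16, "umi_len": 12},
    "10xv4": {"cb_len": 16, "umi_len": 12},
}


def build_starsolo_command(read1: str, read2: str, reference: str, whitelist: str,
                           chemistry: str, out_prefix: str, threads: int = 8) -> List[str]:
    """Assemble a STARsolo argv list for 10x-style droplet data (illustrative only)."""
    if chemistry not in CHEMISTRY_GEOMETRY:
        # STARsolo needs an explicit chemistry; "auto" is not resolvable here.
        raise ValueError(f"unsupported chemistry for STARsolo: {chemistry!r}")
    geom = CHEMISTRY_GEOMETRY[chemistry]
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", reference,
        # STARsolo takes the cDNA read first, then the barcode/UMI read.
        "--readFilesIn", read2, read1,
        "--readFilesCommand", "zcat",
        "--soloType", "CB_UMI_Simple",
        "--soloCBwhitelist", whitelist,
        "--soloCBstart", "1",
        "--soloCBlen", str(geom["cb_len"]),
        "--soloUMIstart", str(geom["cb_len"] + 1),
        "--soloUMIlen", str(geom["umi_len"]),
        # EmptyDrops-style cell calling, as described for the STARsolo path.
        "--soloCellFilter", "EmptyDrops_CR",
        "--outFileNamePrefix", out_prefix,
    ]
```

Returning a list rather than a shell string lets the caller hand it directly to `subprocess.run` without quoting issues, and the same list can be written to a rerun script.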

- figure_data/.
- figures/manifest.json records the standard counting gallery.

output_directory/
├── README.md
├── report.md
├── result.json
├── processed.h5ad
├── figures/
│ ├── barcode_rank.png
│ ├── count_distributions.png
│ ├── count_complexity_scatter.png
│ └── manifest.json
├── tables/
│ ├── count_summary.csv
│ ├── barcode_metrics.csv
│ └── backend_summary.csv
├── figure_data/
│ ├── manifest.json
│ ├── count_summary.csv
│ └── barcode_metrics.csv
├── artifacts/
│ ├── cellranger/
│ ├── starsolo/
│ ├── simpleaf/
│ └── kb_python/
└── reproducibility/
  ├── analysis_notebook.ipynb
  ├── commands.sh
  └── requirements.txt
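
A run can be sanity-checked against the layout above by looking for the files it lists. The following is a minimal sketch: `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names, and the list covers only backend-independent files from the tree, not the per-backend artifacts/ subdirectories.

```python
from pathlib import Path
from typing import List

# Backend-independent files taken from the output tree above.
EXPECTED_OUTPUTS = [
    "README.md",
    "report.md",
    "result.json",
    "processed.h5ad",
    "figures/manifest.json",
    "tables/count_summary.csv",
    "tables/barcode_metrics.csv",
    "tables/backend_summary.csv",
    "figure_data/manifest.json",
    "reproducibility/commands.sh",
]


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected relative paths that are absent from output_dir."""
    root = Path(output_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).exists()]
```

An empty list means every listed file is present; it says nothing about whether the contents are valid.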

artifacts/ for methods such as CellBender or RNA-velocity preparation.

- knowledge_base/knowhows/KH-sc-count-guardrails.md: short execution guardrails for import-vs-run decisions and backend boundaries.
- knowledge_base/skill-guides/singlecell/sc-count.md: longer operator guide for count-method choice and downstream hand-off.

Required:

- scanpy, anndata, pandas, numpy

Optional but method-specific:

- cellranger
- STAR
- simpleaf
- kb

pip install -e ".[singlecell-upstream]"

- cellranger
- STAR
- simpleaf
- kb

Recommended user guidance:

- resources/singlecell/references/... or accept explicit local paths, but should not silently download large references during execution.

| Flag | Type | Default | Description | Validation |
|---|---|---|---|---|
| --input | str | None | FASTQ path or existing Cell Ranger / STARsolo output directory | Required unless --demo |
| --output | str | n/a | Output directory | Required |
| --demo | flag | off | Run with built-in demo data | n/a |
| --method | str | cellranger | Counting backend | Choices: cellranger, starsolo, simpleaf, kb_python |
| --reference | str | None | Backend reference path (Cell Ranger transcriptome, STAR genome dir, simpleaf index, or kallisto index) | Required for real backend runs |
| --t2g | str | None | Transcript-to-gene map for kb-python runs | Required for kb_python method |
| --sample | str | None | Choose one sample from a multi-sample FASTQ directory | n/a |
| --read2 | str | None | Explicit mate FASTQ when --input points to one file | n/a |
| --threads | int | 8 | Backend thread count | n/a |
| --chemistry | str | auto | Chemistry hint; STARsolo supports 10xv2, 10xv3, 10xv4; Cell Ranger auto-detects | n/a |
| --whitelist | str | None | STARsolo barcode whitelist file | Strongly recommended for STARsolo real runs |
| --r-enhanced | flag | off | Accepted for CLI consistency; no R Enhanced plots for this skill | No-op |

This skill has no R Enhanced plots. The --r-enhanced flag is accepted for CLI consistency but produces no additional output.

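
The --sample flag picks one sample out of a multi-sample FASTQ directory. As a rough illustration of how samples might be told apart, the sketch below groups file names by the standard Illumina/10x naming pattern (`<sample>_S<n>_L<lane>_<read>_001.fastq.gz`). The helper `group_fastqs_by_sample` is hypothetical, and whether OmicsClaw uses this exact pattern is an assumption.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Standard bcl2fastq / 10x file naming; assumed, not confirmed for OmicsClaw.
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S\d+_L\d{3}_(?P<read>[RI][12])_001\.fastq\.gz$"
)


def group_fastqs_by_sample(filenames: Iterable[str]) -> Dict[str, Dict[str, List[str]]]:
    """Map sample name -> read type (R1, R2, I1, I2) -> sorted file names."""
    groups: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for name in sorted(filenames):
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # skip files that do not follow the naming pattern
        groups[match["sample"]][match["read"]].append(name)
    return {sample: dict(reads) for sample, reads in groups.items()}
```

If the resulting dictionary has more than one key, the directory holds several samples and a --sample value would be needed to select one.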
Each backend requires its own reference assets. OmicsClaw does not download references automatically.
| Backend | --reference points to | Also needs |
|---|---|---|
| cellranger | Cell Ranger transcriptome dir (e.g., refdata-gex-GRCh38-2020-A) | n/a |
| starsolo | STAR genome directory (built from FASTA + GTF) | --whitelist (10x barcode list), --chemistry |
| simpleaf | simpleaf index directory | --chemistry |
| kb_python | kallisto index file (.idx) | --t2g (transcript-to-gene map), --chemistry |

Place reference assets under resources/singlecell/references/<backend>/ for auto-detection, or pass explicit paths via --reference.

If a Cell Ranger or STARsolo run has already completed, pass the existing output directory as --input and omit --reference. The wrapper imports the filtered matrix directly without re-running the backend.
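
The per-backend requirements above can be expressed as a small pre-flight check. This is a minimal sketch: `REQUIRED_FLAGS` and `missing_flags` are hypothetical names rather than part of the OmicsClaw CLI, the check follows the backend table (so it treats --whitelist as required for STARsolo), and it ignores auto-detection under resources/singlecell/references/.

```python
from typing import Dict, List, Optional

# Flags each backend needs for a real run, per the backend table.
REQUIRED_FLAGS = {
    "cellranger": ["reference"],
    "starsolo": ["reference", "whitelist", "chemistry"],
    "simpleaf": ["reference", "chemistry"],
    "kb_python": ["reference", "t2g", "chemistry"],
}


def missing_flags(method: str, args: Dict[str, Optional[str]],
                  import_existing: bool = False) -> List[str]:
    """List the CLI flags still needed before the chosen backend can run."""
    if method not in REQUIRED_FLAGS:
        raise ValueError(f"unknown method: {method!r}")
    if import_existing:
        # Importing a finished run needs no reference assets.
        return []
    missing = []
    for flag in REQUIRED_FLAGS[method]:
        value = args.get(flag)
        # "auto" chemistry is the CLI default and is treated as unset here,
        # because only Cell Ranger auto-detects chemistry.
        if value is None or (flag == "chemistry" and value == "auto"):
            missing.append("--" + flag)
    return missing
```

For example, a STARsolo request carrying only a reference and the default chemistry would report `--whitelist` and `--chemistry` as missing, which is the point at which the user can be guided to supply them.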
# Cell Ranger — run from FASTQ
python omicsclaw.py run sc-count \
--input fastqs/ --method cellranger \
--reference /path/to/refdata-gex-GRCh38-2020-A \
--output results/
# STARsolo — run from FASTQ
python omicsclaw.py run sc-count \
--input fastqs/ --method starsolo \
--reference /path/to/star_index \
--chemistry 10xv3 \
--whitelist /path/to/3M-february-2018.txt \
--output results/
# kb-python — run from FASTQ
python omicsclaw.py run sc-count \
--input fastqs/ --method kb_python \
--reference /path/to/kallisto.idx \
--t2g /path/to/t2g.txt \
--chemistry 10xv3 \
--output results/
# Import existing Cell Ranger output
python omicsclaw.py run sc-count \
--input sample_count/ --method cellranger \
--output results/

- sc-fastq-qc for read-level quality assessment before counting
- sc-qc for quality assessment, or sc-multi-count if merging multiple samples