Filter cells and genes from single-cell RNA-seq AnnData objects using QC-derived thresholds or tissue presets. This wrapper removes low-quality cells/genes but does not normalize, cluster, or annotate the dataset.
Outputs: figures/, processed.h5ad, filter-stat tables, result JSON, README, and notebook artifacts.
This skill currently exposes one public workflow: threshold_filtering.
This skill does:
This skill does not:
| Format | Extension / form | Current wrapper support | Notes |
|---|---|---|---|
| AnnData | .h5ad | yes | preferred input path |
| Demo | --demo | yes | bundled fallback |
- Input: .h5ad
- QC columns: n_genes_by_counts, total_counts, pct_counts_mt
- Upstream QC skill: sc-qc
- Gene-level threshold: min_cells
- Outputs: processed.h5ad, figures, figure-data CSVs, summary tables, report.md, and result.json.

python skills/singlecell/scrna/sc-filter/sc_filter.py \
--input <data.h5ad> --output <dir>
python skills/singlecell/scrna/sc-filter/sc_filter.py \
--input <data.h5ad> --tissue pbmc --output <dir>
python skills/singlecell/scrna/sc-filter/sc_filter.py \
--input <data.h5ad> --min-genes 200 --max-genes 6000 \
--max-mt-percent 15 --min-cells 3 --output <dir>
| Flag | Default | Meaning |
|---|---|---|
| --min-genes | 200 | Minimum detected genes per retained cell |
| --max-genes | none | Optional upper bound on detected genes per cell |
| --min-counts | none | Optional lower bound on UMI counts per cell |
| --max-counts | none | Optional upper bound on UMI counts per cell |
| --max-mt-percent | 20.0 | Maximum mitochondrial percentage |
| --min-cells | 3 | Minimum number of cells expressing a retained gene |
| --tissue | none | OmicsClaw preset thresholds such as pbmc, brain, or tumor |
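To make the flag semantics above concrete, here is a minimal pure-Python sketch of how the cell-level and gene-level thresholds could combine. It is not the wrapper's implementation: the names `Thresholds`, `keep_cell`, and `keep_gene` are hypothetical, and treating every bound as inclusive is an assumption. Only the defaults and the QC column names come from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Thresholds:
    """Cutoffs mirroring the flag table; defaults copied from it."""
    min_genes: int = 200
    max_genes: Optional[int] = None
    min_counts: Optional[int] = None
    max_counts: Optional[int] = None
    max_mt_percent: float = 20.0
    min_cells: int = 3


def keep_cell(cell: dict, t: Thresholds) -> bool:
    """Return True if a cell passes every active threshold.

    Assumption: all bounds are inclusive; the real wrapper may differ.
    """
    if cell["n_genes_by_counts"] < t.min_genes:
        return False
    if t.max_genes is not None and cell["n_genes_by_counts"] > t.max_genes:
        return False
    if t.min_counts is not None and cell["total_counts"] < t.min_counts:
        return False
    if t.max_counts is not None and cell["total_counts"] > t.max_counts:
        return False
    return cell["pct_counts_mt"] <= t.max_mt_percent


def keep_gene(n_cells_expressing: int, t: Thresholds) -> bool:
    """Return True if the gene is expressed in at least min_cells cells."""
    return n_cells_expressing >= t.min_cells


# Illustrative per-cell QC records (made-up values, not real data).
cells = [
    {"n_genes_by_counts": 1500, "total_counts": 4000, "pct_counts_mt": 5.0},
    {"n_genes_by_counts": 120, "total_counts": 300, "pct_counts_mt": 3.0},    # too few genes
    {"n_genes_by_counts": 900, "total_counts": 2500, "pct_counts_mt": 35.0},  # high MT%
]
kept = [keep_cell(c, Thresholds()) for c in cells]
print(kept)  # [True, False, False]
```

Optional bounds left at `None` are simply skipped, which matches the "none" defaults in the table. Whether gene-level counts are recomputed after cell filtering is not stated here, so `keep_gene` takes the count as an input rather than deriving it.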
Current OmicsClaw threshold_filtering always:
- min_cells

Important implementation note:
tissue is an OmicsClaw wrapper preset, not an upstream Scanpy parameter.

Successful runs write:
- processed.h5ad
- report.md
- result.json
- figures/
- tables/filter_stats.csv
- tables/filter_summary.csv
- tables/retention_summary.csv
- figure_data/
- reproducibility/commands.sh

The current wrapper writes a standard recipe-driven gallery:
- figures/filter_comparison.png
- figures/filter_summary.png

1. report.md
2. figures/filter_comparison.png
3. tables/filter_stats.csv
4. figures/filter_summary.png
5. processed.h5ad

README.md and notebook-style reproducibility artifacts when notebook export dependencies are available.

--tissue presets are used.

Doublets are removed by default when predicted_doublet or doublet_score columns (written by sc-doublet-detection) are present in adata.obs. Pass --no-remove-doublets to disable. Use --doublet-score-threshold (default 0.25) to tune the score cutoff when only doublet_score is available.

- knowledge_base/knowhows/KH-sc-filter-guardrails.md
- knowledge_base/skill-guides/singlecell/sc-filter.md

- Upstream: sc-qc for QC visualization (recommended), or directly from sc-count / sc-multi-count
- Downstream: sc-preprocessing for normalization, HVG selection, and PCA
- Related: sc-doublet-detection or sc-ambient-removal

| Flag | Type | Default | Description | Validation |
|---|---|---|---|---|
| --input | path | - | Input AnnData file (.h5ad); required unless --demo | - |
| --output | path | - | Output directory (required) | - |
| --demo | flag | false | Run with built-in demo data | - |
| --min-genes | int | 200 | Minimum detected genes per retained cell | Must be >= 0; must be <= --max-genes when both set |
| --max-genes | int | none | Maximum detected genes per cell (optional upper bound) | Must be >= 0 |
| --min-counts | int | none | Minimum UMI counts per cell (optional lower bound) | Must be >= 0; must be <= --max-counts when both set |
| --max-counts | int | none | Maximum UMI counts per cell (optional upper bound) | Must be >= 0 |
| --max-mt-percent | float | 20.0 | Maximum mitochondrial percentage per cell | Must be in [0, 100] |
| --min-cells | int | 3 | Minimum cells expressing a retained gene | Must be >= 0 |
| --tissue | enum | none | OmicsClaw tissue preset: pbmc, brain, tumor, heart, kidney, liver, lung | - |
| --no-remove-doublets | flag | off | Disable automatic doublet removal (doublets are removed by default when predicted_doublet / doublet_score columns are present) | - |
| --doublet-score-threshold | float | 0.25 | Score cutoff when only doublet_score is available (requires upstream sc-doublet-detection) | Must be in [0, 1] |
| --r-enhanced | flag | false | Generate R Enhanced figures via ggplot2 renderers | - |
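The doublet flags in the table above imply a small decision rule, sketched below in plain Python. This is an illustration under stated assumptions, not the wrapper's code. The function name `is_doublet` is hypothetical. Letting `predicted_doublet` take precedence over `doublet_score` is inferred from the "when only doublet_score is available" wording. Treating the cutoff as a strict `>` comparison is an assumption.

```python
from typing import Optional


def is_doublet(
    predicted_doublet: Optional[bool],
    doublet_score: Optional[float],
    remove_doublets: bool = True,
    score_threshold: float = 0.25,
) -> bool:
    """Decide whether one cell would be dropped as a doublet.

    None stands for "this column is absent from adata.obs".
    """
    # Validation rule from the flag table: threshold must lie in [0, 1].
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be in [0, 1]")
    # Corresponds to passing --no-remove-doublets.
    if not remove_doublets:
        return False
    # Assumption: an explicit call from sc-doublet-detection wins.
    if predicted_doublet is not None:
        return bool(predicted_doublet)
    # Fall back to the score only when no prediction column exists.
    if doublet_score is not None:
        return doublet_score > score_threshold  # assumption: strict comparison
    # Neither column present: nothing to remove.
    return False


print(is_doublet(True, 0.10))                         # True
print(is_doublet(None, 0.30))                         # True
print(is_doublet(None, 0.20))                         # False
print(is_doublet(True, 0.90, remove_doublets=False))  # False
```

When neither column is present the rule leaves every cell in place, consistent with doublet removal depending on upstream sc-doublet-detection output.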
| Renderer | Output file | What it shows | R packages |
|---|---|---|---|
| plot_feature_violin | r_feature_violin.png | Violin plots of QC metrics (genes, counts, MT%) before/after filtering | ggplot2, ggridges, cowplot |
| plot_cell_barplot | r_cell_barplot.png | Cell count bar chart showing retained vs removed cells | ggplot2, cowplot |