Master skill for generating immunopipe pipeline configurations. Determines pipeline architecture based on data type (scRNA-seq with or without scTCR/BCR-seq) and analysis requirements. Routes to individual process skills for detailed configuration. Use this skill when starting a new immunopipe configuration or modifying pipeline-level options.
Purpose: Master skill for generating immunopipe pipeline configurations. Routes to individual process skills and determines pipeline architecture based on analysis requirements.
Ask the user about their data:
- Do you have scRNA-seq data?
- Do you have scTCR-seq or scBCR-seq data?
- Is your RNA data already processed in a Seurat object?
  - If yes: use LoadingRNAFromSeurat instead of SampleInfo + SeuratPreparing.
  - If no: use SampleInfo.

Ask what analyses they want to perform:
| Goal | Required Processes | Routing |
|---|---|---|
| Basic clustering & visualization | SampleInfo, SeuratPreparing, SeuratClustering, SeuratClusterStats | Use sampleinfo, seuratpreparing, seuratclustering, seuratclusterstats skills |
| T/B cell selection | Add TOrBCellSelection | Use torbcellselection skill |
| Cell type annotation | Add CellTypeAnnotation or SeuratMap2Ref | Use celltypeannotation or seuratmap2ref skills |
| Marker finding | Add ClusterMarkers or MarkersFinder | Use clustermarkers or markersfinder skills |
| TCR clonotype analysis | Add CDR3Clustering, TESSA, ClonalStats | Use cdr3clustering, tessa, clonalstats skills |
| Cell-cell communication | Add CellCellCommunication | Use cellcellcommunication skill |
| Pathway enrichment | Add ScFGSEA | Use scfgsea skill |
| Metabolic analysis | Add ScrnaMetabolicLandscape | Use scrnametaboliclandscape skill |
| Differential expression | Add PseudoBulkDEG | Use pseudobulkdeg skill |
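
As a hedged sketch of how rows in this table translate into configuration: assuming, as in the minimal example later in this skill, that an optional process is enabled by declaring its section, a request for marker finding plus cell-cell communication might start from the fragment below. The pipeline name and sample sheet path are hypothetical, and each process will usually need further options from its own skill.

```toml
# Hypothetical pipeline name, for illustration only
name = "markers_and_communication"

# Entry process; the sample sheet path is a placeholder
[SampleInfo.in]
infile = ["sample_info.txt"]

# Optional processes taken from the table, enabled by declaring their sections
[ClusterMarkers]
[CellCellCommunication]
```

Goals that are not requested simply have no section, which keeps the configuration limited to what the user asked for.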
Essential Processes (always needed for TCR route):
- SampleInfo (or LoadingRNAFromSeurat)
- ScRepLoading (if TCR/BCR data present)
- SeuratPreparing (unless loading from a prepared Seurat object)
- SeuratClustering
- SeuratClusterStats

Essential Processes (RNA-only route):
- SampleInfo (or LoadingRNAFromSeurat)
- SeuratPreparing
- SeuratClustering
- SeuratClusterStats

Optional Processes (enable only if requested):
- TOrBCellSelection - T/B cell separation
- SeuratClusteringOfAllCells - Clustering before T/B selection
- ClusterMarkersOfAllCells - Markers before T/B selection
- TopExpressingGenesOfAllCells - Top genes before T/B selection
- CellTypeAnnotation - Automated cell type annotation
- SeuratMap2Ref - Reference-based annotation
- SeuratSubClustering - Sub-clustering analysis
- ClusterMarkers - Differential expression between clusters
- TopExpressingGenes - Top expressed genes per cluster
- MarkersFinder - Flexible marker finding
- ModuleScoreCalculator - Module/pathway scoring
- ScRepCombiningExpression - TCR + RNA integration
- CDR3Clustering - TCR CDR3 clustering
- TESSA - TCR-specific analysis
- CDR3AAPhyschem - CDR3 physicochemical properties
- ClonalStats - Clonality statistics
- CellCellCommunication - Ligand-receptor analysis
- CellCellCommunicationPlots - Communication plots
- ScFGSEA - Fast gene set enrichment
- PseudoBulkDEG - Pseudo-bulk differential expression
- ScrnaMetabolicLandscape - Comprehensive metabolic analysis

Pipeline-level options:
name = "my_pipeline" # Pipeline name (affects workdir and outdir)
outdir = "./output" # Output directory (default: ./<name>-output)
loglevel = "info" # Logging level: debug, info, warning, error
forks = 4 # Number of parallel jobs (adjust based on CPU cores)
cache = true # Enable caching (recommended)
error_strategy = "halt" # halt, ignore, or retry
num_retries = 3 # Number of retries if error_strategy = "retry"
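
To illustrate how the last two options interact: per the comment above, num_retries only takes effect when error_strategy is set to "retry". A minimal sketch of a retry setup, with the retry count chosen purely for illustration:

```toml
error_strategy = "retry"  # Retry a failed job instead of halting immediately
num_retries = 3           # Upper bound on retries per job
```

With the default "halt" strategy shown above, the num_retries value is present but has no effect.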
Local execution (default):
scheduler = "local"
SLURM cluster:
scheduler = "slurm"
[scheduler_opts]
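# Note: SLURM submits jobs with sbatch, not qsub; verify these option names against your pipen scheduler documentation
qsub_opts = "-p general -q general -N {job.name} -t {job.index}"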
SGE cluster:
scheduler = "sge"
[scheduler_opts]
qsub_opts = "-V -cwd -j yes"
Google Cloud Batch:
# Use: immunopipe gbatch instead of immunopipe
# See gbatch skill for configuration
[plugin_opts.report]
filters = ["name:Filter"] # Filter processes in report
[plugin_opts.runinfo]
# Runinfo plugin enabled by default
When user needs specific process configuration, route to the appropriate skill:
- sampleinfo skill
- loadingrnafromseurat skill
- screploading skill
- seuratpreparing skill
- seuratclustering skill
- seuratclusteringofallcells skill
- seuratsubclustering skill
- torbcellselection skill
- celltypeannotation skill
- seuratmap2ref skill
- clustermarkers skill
- clustermarkersofallcells skill
- markersfinder skill
- topexpressinggenes skill
- topexpressinggenesofallcells skill
- screpcombiningexpression skill
- cdr3clustering skill
- tessa skill
- cdr3aaphyschem skill
- clonalstats skill
- modulescorecalculator skill
- cellcellcommunication skill
- cellcellcommunicationplots skill
- seuratclusterstats skill
- scfgsea skill
- pseudobulkdeg skill
- scrnametaboliclandscape skill

A complete TOML configuration file has three sections:
# 1. PIPELINE-LEVEL OPTIONS
name = "my_pipeline"
outdir = "./output"
forks = 4
# 2. PROCESS-LEVEL OPTIONS
[ProcessName]
cache = true
forks = 2 # Override pipeline-level forks for this process
[ProcessName.in]
# Input files specification
[ProcessName.envs]
# Environment variables (process parameters)
# 3. GOOGLE BATCH OPTIONS (if using immunopipe gbatch)
[cli-gbatch]
project = "my-gcp-project"
region = "us-central1"
User request: "I have scRNA-seq and scTCR-seq data. I want basic analysis with T cell selection."
Response:
- Essential processes: SampleInfo, ScRepLoading, SeuratPreparing, SeuratClustering, SeuratClusterStats
- Optional processes: SeuratClusteringOfAllCells, TOrBCellSelection
- Route to the sampleinfo skill to configure input files

Minimal config:
name = "tcr_analysis"
forks = 4
[SampleInfo.in]
infile = ["sample_info.txt"]
[SeuratClusteringOfAllCells]
[TOrBCellSelection]
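
For comparison, the RNA-only request in the next example could be sketched along the same lines. This assumes that sample_info.txt describes RNA data only and that each optional process still needs its own options from the corresponding skill; the pipeline name is hypothetical.

```toml
# Hypothetical pipeline name, for illustration only
name = "rna_analysis"
forks = 4

# Entry process; the sample sheet is assumed to list RNA data only
[SampleInfo.in]
infile = ["sample_info.txt"]

# Optional processes for annotation, marker finding, and pathway enrichment
[CellTypeAnnotation]
[ClusterMarkers]
[ScFGSEA]
```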
User request: "RNA-only data. I need clustering, cell type annotation, marker finding, and pathway enrichment."
Response:
- Essential processes: SampleInfo, SeuratPreparing, SeuratClustering, SeuratClusterStats
- Optional processes: CellTypeAnnotation, ClusterMarkers, ScFGSEA

User request: "I already have a processed Seurat object. I want to run TCR analysis."
Response:
- Use LoadingRNAFromSeurat instead of SampleInfo + SeuratPreparing
- Continue with ScRepLoading, SeuratClustering, etc.
- Set prepared = true in LoadingRNAFromSeurat to skip preprocessing

Some processes have dependencies:
- ScRepCombiningExpression requires both ScRepLoading and RNA input
- ClusterMarkers requires SeuratClustering
- TOrBCellSelection usually follows SeuratClusteringOfAllCells
- CellCellCommunication requires clustering to be complete

Entry points and overlapping processes:
- Use SampleInfo OR LoadingRNAFromSeurat as the entry point (not both)
- If using TOrBCellSelection, typically enable SeuratClusteringOfAllCells first
- CellTypeAnnotation and SeuratMap2Ref serve similar purposes (both can be used, but one is usually sufficient)

Caching:
- Set cache = "force" at pipeline level to reuse all previous results
- Set cache = false for a specific process to force a re-run

After generating configuration, validate with:
python -m immunopipe.validate_config config.toml
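
The entry-point and caching rules listed before the validation command can be sketched in one fragment. The placement of prepared under [LoadingRNAFromSeurat.envs], the infile input key, the pipeline name, and the file path are all assumptions for illustration; check the loadingrnafromseurat skill for the actual option names.

```toml
name = "seurat_entry"  # Hypothetical pipeline name
cache = "force"        # Pipeline level: reuse all previous results

# Entry point: a pre-processed Seurat object (do not also configure SampleInfo)
[LoadingRNAFromSeurat.in]
infile = ["path/to/seurat_object.rds"]  # Placeholder path; input key is assumed

[LoadingRNAFromSeurat.envs]
prepared = true  # Skip preprocessing; placement under envs is an assumption

# Process level: force this one process to re-run despite the pipeline setting
[SeuratClustering]
cache = false
```

The process-level cache value overrides the pipeline-level one, in the same way the structure overview shows forks being overridden per process.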
When process options reference external packages, expand them:
- For Seurat::FunctionName options, check: https://satijalab.org/seurat/reference/
- Common Seurat functions: FindMarkers(), FindClusters(), SCTransform(), RunUMAP()
- Plot types: bar → BarPlot, box → BoxPlot
- For PseudoBulkDEG, design formulas use DESeq2 syntax
- For ScFGSEA, GMT files from MSigDB
- For CellCellCommunication, CellChat databases

For quick starts, use these templates:
- basic-tcr template skill
- basic-rna template skill
- advanced-tcr template skill
- metabolic template skill
- communication template skill

Common configuration errors to avoid:
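- Missing [ProcessName.in] for entry processes
- Not enabling ScRepLoading when TCR/BCR data is present
- Enabling TOrBCellSelection without first considering SeuratClusteringOfAllCells
- Not adjusting forks based on available CPU/memory

After generating config: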
- Save it as a .toml file (e.g., config.toml)
- Run the pipeline: immunopipe config.toml
- Or use pipen board: pipen board @config.toml
- Or run on Google Cloud Batch: immunopipe gbatch config.toml

For modifications, route to specific process skills based on what needs to change.