PopV Population-Level Cell Type Annotation

PopV (popular Vote) annotates cell types by running up to 10 classification algorithms against a labeled reference and aggregating their per-cell predictions by majority vote. Unlike single-method annotation (SCSA, MetaTiME, or CellTypist on its own), PopV produces a consensus prediction, so a failure of any one algorithm is less likely to determine the final label. The module also supports ontology-aware voting over the Cell Ontology (CL), which resolves predictions made at different levels of the label hierarchy.
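
To make the voting idea concrete, here is a minimal sketch of per-cell majority voting over a table of predictions. It illustrates the general technique only and is not PopV's own implementation; the function name `majority_vote`, the column names `method_a` to `method_c`, and the toy labels are all hypothetical.

```python
from collections import Counter

import pandas as pd


def majority_vote(predictions: pd.DataFrame) -> pd.DataFrame:
    """Return the most common label per cell and how many methods chose it.

    Each row is a cell and each column holds one method's predicted label.
    Ties go to the label that appears first in column order.
    """
    winners, scores = [], []
    for _, row in predictions.iterrows():
        label, count = Counter(row.tolist()).most_common(1)[0]
        winners.append(label)
        scores.append(count)
    return pd.DataFrame(
        {"consensus": winners, "n_agreeing": scores}, index=predictions.index
    )


# Toy per-method predictions for three cells (hypothetical data).
preds = pd.DataFrame(
    {
        "method_a": ["T cell", "B cell", "NK cell"],
        "method_b": ["T cell", "B cell", "T cell"],
        "method_c": ["T cell", "monocyte", "NK cell"],
    },
    index=["cell_1", "cell_2", "cell_3"],
)
result = majority_vote(preds)
print(result)
```

A single dissenting method (as for `cell_2` and `cell_3`) does not change the consensus label; it only lowers the agreement count.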

Defensive Validation

# Before PopV: verify reference has the cell type column
assert ref_labels_key in ref_adata.obs.columns, \
    f"ref_adata.obs['{ref_labels_key}'] not found. Available: {list(ref_adata.obs.columns)}"

# Verify no NaN in reference labels
assert ref_adata.obs[ref_labels_key].notna().all(), \
    f"NaN values in ref_adata.obs['{ref_labels_key}']. Use fillna() or drop these cells."

# Verify gene overlap
overlap = query_adata.var_names.intersection(ref_adata.var_names)
assert len(overlap) > 100, \
    f"Only {len(overlap)} overlapping genes between query and reference. Check var_names format (ENSEMBL vs symbol)."
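
When the overlap check fails, a common cause is that one dataset uses Ensembl gene IDs and the other uses gene symbols. The sketch below is one possible way to detect that mismatch before running PopV. The helper names `fraction_ensembl` and `same_id_convention`, the regular expression, and the 0.5 threshold are illustrative assumptions, not part of PopV.

```python
import re

# Ensembl gene IDs have the form ENS + optional species code + G + 11 digits,
# e.g. "ENSG00000141510" or "ENSMUSG00000059552", optionally followed by a
# version suffix such as ".12".
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")


def fraction_ensembl(var_names) -> float:
    """Fraction of names that look like Ensembl gene IDs."""
    names = list(var_names)
    if not names:
        return 0.0
    hits = sum(1 for name in names if ENSEMBL_GENE_ID.match(str(name)))
    return hits / len(names)


def same_id_convention(query_names, ref_names, threshold: float = 0.5) -> bool:
    """True if both name lists are mostly Ensembl IDs or both mostly are not."""
    query_is_ensembl = fraction_ensembl(query_names) > threshold
    ref_is_ensembl = fraction_ensembl(ref_names) > threshold
    return query_is_ensembl == ref_is_ensembl


# Example with plain lists; with AnnData you would pass adata.var_names.
query_names = ["TP53", "CD3E", "MS4A1"]
ref_names = ["ENSG00000141510", "ENSG00000141510.12", "ENSMUSG00000059552"]
print(same_id_convention(query_names, ref_names))
```

If the conventions differ, map one side to the other (for example through a gene annotation column in `var`) before rerunning the overlap check.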

Algorithm	Result Key	Type	Speed
`KNN_SCVI`	`popv_knn_on_scvi_prediction`	Deep learning + KNN	Medium
`SCANVI_POPV`	`popv_scanvi_prediction`	Semi-supervised DL	Medium
`CELLTYPIST`	`popv_celltypist_prediction`	Logistic regression	Fast
`ONCLASS`	`popv_onclass_prediction`	Ontology-guided	Medium
`Support_Vector`	`popv_svm_prediction`	SVM	Fast
`XGboost`	`popv_xgboost_prediction`	Gradient boosting	Fast
`KNN_HARMONY`	`popv_knn_harmony_prediction`	Harmony + KNN	Fast
`KNN_BBKNN`	`popv_knn_bbknn_prediction`	BBKNN + KNN	Fast
`Random_Forest`	`popv_rf_prediction`	Random forest	Fast
`KNN_SCANORAMA`	`popv_knn_scanorama_prediction`	Scanorama + KNN	Medium
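
The result keys in the table can be used to compare methods directly. The following sketch assumes the per-method predictions are stored as columns of `adata.obs` under those keys, and uses a plain pandas DataFrame as a stand-in for `adata.obs`. The helper names `present_method_keys` and `pairwise_agreement` and the toy labels are hypothetical.

```python
import pandas as pd

# Result keys copied from the algorithm table above.
METHOD_KEYS = [
    "popv_knn_on_scvi_prediction",
    "popv_scanvi_prediction",
    "popv_celltypist_prediction",
    "popv_onclass_prediction",
    "popv_svm_prediction",
    "popv_xgboost_prediction",
    "popv_knn_harmony_prediction",
    "popv_knn_bbknn_prediction",
    "popv_rf_prediction",
    "popv_knn_scanorama_prediction",
]


def present_method_keys(obs: pd.DataFrame) -> list:
    """Return the per-method result keys that exist in obs, in table order."""
    return [key for key in METHOD_KEYS if key in obs.columns]


def pairwise_agreement(obs: pd.DataFrame) -> pd.DataFrame:
    """Fraction of cells on which each pair of methods predicts the same label."""
    keys = present_method_keys(obs)
    matrix = pd.DataFrame(index=keys, columns=keys, dtype=float)
    for a in keys:
        for b in keys:
            matrix.loc[a, b] = (obs[a].astype(str) == obs[b].astype(str)).mean()
    return matrix


# Stand-in for adata.obs after a run with three methods (hypothetical labels).
obs = pd.DataFrame(
    {
        "popv_celltypist_prediction": ["T cell", "B cell", "NK cell", "monocyte"],
        "popv_svm_prediction": ["T cell", "B cell", "T cell", "monocyte"],
        "popv_rf_prediction": ["T cell", "B cell", "NK cell", "B cell"],
    }
)
agreement = pairwise_agreement(obs)
print(agreement)
```

A method that agrees poorly with all others is worth inspecting before trusting the consensus.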

Column	Description
`popv_majority_vote_prediction`	Majority vote across all methods
`popv_majority_vote_score`	Number of agreeing methods
`popv_prediction`	Ontology-aggregated consensus (if CL enabled)
`popv_prediction_score`	Ontology consensus score
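
Since `popv_majority_vote_score` counts agreeing methods, it can serve as a simple per-cell confidence filter. The sketch below flags cells where fewer than half of the methods agree. It again uses a pandas DataFrame as a stand-in for `adata.obs`; the function name `flag_low_agreement`, the added columns `agreement_fraction` and `low_agreement`, and the 0.5 cutoff are assumptions made for illustration.

```python
import pandas as pd


def flag_low_agreement(
    obs: pd.DataFrame, n_methods: int, min_fraction: float = 0.5
) -> pd.DataFrame:
    """Add the fraction of agreeing methods and a low-agreement flag per cell."""
    # Coerce to numbers in case the score column is stored as strings or categories.
    score = pd.to_numeric(obs["popv_majority_vote_score"], errors="coerce")
    out = obs.copy()
    out["agreement_fraction"] = score / n_methods
    # Cells with a missing or unparseable score are flagged as well.
    out["low_agreement"] = ~(out["agreement_fraction"] >= min_fraction)
    return out


# Stand-in for adata.obs after a run with all 10 methods (hypothetical values).
obs = pd.DataFrame(
    {
        "popv_majority_vote_prediction": ["T cell", "B cell", "NK cell"],
        "popv_majority_vote_score": ["10", "4", "7"],
    },
    index=["cell_1", "cell_2", "cell_3"],
)
flagged = flag_low_agreement(obs, n_methods=10)
print(flagged[["popv_majority_vote_prediction", "agreement_fraction", "low_agreement"]])
```

Set `n_methods` to the number of algorithms actually run, since the score is only meaningful relative to that count.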

Stage 1: Data Preparation

Stage 2: Annotation

Available Algorithms (10 total)

Selecting Specific Methods

Stage 3: Consensus Results & Visualization

Stage 4: Pretrained Hub Models (Optional)

Critical API Reference

GPU Acceleration

Troubleshooting

Dependencies

Examples

References
