Name: Compbio Workflow
Author: gabenavarro

Computational Biology Workflow

Patterns for bioinformatics analysis, single-cell genomics, protein modeling, and computational biology.

When to Use This Skill

Building or running single-cell RNA-seq analysis pipelines
Processing NGS data (FASTQ → BAM → VCF)
Protein structure prediction or embedding analysis
Metabolic modeling and pathway analysis
Cheminformatics and molecular analysis

Single-Cell RNA-seq Pipeline (scanpy)

Standard Workflow

import scanpy as sc

# 1. Load data
adata = sc.read_10x_h5("filtered_feature_bc_matrix.h5")
adata.var_names_make_unique()

# 2. Quality control
sc.pp.filter_cells(adata, min_genes=200)
sc.pp.filter_genes(adata, min_cells=3)
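# Flag mitochondrial genes; "MT-" matches human gene symbols (mouse symbols use "mt-")
adata.var["mt"] = adata.var_names.str.startswith("MT-")
sc.pp.calculate_qc_metrics(adata, qc_vars=["mt"], inplace=True)
# Keep cells with under 20% mitochondrial counts
adata = adata[adata.obs["pct_counts_mt"] < 20].copy()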

# 3. Normalize and log-transform
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)

# 4. Feature selection
sc.pp.highly_variable_genes(adata, n_top_genes=2000)

# 5. Dimensionality reduction
sc.pp.pca(adata, n_comps=50)
sc.pp.neighbors(adata, n_neighbors=15, n_pcs=30)
sc.tl.umap(adata)

# 6. Clustering
sc.tl.leiden(adata, resolution=0.5)

# 7. Differential expression
sc.tl.rank_genes_groups(adata, groupby="leiden", method="wilcoxon")

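# 8. Save checkpoint (make sure the output directory exists before writing)
from pathlib import Path
Path("results").mkdir(parents=True, exist_ok=True)
adata.write("results/processed.h5ad")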
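
To make step 3 concrete, the sketch below reproduces the arithmetic of library-size normalization followed by a log transform on a toy count matrix, using only the standard library. It is an illustration of the idea, not scanpy's implementation: the function name normalize_and_log, the toy counts, and the handling of empty cells are assumptions made for this example.

```python
import math


def normalize_and_log(counts, target_sum=1e4):
    """Scale each cell (row) to target_sum total counts, then apply log(1 + x).

    Hypothetical helper for illustration; it mirrors the intent of
    sc.pp.normalize_total followed by sc.pp.log1p on a list-of-lists matrix.
    """
    result = []
    for cell in counts:
        total = sum(cell)
        # Assumption: cells with zero total counts stay all-zero
        # so that we never divide by zero.
        scale = target_sum / total if total > 0 else 0.0
        result.append([math.log1p(value * scale) for value in cell])
    return result


# Toy matrix: rows are cells, columns are genes.
toy_counts = [
    [10, 0, 90],  # 100 total counts
    [1, 1, 2],    # 4 total counts
]
normalized = normalize_and_log(toy_counts)

# Undoing the log shows that every cell now sums to target_sum.
back_transformed_totals = [
    sum(math.expm1(value) for value in cell) for cell in normalized
]
```

After this transform, cells with very different sequencing depths are on a comparable scale, which is why normalization precedes highly variable gene selection and PCA in the workflow above.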


