Name: Gene Enrichment and Pathway Analysis
Author: Nam4802

Gene Enrichment and Pathway Analysis

Perform comprehensive gene enrichment and pathway analysis using gseapy (ORA and GSEA), PANTHER, STRING, Reactome, and 40+ ToolUniverse tools. Supports GO enrichment (BP, MF, CC), KEGG, Reactome, WikiPathways, MSigDB Hallmark, and 220+ Enrichr libraries. Handles multiple ID types (gene symbols, Ensembl, Entrez, UniProt), multiple organisms (human, mouse, rat, fly, worm, yeast), customizable backgrounds, and multiple testing correction (BH, Bonferroni). Use when users ask about gene enrichment, pathway analysis, GO term enrichment, KEGG pathway analysis, GSEA, over-representation analysis, functional annotation, or gene set analysis.

Nam4802 · 0 stars · 2026/03/26

Occupation
Category: Bioinformatics

Perform comprehensive gene enrichment analysis including Gene Ontology (GO), KEGG, Reactome, WikiPathways, and MSigDB enrichment using both Over-Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA). Integrates local computation via gseapy with ToolUniverse pathway databases for cross-validated, publication-ready results.

IMPORTANT: Always use English terms in tool calls (gene names, pathway names, organism names), even if the user writes in another language. Only try original-language terms as a fallback if English returns no results. Respond in the user's language.

When to Use This Skill

Apply when users:

Ask about gene enrichment analysis (GO, KEGG, Reactome, etc.)
Have a gene list from differential expression, clustering, or any experiment
Want to know which biological processes, molecular functions, or cellular components are enriched
Need KEGG or Reactome pathway enrichment analysis
Ask about GSEA (Gene Set Enrichment Analysis) with ranked gene lists
Want over-representation analysis (ORA) with Fisher's exact test
Need multiple testing correction (Benjamini-Hochberg, Bonferroni)
Ask about enrichGO, gseapy, clusterProfiler-style analyses

Input Parameters

| Parameter | Required | Description | Example |
|---|---|---|---|
| gene_list | Yes | List of gene symbols, Ensembl IDs, or Entrez IDs | `["TP53", "BRCA1", "EGFR"]` |
| organism | No | Organism (default: human). Supported: human, mouse, rat, fly, worm, yeast, zebrafish | `human` |
| analysis_type | No | `ORA` (default) or `GSEA` | `ORA` |
| enrichment_databases | No | Which databases to query. Default: all applicable | `["GO_BP", "GO_MF", "GO_CC", "KEGG", "Reactome"]` |
| gene_id_type | No | Input ID type: `symbol`, `ensembl`, `entrez`, `uniprot` (auto-detected if omitted) | `symbol` |
| p_value_cutoff | No | Significance threshold (default: 0.05) | `0.05` |
| correction_method | No | Multiple testing: `BH` (Benjamini-Hochberg, default), `bonferroni`, `fdr` | `BH` |
| background_genes | No | Custom background gene set (default: genome-wide) | `["GENE1", "GENE2", ...]` |
| ranked_gene_list | No | For GSEA: gene-to-score mapping (e.g., log2FC) | `{"TP53": 2.5, "BRCA1": -1.3, ...}` |
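
The `p_value_cutoff`, `correction_method`, and `background_genes` parameters correspond to a standard ORA computation: a one-sided Fisher's exact (hypergeometric) test per gene set, followed by Benjamini-Hochberg adjustment. The following is a minimal standard-library sketch of that computation under stated assumptions, not the implementation used by gseapy or ToolUniverse. The function names (`hypergeom_sf`, `benjamini_hochberg`, `ora`) and the toy gene sets are hypothetical.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n annotated genes, N drawn)."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def ora(gene_list, gene_sets, background_genes, p_value_cutoff=0.05):
    """Over-representation analysis restricted to the supplied background."""
    background = set(background_genes)
    query = set(gene_list) & background
    rows = []
    for term, members in gene_sets.items():
        members = set(members) & background
        overlap = query & members
        p = hypergeom_sf(len(overlap), len(background), len(members), len(query))
        rows.append((term, sorted(overlap), p))
    adj = benjamini_hochberg([row[2] for row in rows])
    results = [
        {"term": term, "overlap": overlap, "p_value": p, "adj_p_value": q}
        for (term, overlap, p), q in zip(rows, adj)
    ]
    significant = [r for r in results if r["adj_p_value"] <= p_value_cutoff]
    return sorted(significant, key=lambda r: r["adj_p_value"])


# Toy data (hypothetical): 20 background genes, two gene sets of 5 genes each.
background = [f"g{i}" for i in range(20)]
toy_sets = {"A": background[:5], "B": background[10:15]}
for row in ora(["g0", "g1", "g2", "g3"], toy_sets, background):
    print(row["term"], row["overlap"], row["adj_p_value"])
```

Restricting both the query and each gene set to the background before counting matters: a genome-wide default and a custom background (for example, only expressed genes) can give noticeably different p-values for the same overlap.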

Evidence Grading

| Tier | Symbol | Criteria | Examples |
|---|---|---|---|
| T1 | [T1] | Curated/experimental enrichment | PANTHER, Reactome Analysis Service |
| T2 | [T2] | Computational enrichment, well-validated | gseapy ORA/GSEA, STRING functional enrichment |
| T3 | [T3] | Text-mining/predicted enrichment | Enrichr non-curated libraries |
| T4 | [T4] | Single-source annotation | Individual gene GO annotations from QuickGO |

Supported Organisms

| Organism | Taxonomy ID | gseapy | PANTHER | STRING | Reactome |
|---|---|---|---|---|---|
| Human | 9606 | Yes | Yes | Yes | Yes |
| Mouse | 10090 | Yes (`*_Mouse`) | Yes | Yes | Yes (projection) |
| Rat | 10116 | Limited | Yes | Yes | Yes (projection) |
| Fly | 7227 | Limited | Yes | Yes | Yes (projection) |
| Worm | 6239 | Limited | Yes | Yes | Yes (projection) |
| Yeast | 4932 | Limited | Yes | Yes | Yes |

Tool Reference

Primary Enrichment Tools

| Tool | Input | Output | Use For |
|---|---|---|---|
| `gseapy.enrichr()` | gene_list, gene_sets, organism | `.results` DataFrame | ORA with 225+ libraries |
| `gseapy.prerank()` | rnk (ranked Series), gene_sets | `.res2d` DataFrame | GSEA analysis |
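
For GSEA, the `ranked_gene_list` mapping has to become a ranked, two-column input before it is passed as `rnk`. The sketch below shows one way to clean and order the scores using only the standard library. The helper name `to_rnk_lines` is hypothetical, and the tab-separated `GENE<TAB>score` layout is an assumption modeled on the common `.rnk` file convention; check the gseapy documentation for the exact forms `rnk` accepts in your version.

```python
import math


def to_rnk_lines(ranked_gene_list):
    """Return 'GENE<TAB>score' lines sorted by descending score.

    Non-finite scores (NaN, inf) are dropped. If two keys collapse to the
    same gene after whitespace stripping, the score with the larger
    absolute value is kept (an assumption, not a gseapy rule).
    """
    cleaned = {}
    for gene, score in ranked_gene_list.items():
        score = float(score)
        if not math.isfinite(score):
            continue
        key = gene.strip()
        if key not in cleaned or abs(score) > abs(cleaned[key]):
            cleaned[key] = score
    # Sort by score descending; break ties alphabetically for stable output.
    ordered = sorted(cleaned.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"{gene}\t{score:g}" for gene, score in ordered]


lines = to_rnk_lines({"TP53": 2.5, "BRCA1": -1.3, "EGFR": float("nan")})
print("\n".join(lines))
```

Writing these lines to a file, or loading the same pairs into a pandas Series indexed by gene, gives a ranked input that keeps the sign of each score, which GSEA needs in order to separate up-regulated from down-regulated gene sets.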

Cross-Validation Tools

| Tool | Key Parameters | Evidence Grade |
|---|---|---|
| `PANTHER_enrichment` | gene_list (comma-sep), organism, annotation_dataset | [T1] |
| `STRING_functional_enrichment` | protein_ids, species | [T2] |
| `ReactomeAnalysis_pathway_enrichment` | identifiers (space-sep), page_size | [T1] |
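
The cross-validation tools expect the same genes in different string formats: comma-separated for `PANTHER_enrichment` and space-separated for `ReactomeAnalysis_pathway_enrichment`. A small helper, sketched here with the hypothetical name `format_identifiers`, can derive both from one cleaned list. How the resulting strings are passed to the ToolUniverse client is not shown, since the call syntax is not given in this section.

```python
def format_identifiers(genes, separator):
    """Join genes with the given separator, dropping blanks and duplicates.

    Input order is preserved so results stay reproducible across runs.
    """
    seen = set()
    unique = []
    for gene in genes:
        gene = gene.strip()
        if gene and gene not in seen:
            seen.add(gene)
            unique.append(gene)
    return separator.join(unique)


genes = ["TP53", "BRCA1", "EGFR", "TP53"]
panther_gene_list = format_identifiers(genes, ",")     # comma-separated
reactome_identifiers = format_identifiers(genes, " ")  # space-separated
print(panther_gene_list)
print(reactome_identifiers)
```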

ID Conversion Tools

| Tool | Input | Output |
|---|---|---|
| `MyGene_batch_query` | gene_ids, fields | Symbol, Entrez, Ensembl mappings |
| `STRING_map_identifiers` | protein_ids, species | Preferred names, STRING IDs |
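
Before conversion, the input ID type has to be known; the `gene_id_type` parameter is auto-detected when omitted. The skill's actual detection logic is not described here, so the following is only an illustrative heuristic: classify each identifier by pattern and take the majority vote. The function names are hypothetical, the Ensembl pattern covers only `ENS`-prefixed gene IDs (not FlyBase, WormBase, or yeast systematic names), and the UniProt pattern follows the published accession format.

```python
import re
from collections import Counter

# Ordered patterns: the first match wins; anything unmatched is a symbol.
PATTERNS = [
    ("ensembl", re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")),
    ("entrez", re.compile(r"^\d+$")),
    ("uniprot", re.compile(
        r"^([OPQ][0-9][A-Z0-9]{3}[0-9]"
        r"|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"
    )),
]


def classify_id(identifier):
    """Guess the ID type of a single identifier."""
    token = identifier.strip()
    for name, pattern in PATTERNS:
        if pattern.match(token):
            return name
    return "symbol"


def detect_gene_id_type(gene_list):
    """Return the most common ID type in the list (majority vote)."""
    counts = Counter(classify_id(g) for g in gene_list if g.strip())
    if not counts:
        raise ValueError("gene_list is empty")
    return counts.most_common(1)[0][0]


print(detect_gene_id_type(["TP53", "BRCA1", "EGFR"]))
print(detect_gene_id_type(["ENSG00000141510", "ENSG00000012048"]))
```

A majority vote tolerates a few stray identifiers in an otherwise uniform list, but a mixed list is usually a sign that the input should be split and converted separately.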