Name: GSEA Enrichment Analysis
Author: omicverse

GSEA and Pathway Enrichment Analysis

Overview

This skill covers gene set enrichment analysis (GSEA) and pathway enrichment workflows in OmicVerse. It focuses on the data formats and API usage patterns needed to avoid common errors.

Critical API Reference - Geneset Format

IMPORTANT: Use Dictionary Format, NOT File Path!

The ov.bulk.geneset_enrichment() function requires a dictionary of gene sets, NOT a file path string. You must first load the geneset file using ov.utils.geneset_prepare().

CORRECT usage:

# Assumes OmicVerse is installed and imported under its usual alias
import omicverse as ov

# Step 1: Download pathway database (if not already available)
ov.utils.download_pathway_database()

# Step 2: Load geneset file into dictionary format - REQUIRED!
pathways_dict = ov.utils.geneset_prepare(
    'genesets/GO_Biological_Process_2021.txt',  # or .gmt file
    organism='Human'  # or 'Mouse'
)

# Step 3: Now run enrichment with the DICTIONARY
# deg_genes is a list of gene symbols, e.g. from a prior differential expression step
enr = ov.bulk.geneset_enrichment(
    gene_list=deg_genes,
    pathways_dict=pathways_dict,  # Pass the DICTIONARY, not file path!
    pvalue_type='auto',
    organism='Human'
)
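
Because passing a file path where a dictionary is expected is the mistake this section warns about, a small guard placed before the enrichment call can catch it early. The following is a minimal sketch in plain Python that does not call OmicVerse itself. The helper name `check_pathways_dict` is hypothetical, and it assumes gene sets are held as a mapping from pathway name to a collection of gene symbols.

```python
import os
from collections.abc import Mapping


def check_pathways_dict(pathways_dict):
    """Hypothetical helper: fail early unless given a name -> genes mapping."""
    # A string or path-like object almost certainly means the file was never loaded.
    if isinstance(pathways_dict, (str, bytes, os.PathLike)):
        raise TypeError(
            "pathways_dict looks like a file path; load it with "
            "ov.utils.geneset_prepare() first"
        )
    if not isinstance(pathways_dict, Mapping):
        raise TypeError(
            f"expected a dict of gene sets, got {type(pathways_dict).__name__}"
        )
    # Empty gene sets cannot contribute to enrichment, so flag them as well.
    empty = [name for name, genes in pathways_dict.items() if len(genes) == 0]
    if empty:
        raise ValueError(
            f"{len(empty)} gene set(s) contain no genes, e.g. {empty[0]!r}"
        )
    return pathways_dict
```

Calling `check_pathways_dict(pathways_dict)` right after Step 2 would raise a `TypeError` with a pointer to `ov.utils.geneset_prepare()` if a path string slipped through.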

File Format Support

File Extension	Load Method	Notes
`.txt`	`ov.utils.geneset_prepare()`	OmicVerse format
`.gmt`	`ov.utils.geneset_prepare()`	Standard GMT format
`.json`	`json.load()` then convert	Custom handling needed
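
The table leaves the `.json` case as custom handling. One possible conversion is sketched below, assuming the JSON file is a single object whose keys are pathway names and whose values are lists of gene symbols. Other layouts would need different handling. The function name `load_json_genesets` is hypothetical and is not part of OmicVerse.

```python
import json


def load_json_genesets(path):
    """Hypothetical helper: read {pathway_name: [gene, ...]} JSON into a dict."""
    with open(path, encoding="utf-8") as handle:
        raw = json.load(handle)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object mapping pathway names to gene lists")

    genesets = {}
    for name, genes in raw.items():
        if not isinstance(genes, list):
            raise ValueError(f"gene set {name!r} is not a list")
        # Strip whitespace and drop empty entries.
        cleaned = [str(g).strip() for g in genes if str(g).strip()]
        # Remove duplicates while keeping the original order.
        genesets[str(name)] = list(dict.fromkeys(cleaned))
    return genesets
```

The returned dictionary has the same name-to-genes shape as the `pathways_dict` described above, so under the stated assumption it could be passed in its place.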
