Comprehensive CRISPR screen analysis for functional genomics. Analyze pooled or arrayed CRISPR screens (knockout, activation, interference) to identify essential genes, synthetic lethal interactions, and drug targets. Perform sgRNA count processing, gene-level scoring (MAGeCK, BAGEL), quality control, pathway enrichment, and drug target prioritization. Use for CRISPR screen analysis, gene essentiality studies, synthetic lethality detection, functional genomics, drug target validation, or identifying genetic vulnerabilities.
Comprehensive skill for analyzing CRISPR-Cas9 genetic screens to identify essential genes, synthetic lethal interactions, and therapeutic targets through robust statistical analysis and pathway enrichment.
CRISPR screens enable genome-wide functional genomics by systematically perturbing genes and measuring fitness effects. This skill provides an 8-phase workflow for:
Load sgRNA count matrix (MAGeCK format or generic TSV). Expected columns: sgRNA, Gene, plus sample columns. Create experimental design table linking samples to conditions (baseline/treatment) with replicate assignments.
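A minimal sketch of this loading step, assuming a tab-separated file whose first two columns are named `sgRNA` and `Gene`. The names `load_count_table` and `make_design` are hypothetical and are not the skill's own helpers; only the standard library is used so that the example stays dependency-free.

```python
import csv
import io


def load_count_table(handle):
    """Parse a tab-separated sgRNA count table.

    Returns (counts, sgrna_to_gene, samples), where counts maps
    sgRNA id -> {sample name -> integer read count}.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    fields = reader.fieldnames or []
    if len(fields) < 3 or fields[0] != "sgRNA" or fields[1] != "Gene":
        raise ValueError("expected columns: sgRNA, Gene, then one column per sample")
    samples = fields[2:]
    counts, sgrna_to_gene = {}, {}
    for row in reader:
        sgrna = row["sgRNA"]
        sgrna_to_gene[sgrna] = row["Gene"]
        counts[sgrna] = {s: int(row[s]) for s in samples}
    return counts, sgrna_to_gene, samples


def make_design(samples, conditions):
    """Link each sample to a condition and a 1-based replicate number."""
    if len(samples) != len(conditions):
        raise ValueError("samples and conditions must have the same length")
    seen = {}
    design = []
    for sample, condition in zip(samples, conditions):
        seen[condition] = seen.get(condition, 0) + 1
        design.append(
            {"sample": sample, "condition": condition, "replicate": seen[condition]}
        )
    return design


# Tiny in-memory table standing in for a real count file (made-up values).
example = (
    "sgRNA\tGene\tT0_1\tT14_1\n"
    "s1\tTP53\t120\t300\n"
    "s2\tTP53\t90\t210\n"
    "s3\tRPL5\t200\t15\n"
)
counts, sgrna_to_gene, samples = load_count_table(io.StringIO(example))
design = make_design(samples, ["baseline", "treatment"])
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the `io.StringIO` object.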
Assess sgRNA distribution quality:
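The specific metrics are not listed here. As an illustration only, the sketch below assumes two commonly reported ones: the Gini index of read counts (0 means perfectly even representation, values near 1 mean a few sgRNAs dominate) and the fraction of sgRNAs with zero reads. Both function names are hypothetical, and the Gini index is computed on raw counts, which may differ from how other tools report it.

```python
def gini_index(values):
    """Gini index of non-negative counts (0 = perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def zero_count_fraction(values):
    """Fraction of sgRNAs with no reads in a sample."""
    values = list(values)
    if not values:
        return 0.0
    return sum(1 for v in values if v == 0) / len(values)


even_sample = [5, 5, 5, 5]
skewed_sample = [0, 0, 0, 10]
```

Here `gini_index(even_sample)` is 0 and `gini_index(skewed_sample)` is 0.75, which shows how a library dominated by one sgRNA moves the metric.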
Normalize sgRNA counts to account for library size differences:
Calculate log2 fold changes (LFC) between treatment and control conditions with pseudocount.
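The normalization and fold-change steps can be sketched as follows. The normalization method is an assumption: total-count scaling (counts per million) is used here, whereas the skill may use another scheme such as median-ratio normalization. The function names and the default pseudocount of 1 are likewise illustrative.

```python
import math


def normalize_cpm(counts, samples):
    """Scale each sample so that its counts sum to one million."""
    totals = {s: sum(row[s] for row in counts.values()) for s in samples}
    for s, total in totals.items():
        if total == 0:
            raise ValueError(f"sample {s!r} has no reads")
    return {
        sgrna: {s: row[s] * 1e6 / totals[s] for s in samples}
        for sgrna, row in counts.items()
    }


def log2_fold_change(norm, treatment, control, pseudocount=1.0):
    """Per-sgRNA log2((mean treatment + pc) / (mean control + pc))."""
    lfc = {}
    for sgrna, row in norm.items():
        t = sum(row[s] for s in treatment) / len(treatment)
        c = sum(row[s] for s in control) / len(control)
        lfc[sgrna] = math.log2((t + pseudocount) / (c + pseudocount))
    return lfc


# Made-up counts: the T14 library is sequenced twice as deeply as T0.
raw = {
    "s1": {"T0": 100, "T14": 400},
    "s2": {"T0": 100, "T14": 25},
    "s3": {"T0": 200, "T14": 375},
}
norm = normalize_cpm(raw, ["T0", "T14"])
lfc = log2_fold_change(norm, treatment=["T14"], control=["T0"])
```

The pseudocount keeps the ratio finite when an sgRNA drops to zero reads in either condition.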
Two scoring approaches are used: MAGeCK (robust rank aggregation, RRA, p-values) and BAGEL (Bayes factors, BF).
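Neither MAGeCK nor BAGEL is reimplemented here. As a simpler stand-in that shows the shape of the sgRNA-to-gene aggregation, the sketch below summarizes each gene by the median LFC of its sgRNAs and counts how many of them are depleted. The function name and output fields are hypothetical, and the result is not a substitute for either tool's statistics.

```python
from collections import defaultdict
from statistics import median


def median_lfc_per_gene(lfc, sgrna_to_gene):
    """Aggregate sgRNA-level LFCs to one record per gene, most depleted first."""
    by_gene = defaultdict(list)
    for sgrna, value in lfc.items():
        by_gene[sgrna_to_gene[sgrna]].append(value)
    scores = [
        {
            "gene": gene,
            "median_lfc": median(values),
            "n_sgrnas": len(values),
            # Number of sgRNAs that agree on depletion (negative LFC).
            "n_negative": sum(1 for v in values if v < 0),
        }
        for gene, values in by_gene.items()
    ]
    scores.sort(key=lambda record: record["median_lfc"])
    return scores


# Made-up LFC values for two placeholder genes.
example_lfc = {"a1": -2.0, "a2": -1.0, "a3": -3.0, "b1": 0.5, "b2": -0.1}
example_map = {"a1": "GENEA", "a2": "GENEA", "a3": "GENEA", "b1": "GENEB", "b2": "GENEB"}
gene_table = median_lfc_per_gene(example_lfc, example_map)
```

The `n_negative` field gives a rough view of sgRNA concordance, which matters when deciding how much weight a hit deserves.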
Compare essentiality scores between wildtype and mutant cell lines:
Query DepMap/literature for known dependencies using PubMed search.
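The wildtype-versus-mutant comparison can be sketched as a per-gene difference of essentiality scores. This is a simplified illustration: the function name, the score convention (more negative means more essential), and both cutoffs are assumptions, and a real analysis would also need a significance estimate across replicates.

```python
def synthetic_lethal_candidates(wt_scores, mut_scores,
                                essential_cutoff=-1.0, min_delta=-1.0):
    """Genes essential in the mutant line but not in wildtype.

    Scores are gene-level values where more negative means more essential.
    Returns (gene, delta) pairs with delta = mutant - wildtype, strongest first.
    """
    hits = []
    for gene in sorted(set(wt_scores) & set(mut_scores)):
        wt, mut = wt_scores[gene], mut_scores[gene]
        delta = mut - wt
        if mut <= essential_cutoff and wt > essential_cutoff and delta <= min_delta:
            hits.append((gene, delta))
    hits.sort(key=lambda hit: hit[1])
    return hits


# Made-up scores: RPL5 is essential in both lines, so it is not a candidate.
wt = {"PARP1": -0.2, "RPL5": -3.0, "GAPDH": -0.1}
mut = {"PARP1": -2.4, "RPL5": -3.1, "GAPDH": 0.0}
candidates = synthetic_lethal_candidates(wt, mut)
```

Requiring the gene to be non-essential in wildtype is what separates a synthetic lethal candidate from a gene that is simply essential everywhere.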
Submit top essential genes to Enrichr for pathway enrichment:
Composite scoring combining:
Query DGIdb for each candidate gene to find existing drugs, interaction types, and sources.
Generate markdown report with:
Key Tools Used:
- PubMed_search_articles - Literature search for gene essentiality and drug resistance
- ReactomeAnalysis_pathway_enrichment - Pathway enrichment (param: identifiers newline-separated, page_size)
- enrichr_gene_enrichment_analysis - Enrichr enrichment (param: gene_list array, libs array)
- DGIdb_get_drug_gene_interactions - Drug-gene interactions (param: genes as array)
- DGIdb_get_gene_druggability - Druggability categories
- STRING_get_network - Protein interaction networks
- kegg_search_pathway - Pathway search by keyword
- kegg_get_pathway_info - Pathway details by ID

Cancer Context (essential for drug resistance screens):

- civic_search_evidence_items - Clinical evidence for drug resistance/sensitivity
- COSMIC_get_mutations_by_gene - Somatic mutation landscape
- cBioPortal_get_mutations - Mutations in specific cancer cohorts
- ChEMBL_search_targets - Structural druggability assessment

Expression & Variant Integration:

- GEO_search_rnaseq_datasets / geo_search_datasets - Expression datasets
- ClinVar_search_variants - Known pathogenic variants
- gnomad_get_gene_constraints - Gene constraint metrics (pLI, oe_lof)
- UniProt_get_function_by_accession - Protein function for hit validation

```python
import pandas as pd
from tooluniverse import ToolUniverse

# The helper functions called below are not defined in this snippet; they are
# assumed to be defined as in the per-phase code in ANALYSIS_DETAILS.md.

# 1. Load data
counts, meta = load_sgrna_counts("sgrna_counts.txt")
design = create_design_matrix(
    ['T0_1', 'T0_2', 'T14_1', 'T14_2'],
    ['baseline', 'baseline', 'treatment', 'treatment'],
)

# 2. Process
filtered_counts, filtered_mapping = filter_low_count_sgrnas(counts, meta['sgrna_to_gene'])
norm_counts, _ = normalize_counts(filtered_counts)
lfc, _, _ = calculate_lfc(norm_counts, design)

# 3. Score genes
gene_scores = mageck_gene_scoring(lfc, filtered_mapping)

# 4. Enrich pathways
enrichment = enrich_essential_genes(gene_scores, top_n=100)

# 5. Find drug targets
drug_targets = prioritize_drug_targets(gene_scores)

# 6. Generate report
report = generate_crispr_report(gene_scores, enrichment, drug_targets)
```
Screen hits are statistical findings, not direct readouts of biological relevance. A gene scoring as essential might be essential for cell growth in general (housekeeping) or essential specifically for the phenotype you are screening for (interesting). Always compare your screen hits to public essentiality data — use DepMap pan-cancer dependency scores to filter genes that are broadly essential across all cell lines. A gene essential only in your specific context, but not pan-essential in DepMap, is a better candidate for follow-up than one that scores in every screen.
LOOK UP, DON'T GUESS: check DepMap dependency scores, known core essential gene sets (Hart et al., Blomen et al.), and DGIdb druggability data for your top hits. Do not assume a hit is context-specific without checking public essentiality databases.
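A minimal sketch of that filtering step, assuming the core essential gene set has already been obtained as a list of gene symbols from one of the public resources named above. The function name is hypothetical, and the gene symbols in the example are placeholders rather than entries from any published list.

```python
def split_by_core_essentiality(hits, core_essential):
    """Separate screen hits into context-specific and pan-essential genes.

    Comparison is case-insensitive on gene symbols; input order is preserved.
    """
    core = {gene.upper() for gene in core_essential}
    context_specific = [g for g in hits if g.upper() not in core]
    pan_essential = [g for g in hits if g.upper() in core]
    return context_specific, pan_essential


# Placeholder symbols only; a real core set must come from a looked-up source.
screen_hits = ["GENE1", "gene2", "GENE3"]
core_set = ["GENE2", "GENE9"]
context_specific, pan_essential = split_by_core_essentiality(screen_hits, core_set)
```

Keeping the pan-essential list, rather than discarding it, makes it easy to report how many hits were removed and why.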
| Evidence Grade | Criteria | Validation Priority |
|---|---|---|
| A -- Strong hit | MAGeCK RRA p < 0.001, BAGEL BF > 5, >=3 sgRNAs with concordant LFC | Immediate validation (individual KO, growth assay) |
| B -- Moderate hit | MAGeCK RRA p < 0.01, BAGEL BF 2-5, >=2 concordant sgRNAs | Secondary validation pool |
| C -- Weak/ambiguous | p > 0.01, BF < 2, or discordant sgRNA effects | Deprioritize; check for copy-number bias or seed effects |
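The grading table can be applied programmatically, for example as below. The function name is hypothetical, and two interpretive assumptions are made: the criteria within a grade must all hold, and a gene that misses grade A on any criterion is then tested against grade B (so a Bayes factor above 5 still qualifies for B).

```python
def evidence_grade(rra_p, bayes_factor, n_concordant):
    """Assign evidence grade A, B, or C from the thresholds in the table.

    rra_p: MAGeCK RRA p-value
    bayes_factor: BAGEL Bayes factor
    n_concordant: number of sgRNAs with concordant LFC
    """
    if rra_p < 0.001 and bayes_factor > 5 and n_concordant >= 3:
        return "A"
    if rra_p < 0.01 and bayes_factor >= 2 and n_concordant >= 2:
        return "B"
    return "C"


# Made-up inputs covering each grade.
grades = [
    evidence_grade(0.0005, 8.0, 4),
    evidence_grade(0.005, 3.0, 2),
    evidence_grade(0.05, 1.0, 1),
    evidence_grade(0.0005, 8.0, 2),
]
```

The last example has a strong p-value and Bayes factor but only two concordant sgRNAs, so it falls to grade B under the assumptions stated above.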
Interpreting screen results:
Synthesis questions to address in the report:
- ANALYSIS_DETAILS.md - Detailed code snippets for all 8 phases
- USE_CASES.md - Complete use cases (essentiality screen, synthetic lethality, drug target discovery, expression integration) and best practices
- EXAMPLES.md - Example usage and quick reference
- QUICK_START.md - Quick start guide
- FALLBACK_PATCH.md - Fallback patterns for API issues