Name: Tooluniverse Cancer Genomics Tcga
Author: mims-harvard

TCGA/GDC cancer genomics analysis -- cohort construction, clinical metadata retrieval, somatic mutation profiling, copy number variation analysis, survival analysis, and clinical variant interpretation. Use when users ask about TCGA data, GDC cancer cohorts, somatic mutation frequencies, Kaplan-Meier survival, CNV profiles in cancer, or OncoKB interpretation of cancer variants.

mims-harvard | 1,271 stars | Mar 29, 2026

Occupation
Categories: Bioinformatics

Cancer Genomics / TCGA Analysis

TCGA analysis starts with two questions: which cancer type, and which data type? Build your cohort FIRST (GDC filters), then analyze. Don't query mutations without defining the cohort: pan-cancer counts from GDC_get_mutation_frequency are uninformative without cancer-type context. A gene mutated in 10% of cases in one cancer type may be mutated in only 0.5% in another, so always specify project_id (for example via GDC_get_ssm_by_gene). Kaplan-Meier survival analysis on retrospective TCGA data is hypothesis-generating only; always report sample size and p-value, and note that TCGA cohorts are not treatment-stratified.
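To make the point about pooled counts concrete, here is a minimal sketch of the arithmetic. The project labels and case counts are hypothetical, chosen only to mirror the 10% versus 0.5% contrast above; they are not real TCGA figures.

```python
def mutation_frequency(mutated_cases: int, total_cases: int) -> float:
    """Fraction of cases in a cohort that carry a mutation in the gene."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    if not 0 <= mutated_cases <= total_cases:
        raise ValueError("mutated_cases must be between 0 and total_cases")
    return mutated_cases / total_cases


# Hypothetical (mutated, total) case counts per project; illustrative only.
cohorts = {
    "PROJECT-A": (50, 500),
    "PROJECT-B": (2, 400),
}

# Per-project frequencies keep the cancer-type context.
per_project = {pid: mutation_frequency(m, n) for pid, (m, n) in cohorts.items()}

# Pooling the counts hides the difference between the two cohorts.
pooled = mutation_frequency(
    sum(m for m, _ in cohorts.values()),
    sum(n for _, n in cohorts.values()),
)

for pid, freq in per_project.items():
    print(f"{pid}: {freq:.1%} (n={cohorts[pid][1]})")
print(f"pooled: {pooled:.1%}")
```

The pooled value sits between the two cohort values and describes neither, which is why the per-project denominator should always be reported alongside the frequency.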

LOOK UP DON'T GUESS: never assume TCGA project IDs, NCIt codes, or gene coordinates — use GDC_list_projects to confirm project IDs and Progenetix_list_filtering_terms for NCIt codes.

Systematic TCGA/GDC analysis: define cohorts, retrieve clinical data, profile somatic mutations, query copy number variations, run survival analysis, and interpret variants with OncoKB.
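As a sketch of what "define the cohort first" can look like, the helper below builds a filter in the nested JSON form used by the public GDC REST API. The function name build_case_filter is hypothetical, and whether the ToolUniverse GDC tools send exactly this structure is an assumption; the sketch only constructs the filter and performs no network request.

```python
import json


def build_case_filter(project_id, vital_status=None):
    """Build a GDC-style filter restricting cases to a single project.

    The helper name is hypothetical; the field names follow the public
    GDC cases endpoint.
    """
    clauses = [
        {"op": "in", "content": {"field": "project.project_id", "value": [project_id]}}
    ]
    if vital_status is not None:
        clauses.append(
            {
                "op": "in",
                "content": {"field": "demographic.vital_status", "value": [vital_status]},
            }
        )
    # Combine all clauses so every condition must hold.
    return {"op": "and", "content": clauses}


cohort_filter = build_case_filter("TCGA-BRCA")
print(json.dumps(cohort_filter, indent=2))
```

Building the filter once and reusing it for the clinical, mutation, and survival steps keeps every downstream number tied to the same cohort definition.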

When to Use

"What is the mutation frequency of TP53 in TCGA-BRCA?"
"Get survival data for TCGA-LUAD patients"

Key Identifiers

Data Type	Format	Example
GDC project	TCGA-{ABBREV}	TCGA-BRCA, TCGA-LUAD, TCGA-SKCM
GDC case	UUID	3c6ef4c1-...
NCIt cancer code	NCIT:C######	NCIT:C4017 (breast), NCIT:C3058 (GBM)
RefSeq chromosome	refseq:NC_######	refseq:NC_000007.14 (chr7)
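The formats in the table can be checked before any query is sent. The sketch below is one way to do that; the regular expressions are assumptions inferred from the examples in the table, and the function and label names are hypothetical. A shape check only catches malformed input. It does not confirm that an identifier exists, so the lookup tools are still needed.

```python
import re
import uuid

# Patterns inferred from the table's examples (assumptions, not official specs).
PROJECT_RE = re.compile(r"^TCGA-[A-Z]+$")
NCIT_RE = re.compile(r"^NCIT:C\d+$")
REFSEQ_RE = re.compile(r"^refseq:NC_\d{6}\.\d+$")


def is_uuid(value: str) -> bool:
    """Return True if the string parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def classify_identifier(value: str):
    """Return a label for the identifier's format, or None if unrecognized."""
    if PROJECT_RE.match(value):
        return "gdc_project"
    if NCIT_RE.match(value):
        return "ncit_code"
    if REFSEQ_RE.match(value):
        return "refseq_chromosome"
    if is_uuid(value):
        return "gdc_case"
    return None


for candidate in ["TCGA-BRCA", "NCIT:C4017", "refseq:NC_000007.14", "chr7"]:
    print(candidate, "->", classify_identifier(candidate))
```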

Common TCGA Project IDs

Cancer	Project ID	NCIt Code
Breast	TCGA-BRCA	NCIT:C4017
Lung adenocarcinoma	TCGA-LUAD	NCIT:C3512
Glioblastoma	TCGA-GBM	NCIT:C3058
Melanoma	TCGA-SKCM	NCIT:C3510
Colorectal	TCGA-COAD	NCIT:C4349
Ovarian	TCGA-OV	NCIT:C4908
Prostate	TCGA-PRAD	NCIT:C7378

Tool Quick Reference

Tool	Key Params	Returns
GDC_list_projects	(none)	All TCGA/GDC projects with counts
GDC_search_cases	`project_id`, `size`, `offset`	Case UUIDs + metadata
GDC_get_clinical_data	`project_id`, `vital_status`, `gender`, `size`	Demographics + diagnoses + treatments
GDC_get_mutation_frequency	`gene_symbol` (alias: `gene`)	Pan-cancer SSM count
GDC_get_ssm_by_gene	`gene_symbol`, `project_id`, `size`	Per-mutation records with aa_change
GDC_get_survival	`project_id`, `gene_symbol` (optional)	Kaplan-Meier donor array + pValue
Progenetix_search_biosamples	`filters` (NCIt code), `limit`	Biosample records
Progenetix_cnv_search	`reference_name`, `start`, `end`, `variant_type`, `filters`	Biosamples with CNV in region
Progenetix_list_filtering_terms	(none)	All NCIt codes in Progenetix
OncoKB_annotate_variant	`gene`, `variant`, `tumor_type`	Oncogenicity + treatments
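For readers who want to see what a Kaplan-Meier curve computes, here is a minimal product-limit estimator in plain Python. The input shape, a list of (time, event) pairs, is an assumption made for illustration; the actual layout of the donor array returned by GDC_get_survival is not specified here. The sketch does not compute a p-value, which would require a separate test such as the log-rank test.

```python
from collections import Counter


def kaplan_meier(records):
    """Product-limit survival estimate.

    records: iterable of (time, event) pairs, where event is True for a
    death and False for a censored observation (assumed input shape).
    Returns a list of (time, survival_probability), one per death time.
    """
    records = list(records)
    deaths = Counter(t for t, event in records if event)
    exits = Counter(t for t, _ in records)

    at_risk = len(records)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored cases both leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Tiny hypothetical cohort: three deaths and one censored case.
example = [(1, True), (2, False), (3, True), (4, True)]
print(f"n = {len(example)}")
print(kaplan_meier(example))
```

Printing the sample size next to the curve follows the guidance to always report it with any survival result.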

Evidence Grading

Tier	Description	Example
T1	FDA-recognized biomarker with approved therapy	BRAF V600E in melanoma (vemurafenib)
T2	Well-powered clinical study, standard-of-care relevance	KRAS G12C in NSCLC (sotorasib), OncoKB Level 2
T3	Preclinical/small cohort evidence, biological plausibility	Recurrent hotspot in TCGA but no approved therapy
T4	Computational prediction or variant of unknown significance	Low-frequency mutation, no functional data
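The tier table can be read as a first-match rule, from strongest evidence to weakest. The sketch below encodes that reading; the class, field, and function names are hypothetical, and in practice each flag would be set from OncoKB output and the literature, not guessed.

```python
from dataclasses import dataclass


@dataclass
class VariantEvidence:
    """Hypothetical summary of the evidence available for one variant."""
    fda_recognized_with_approved_therapy: bool = False
    well_powered_clinical_study: bool = False
    preclinical_or_small_cohort: bool = False


def evidence_tier(evidence: VariantEvidence) -> str:
    """Return the strongest tier (T1 to T4) the evidence supports."""
    if evidence.fda_recognized_with_approved_therapy:
        return "T1"
    if evidence.well_powered_clinical_study:
        return "T2"
    if evidence.preclinical_or_small_cohort:
        return "T3"
    # No supporting evidence: computational prediction or unknown significance.
    return "T4"


print(evidence_tier(VariantEvidence(preclinical_or_small_cohort=True)))
```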

Tooluniverse Cancer Genomics Tcga

Cancer Genomics / TCGA Analysis

When to Use

NOT for (use other skills instead)

Workflow Overview

Key Identifiers

Common TCGA Project IDs

Phase 1: Study Selection

Phase 2: Clinical Data

Phase 3: Somatic Mutations

Phase 4: CNV Analysis (Progenetix)

Phase 5: Survival Analysis

Phase 6: Variant Interpretation (OncoKB)

Tool Quick Reference

Example Workflows

Workflow 1: Gene-Centric Mutation + Survival Analysis

Workflow 2: Cohort Clinical Summary

Workflow 3: CNV Analysis for a Gene

Reasoning Framework

Evidence Grading

Interpretation Guidance

Synthesis Questions

Programmatic Access (Beyond Tools)

Limitations
