COMPUTE, DON'T DESCRIBE

Replication: Search literature and datasets (DataCite, GEO, ArrayExpress) for independent datasets where the same finding can be tested. A finding that replicates in an independent cohort is much stronger.
Biological plausibility: Does the mechanism make biological sense? Check if animal or cell models support it (PubMed search for "[gene] knockout [phenotype]" or "[chemical] exposure [cell type]").
Genetic support: Check if GWAS evidence supports the direction of effect. If your analysis says gene X is protective but GWAS shows risk alleles increase X expression, there is a contradiction to resolve.
Dose-response: If dose or exposure-level data are available, check whether the effect increases with dose. A dose-response relationship strengthens causal inference.
Negative controls: If possible, run the same analysis on an outcome or exposure where you expect no association. If the negative control also shows an association, suspect a methodological artifact.
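
The dose-response and negative-control checks above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the data are synthetic, the helper names (dose_response_trend, negative_control_check) are hypothetical, and Spearman rank correlation is only one reasonable choice of monotonic-trend test.

```python
import numpy as np
from scipy import stats


def dose_response_trend(dose, response):
    """Spearman rank correlation as a simple monotonic-trend check."""
    rho, p_value = stats.spearmanr(dose, response)
    return float(rho), float(p_value)


def negative_control_check(exposure, control_outcome, alpha=0.05):
    """Run the same test on an outcome that should be unrelated to the exposure."""
    rho, p_value = stats.spearmanr(exposure, control_outcome)
    return {
        "rho": float(rho),
        "p_value": float(p_value),
        # True means the control also looks associated: suspect an artifact.
        "flag_artifact": bool(p_value < alpha),
    }


# Synthetic data for illustration only (assumption: 5 dose levels, 20 samples each).
rng = np.random.default_rng(0)
dose = np.repeat([0.0, 1.0, 2.0, 4.0, 8.0], 20)
response = 0.5 * dose + rng.normal(0.0, 1.0, size=dose.size)  # true dose effect
control = rng.normal(0.0, 1.0, size=dose.size)                # no dose effect by construction

rho, p = dose_response_trend(dose, response)
nc = negative_control_check(dose, control)

print(f"dose-response: rho={rho:.3f}, p={p:.3g}")
print(f"negative control: rho={nc['rho']:.3f}, p={nc['p_value']:.3g}, "
      f"flag_artifact={nc['flag_artifact']}")
```

With real data, replace the synthetic arrays with the measured dose and outcome columns, and pick a negative-control outcome that shares the same measurement pipeline but has no plausible link to the exposure.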

When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do -- execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
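
As one illustration of computing rather than describing, the sketch below runs an over-representation (enrichment) test with scipy. The gene identifiers, set sizes, and the helper name enrichment_p_value are made up for the example. In practice the background, pathway, and hit lists would come from retrieved data, not from hard-coded values.

```python
from scipy import stats


def enrichment_p_value(hits, pathway, background):
    """One-sided hypergeometric test for over-representation of pathway genes in hits."""
    background = set(background)
    hits = set(hits) & background
    pathway = set(pathway) & background
    overlap = len(hits & pathway)
    # P(X >= overlap) when drawing len(hits) genes from the background,
    # of which len(pathway) are pathway members.
    p_value = stats.hypergeom.sf(overlap - 1, len(background), len(pathway), len(hits))
    return overlap, float(p_value)


# Hypothetical identifiers for illustration only.
background = [f"G{i}" for i in range(1000)]
pathway = background[:50]                      # 50 pathway genes
hits = background[:10] + background[500:530]   # 40 significant genes, 10 in the pathway

overlap, p_value = enrichment_p_value(hits, pathway, background)
expected = len(hits) * len(pathway) / len(background)

print(f"overlap={overlap}, expected by chance={expected:.1f}, p={p_value:.3g}")
```

Report the printed overlap, chance expectation, and p-value in the answer, and apply a multiple-testing correction (for example, statsmodels.stats.multitest.multipletests) when many pathways are tested.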

Data Integration Analysis

Bridge the gap between statistical results and biological understanding. After any computational analysis produces significant findings, this skill teaches how to interpret them using ToolUniverse's biological knowledge tools -- the key advantage over platforms that only do data analysis.

IMPORTANT: Always use English terms in tool calls (gene names, pathway names, organism names), even if the user writes in another language. Respond in the user's language.

When to Use This Skill

Apply when:

Statistical analysis produced a list of significant genes, variants, metabolites, or exposures
Users want to go beyond p-values to understand WHY something is significant