Design multi-omics integration strategies for transcriptomics, proteomics, and metabolomics data analysis
Designs joint analysis schemes for multi-omics data (transcriptomics: RNA, proteomics: Pro, metabolomics: Met), cross-validates findings at the pathway level, and provides integrated analysis strategies at the systems biology level.
.
├── SKILL.md                    # This file - Skill documentation
├── config/
│   └── pathways.json           # Pathway database configuration
├── scripts/
│   └── main.py                 # Main analysis script
├── templates/
│   └── report_template.md      # Analysis report template
└── examples/
    └── sample_data/            # Sample datasets
| File | Format | Description |
|---|---|---|
| rna_data.csv | CSV | Transcriptomics data: Gene ID, expression value, differential analysis results |
| pro_data.csv | CSV | Proteomics data: Protein ID, abundance value, differential analysis results |
| met_data.csv | CSV | Metabolomics data: Metabolite ID, concentration value, differential analysis results |
gene_id,gene_name,log2fc,pvalue,padj,sample_A,sample_B,...
ENSG00000012048,BRCA1,1.23,0.001,0.005,12.5,13.2,...
protein_id,gene_name,log2fc,pvalue,padj,sample_A,sample_B,...
P38398,BRCA1,0.85,0.002,0.008,2450,2890,...
metabolite_id,metabolite_name,kegg_id,log2fc,pvalue,padj,...
C00187,Cholesterol,C00187,-1.45,0.003,0.012,...
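As a rough illustration of how the RNA and protein tables above could be paired, the sketch below joins them on the shared `gene_name` column and reports the fraction of RNA entries that found a protein partner. The helper names (`read_by_gene`, `map_rna_to_protein`), the tiny inline datasets, and the definition of the mapping rate are assumptions made for this example; they are not taken from `scripts/main.py`.

```python
import csv
import io

# Hypothetical miniature inputs that follow the column layout shown above
# (per-sample columns omitted for brevity).
RNA_CSV = """gene_id,gene_name,log2fc,pvalue,padj
ENSG00000012048,BRCA1,1.23,0.001,0.005
ENSG00000141510,TP53,-0.40,0.200,0.350
"""
PRO_CSV = """protein_id,gene_name,log2fc,pvalue,padj
P38398,BRCA1,0.85,0.002,0.008
"""


def read_by_gene(text):
    """Index the rows of a differential-analysis table by gene_name."""
    return {row["gene_name"]: row for row in csv.DictReader(io.StringIO(text))}


def map_rna_to_protein(rna, pro):
    """Pair RNA and protein records that share a gene symbol.

    Returns (pairs, rate), where rate is the share of RNA genes that have
    a protein partner. This definition of the rate is an assumption.
    """
    shared = sorted(set(rna) & set(pro))
    pairs = [(g, float(rna[g]["log2fc"]), float(pro[g]["log2fc"])) for g in shared]
    rate = len(shared) / len(rna) if rna else 0.0
    return pairs, rate


rna = read_by_gene(RNA_CSV)
pro = read_by_gene(PRO_CSV)
pairs, rate = map_rna_to_protein(rna, pro)
print(pairs, f"mapping rate = {rate:.0%}")
```

Joining on gene symbol keeps the example short; real datasets usually need an ID mapping step first, because symbols can be ambiguous or outdated.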
Supported databases include KEGG and Reactome.
Report (integration_report.md):
# Multi-Omics Integration Analysis Report
## Executive Summary
- Sample count: RNA=30, Pro=28, Met=25
- Mapping success rate: RNA-Pro=85%, Pro-Met=62%
- Pathway coverage: 342 KEGG pathways
## Cross-Validation Results
### Highly Consistent Pathways (Score > 0.8)
1. Glycolysis/Gluconeogenesis (Score=0.92)
2. Citrate cycle (TCA cycle) (Score=0.88)
### Conflicting Pathways (Score < -0.3)
1. Fatty acid biosynthesis (Score=-0.45)
## Recommendations
- Focus on: Energy metabolism-related pathways
- Needs verification: Lipid metabolism pathway data quality
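The report above groups pathways by score thresholds (above 0.8 and below -0.3). The sketch below shows one way such a score could be produced and binned. The scoring rule (mean agreement in the sign of log2 fold changes between two omics layers), the function names, and the "intermediate" label are all assumptions for illustration; the document does not specify how the tool computes its scores.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def direction_agreement(pairs):
    """Mean sign product over (log2fc_layer_a, log2fc_layer_b) pairs.

    Hypothetical score in [-1, 1]: 1 means every pair changes in the same
    direction, -1 means every pair changes in opposite directions.
    """
    if not pairs:
        return 0.0
    return sum(sign(a) * sign(b) for a, b in pairs) / len(pairs)


def classify(score, high=0.8, low=-0.3):
    """Bin a score using the thresholds quoted in the report."""
    if score > high:
        return "consistent"
    if score < low:
        return "conflicting"
    return "intermediate"  # label chosen for this sketch


# Scores copied from the example report above.
scores = {
    "Glycolysis/Gluconeogenesis": 0.92,
    "Citrate cycle (TCA cycle)": 0.88,
    "Fatty acid biosynthesis": -0.45,
}
labels = {name: classify(value) for name, value in scores.items()}

# Made-up RNA/protein fold-change pairs for a single pathway.
example_score = direction_agreement([(1.2, 0.8), (-0.5, -0.1), (0.3, -0.9)])
print(labels, round(example_score, 2))
```

A sign-based score ignores effect size and significance; weighting by `padj` or using a rank correlation are common alternatives.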
This tool generates analysis results that can be visualized using external tools. Users may export results to:
| Chart Type | Purpose | External Tool Required |
|---|---|---|
| Circos Plot | Cross-omics relationship panorama | matplotlib/circlize (user-installed) |
| Pathway Heatmap | Pathway-level changes | seaborn/ComplexHeatmap (user-installed) |
| Sankey Diagram | Data flow mapping | plotly (user-installed) |
| Network Graph | Molecular interaction network | networkx/cytoscape (networkx is included) |
| Correlation Matrix | Cross-omics correlation | seaborn (user-installed) |
| Bubble Plot | Integrated enrichment analysis | ggplot2/plotly (user-installed) |
Note: This skill focuses on data integration and analysis. Visualization requires separate installation of plotting libraries by the user.
| File | Description |
|---|---|
| mapped_ids.json | ID mapping results |
| pathway_scores.csv | Pathway cross-validation scores |
| consistency_matrix.csv | Cross-omics consistency matrix |
| network_edges.csv | Network edge list |
| report.html | Interactive HTML report |
python scripts/main.py \
--rna rna_data.csv \
--pro pro_data.csv \
--met met_data.csv \
--output ./results
python scripts/main.py \
--rna rna_data.csv \
--pro pro_data.csv \
--met met_data.csv \
--pathway-db KEGG,Reactome \
--id-mapping config/mapping.json \
--method correlation+enrichment+network \
--output ./results \
--format html,csv,json
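For readers who want to see how the flags in the two commands above might be parsed, here is a minimal `argparse` sketch. It is not the actual parser in `scripts/main.py`: the helper names, the choice to split `--pathway-db` and `--format` on commas and `--method` on `+`, and the `KEGG` default for `--pathway-db` are assumptions inferred from the command lines.

```python
import argparse


def comma_list(value):
    """Split a comma-separated option such as 'KEGG,Reactome'."""
    return [item.strip() for item in value.split(",") if item.strip()]


def plus_list(value):
    """Split a '+'-joined option such as 'correlation+enrichment'."""
    return [item.strip() for item in value.split("+") if item.strip()]


def build_parser():
    """Build a hypothetical parser mirroring the flags used above."""
    parser = argparse.ArgumentParser(description="Multi-omics integration (sketch)")
    for flag, label in (("--rna", "transcriptomics"),
                        ("--pro", "proteomics"),
                        ("--met", "metabolomics")):
        parser.add_argument(flag, required=True, help=f"Path to the {label} CSV file")
    parser.add_argument("--output", default="./results", help="Output directory")
    # Assumed default; the real default is not confirmed by the commands above.
    parser.add_argument("--pathway-db", type=comma_list, default=["KEGG"])
    parser.add_argument("--id-mapping", default=None, help="ID mapping config file")
    parser.add_argument("--method", type=plus_list, default=None)
    parser.add_argument("--format", type=comma_list, default=None)
    return parser


args = build_parser().parse_args([
    "--rna", "rna_data.csv",
    "--pro", "pro_data.csv",
    "--met", "met_data.csv",
    "--pathway-db", "KEGG,Reactome",
    "--method", "correlation+enrichment+network",
    "--format", "html,csv,json",
])
print(args.pathway_db, args.method, args.format, args.output)
```

Parsing the list-valued options into Python lists up front keeps the downstream analysis code free of string handling.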
{
  "databases": {
    "KEGG": {
      "enabled": true,
      "organism": "hsa",
      "min_genes": 3
    },
    "Reactome": {
      "enabled": true,
      "min_genes": 5
    }
  },
  "mapping": {
    "rna_to_protein": "gene_symbol",
    "protein_to_metabolite": "enzyme_commission"
  }
}
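The configuration above can be read with the standard `json` module. The sketch below selects the enabled databases and applies `min_genes` as a lower bound on pathway size. Treating `min_genes` as "minimum number of measured genes per pathway" is an assumption, and the pathway membership data and function names are made up for the example.

```python
import json

# Same structure as the configuration shown above.
CONFIG_TEXT = """
{
  "databases": {
    "KEGG": {"enabled": true, "organism": "hsa", "min_genes": 3},
    "Reactome": {"enabled": true, "min_genes": 5}
  },
  "mapping": {
    "rna_to_protein": "gene_symbol",
    "protein_to_metabolite": "enzyme_commission"
  }
}
"""


def enabled_databases(config):
    """Return the settings of every database flagged as enabled."""
    return {name: settings
            for name, settings in config["databases"].items()
            if settings.get("enabled")}


def filter_pathways(pathways, min_genes):
    """Keep pathways with at least min_genes member genes (assumed meaning)."""
    return {name: genes for name, genes in pathways.items()
            if len(genes) >= min_genes}


config = json.loads(CONFIG_TEXT)
dbs = enabled_databases(config)

# Hypothetical pathway membership, for illustration only.
kegg_pathways = {
    "Glycolysis/Gluconeogenesis": ["HK1", "PFKM", "PKM", "GAPDH"],
    "Tiny pathway": ["GENE1", "GENE2"],
}
kept = filter_pathways(kegg_pathways, dbs["KEGG"]["min_genes"])
print(sorted(dbs), list(kept))
```

Dropping very small pathways before scoring avoids unstable scores that rest on only one or two measurements.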
| Risk Indicator | Assessment | Level |
|---|---|---|
| Code Execution | Python/R scripts executed locally | Medium |
| Network Access | No external API calls | Low |
| File System Access | Read input files, write output files | Medium |
| Instruction Tampering | Standard prompt guidelines | Low |
| Data Exposure | Output files saved to workspace | Low |
# Python dependencies
pip install -r requirements.txt
| Parameter | Type | Default | Description |
|---|---|---|---|
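| --rna | str | Required | Path to the transcriptomics CSV file (e.g. rna_data.csv) |
| --pro | str | Required | Path to the proteomics CSV file (e.g. pro_data.csv) |
| --met | str | Required | Path to the metabolomics CSV file (e.g. met_data.csv) |
| --output | str | './results' | Output directory |
| --databases | str | 'KEGG' | Pathway database(s) to use |
| --create-sample | str | Required | Create sample data for testing |
| --format | str | 'md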