Name: Kegg Query
Author: PharMolix


from scripts.kegg_api import kegg_get, parse_pathway_entry

# Get pathway by ID (e.g., hsa00010 for Glycolysis)
entry = kegg_get("hsa00010")
pathway = parse_pathway_entry(entry)

# Access parsed data
print(f"Genes: {len(pathway['genes'])}")      # 50+ genes
print(f"Compounds: {len(pathway['compounds'])}")  # 30+ compounds
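The internals of `parse_pathway_entry` are not shown here. As a rough illustration of what such a parser has to handle, the sketch below splits a KEGG flat-file record into sections, assuming the usual layout in which the field name occupies the first 12 columns and continuation lines leave them blank. `split_sections` and the sample record are hypothetical and are not part of `scripts.kegg_api`.

```python
def split_sections(text):
    """Group the lines of a KEGG flat-file record by field name.

    Assumes the field name sits in columns 1-12 and that continuation
    lines leave those columns blank. Indented sub-fields (for example
    AUTHORS under REFERENCE) are treated as their own keys in this sketch.
    """
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("///"):  # record terminator
            break
        if not line.strip():
            continue
        key, value = line[:12].strip(), line[12:].strip()
        if key:
            current = key
        if current is not None:
            sections.setdefault(current, []).append(value)
    return sections


# Hand-written sample record (not real API output); ljust keeps the
# 12-column alignment explicit.
sample = "\n".join([
    "ENTRY".ljust(12) + "hsa00010",
    "NAME".ljust(12) + "Glycolysis / Gluconeogenesis",
    "GENE".ljust(12) + "3939  LDHA; lactate dehydrogenase A [KO:K00016] [EC:1.1.1.27]",
    " " * 12 + "10327  AKR1A1; aldo-keto reductase family 1 member A1 [KO:K00002] [EC:1.1.1.2]",
    "COMPOUND".ljust(12) + "C00022  Pyruvate",
    "///",
])

sections = split_sections(sample)
print(len(sections["GENE"]), sections["COMPOUND"])
```

Each section keeps its lines in file order, so a follow-up step can split gene lines into ID, symbol, KO, and EC fields.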

from scripts.kegg_api import kegg_find, kegg_get, parse_disease_entry

# Step 1: Search for disease
results = kegg_find("disease", "diabetes")
# Returns multiple matches including Type 2 diabetes (H00409)

# Step 2: Get disease details
entry = kegg_get("ds:H00409")
disease = parse_disease_entry(entry)

# Access drugs and targets
print(f"Drugs: {len(disease['drugs'])}")    # 60+ drugs
print(f"Genes: {len(disease['genes'])}")    # 20+ genes
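Step 1 returns several matches, so the exact ID has to be picked out before Step 2. The return type of `kegg_find` is not shown here. The sketch below assumes it hands back the raw tab-delimited text that the KEGG `find` operation produces (one `id<TAB>description` pair per line). `parse_find_results`, `pick_by_name`, and the sample text are hypothetical.

```python
def parse_find_results(text):
    """Turn tab-delimited find output into (entry_id, description) pairs."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        entry_id, _, description = line.partition("\t")
        hits.append((entry_id.strip(), description.strip()))
    return hits


def pick_by_name(hits, name):
    """Return the first entry ID whose description contains `name` (case-insensitive)."""
    needle = name.lower()
    for entry_id, description in hits:
        if needle in description.lower():
            return entry_id
    return None


# Hand-written sample in the assumed format (not real API output).
sample = "ds:H00408\tType 1 diabetes mellitus\nds:H00409\tType 2 diabetes mellitus\n"

hits = parse_find_results(sample)
disease_id = pick_by_name(hits, "type 2 diabetes")
print(disease_id)
```

The selected ID can then be passed to `kegg_get` as in Step 2.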

{
  "id": "D00109",
  "names": ["Aspirin", "Acetylsalicylic acid"],
  "formula": "C9H8O4",
  "efficacy": ["Analgesic", "Anti-inflammatory", "Antipyretic", "COX inhibitor"],
  "targets": [
    {"gene": "PTGS1", "uniprot": "P23219", "ko": "K00509"},
    {"gene": "PTGS2", "uniprot": "P35354", "ko": "K11987"}
  ],
  "pathways": ["hsa00590", "hsa04611"],
  "diseases": ["Myocardial infarction", "Unstable angina"]
}
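A record of this shape can be consumed with plain dictionary access. The sketch below pulls the UniProt accession for each target, assuming the drug entry has already been parsed into the structure shown above. `target_accessions` is a hypothetical helper, not part of the skill.

```python
import json

# Trimmed copy of the drug record above, loaded from JSON text.
drug = json.loads("""
{
  "id": "D00109",
  "names": ["Aspirin", "Acetylsalicylic acid"],
  "targets": [
    {"gene": "PTGS1", "uniprot": "P23219", "ko": "K00509"},
    {"gene": "PTGS2", "uniprot": "P35354", "ko": "K11987"}
  ]
}
""")


def target_accessions(entry):
    """Map each target gene symbol to its UniProt accession, skipping targets without one."""
    return {
        target["gene"]: target["uniprot"]
        for target in entry.get("targets", [])
        if target.get("uniprot")
    }


accessions = target_accessions(drug)
print(accessions)
```

The accessions are the values a UniProt lookup would need for each target.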

{
  "id": "hsa00010",
  "name": "Glycolysis / Gluconeogenesis",
  "organism": "Homo sapiens",
  "description": "Glycolysis is the process...",
  "genes": [
    {"id": "10327", "symbol": "AKR1A1", "ko": "K00002", "ec": "1.1.1.2"},
    {"id": "3939", "symbol": "LDHA", "ko": "K00016", "ec": "1.1.1.27"}
  ],
  "compounds": [
    {"id": "C00031", "name": "D-Glucose"},
    {"id": "C00022", "name": "Pyruvate"}
  ],
  "modules": ["hsa_M00001", "hsa_M00002", "hsa_M00003"]
}

{
  "id": "H00409",
  "name": "Type 2 diabetes mellitus",
  "category": "Endocrine and metabolic disease",
  "description": "T2DM is characterized by chronic hyperglycemia...",
  "genes": [
    {"symbol": "CAPN10", "ko": "K08579"},
    {"symbol": "TCF7L2", "ko": "K04491"}
  ],
  "drugs": [
    {"id": "D00944", "name": "Metformin hydrochloride"},
    {"id": "D06404", "name": "Liraglutide"}
  ],
  "pathways": ["hsa04930", "hsa04911"]
}

| Operation | URL Pattern | Description |
|-----------|-------------|-------------|
| `info` | `/info/{database}` | Database statistics |
| `list` | `/list/{database}` | List all entries |
| `find` | `/find/{database}/{query}` | Search by keyword |
| `get` | `/get/{entry_id}` | Retrieve entry |
| `link` | `/link/{target}/{source}` | Cross-references |
| `conv` | `/conv/{target}/{source}` | ID conversion |
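To make the URL patterns concrete, the sketch below assembles request URLs for the six operations against the public KEGG REST base URL, `https://rest.kegg.jp`. `kegg_url` is a hypothetical helper. How `scripts.kegg_api` builds its requests internally is not shown here and may differ.

```python
from urllib.parse import quote

KEGG_BASE = "https://rest.kegg.jp"
OPERATIONS = {"info", "list", "find", "get", "link", "conv"}


def kegg_url(operation, *parts):
    """Build a KEGG REST URL such as /get/{entry_id} or /find/{database}/{query}."""
    if operation not in OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    # Keep ':' and '+' literal because KEGG IDs and queries use them (e.g. hsa:5742).
    path = "/".join(quote(part, safe=":+") for part in parts)
    return f"{KEGG_BASE}/{operation}/{path}"


print(kegg_url("get", "hsa00010"))
print(kegg_url("find", "disease", "diabetes"))
print(kegg_url("link", "pathway", "hsa:5742"))
```

Spaces and other unsafe characters in a query are percent-encoded, so a multi-word search term still yields a valid path segment.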

| Error | Solution |
|-------|----------|
| Entry not found | Verify ID format (e.g., D00109, not aspirin) |
| Multiple matches | Use `kegg_find` first to get exact ID |
| Timeout | Reduce query complexity, retry with delay |
| Rate limited | KEGG allows ~10 requests/second; add delays |
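The "retry with delay" advice for timeouts and rate limits can be sketched as a small wrapper around any fetch function. `call_with_retry` is hypothetical. Which exceptions `kegg_get` raises is not shown here, so the sketch catches `Exception` by default and lets the caller narrow that.

```python
import time


def call_with_retry(fetch, *args, retries=3, delay=0.5, backoff=2.0,
                    exceptions=(Exception,), sleep=time.sleep, **kwargs):
    """Call fetch(*args, **kwargs), retrying on failure with a growing delay.

    The last failure is re-raised once `retries` attempts are used up.
    `sleep` is injectable so the wait can be stubbed out in tests.
    """
    wait = delay
    for attempt in range(1, retries + 1):
        try:
            return fetch(*args, **kwargs)
        except exceptions:
            if attempt == retries:
                raise
            sleep(wait)
            wait *= backoff


# Demonstration with a stand-in fetcher that fails twice, then succeeds.
calls = {"count": 0}
waits = []


def flaky_fetch(entry_id):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return f"ENTRY {entry_id}"


result = call_with_retry(flaky_fetch, "D00109", sleep=waits.append)
print(result, waits)
```

In real use the stand-in would be replaced by `kegg_get`, and the starting delay chosen to stay under the request rate noted in the table.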

from open_biomed.data import Molecule, Protein
from open_biomed.tools.tool_registry import TOOLS
from scripts.kegg_api import kegg_get

# Convert KEGG compound to Molecule
compound_entry = kegg_get("cpd:C00031")  # Glucose
mol_file = kegg_get("C00031", option="mol")  # Get MOL format
# molecule = Molecule.from_mol_file(mol_file)

# Get protein from KEGG gene
gene_entry = kegg_get("hsa:5742")  # PTGS1
# Use UniProt ID to fetch protein
protein_tool = TOOLS["protein_uniprot_request"]
proteins, _ = protein_tool.run(accession="P23219")

Kegg Query

When to Use

Workflow

Use Case 1: Drug Information Lookup

Use Case 2: Pathway Analysis

Use Case 3: Disease-Drug-Target Discovery

Expected Outputs

Drug Entry (JSON)

Pathway Entry (JSON)

Disease Entry (JSON)

KEGG API Reference

Error Handling

Integration with OpenBioMed

References
