Search and analyze cryo-EM maps, single particle structures, tomography datasets, and raw micrograph data from EMDB, EMPIAR, and CryoET Data Portal. Cross-reference with PDB structures and AlphaFold predictions. Use for cryo-EM map discovery, structure fitting analysis, raw data access, and tomography exploration.
Pipeline for discovering and analyzing electron microscopy data across the full resolution spectrum: from 3D density maps (EMDB) to fitted atomic models (PDB), raw micrograph datasets (EMPIAR), and cryo-electron tomography volumes (CryoET Data Portal). Connects EM data to structural biology context via PDB and AlphaFold.
Guiding principles:
Resolution awareness -- always report and interpret map resolution; sub-4 Å enables atomic modeling, 4-8 Å enables domain fitting, >8 Å is shape-level
Map before model -- the density map is the primary experimental data; fitted models are interpretations
Method matters -- single particle analysis, tomography, 2D crystallography, and helical reconstruction have different strengths and limitations
Raw data value -- EMPIAR raw data enables reprocessing with newer algorithms; always note availability
Cross-reference structures -- connect EMDB maps to PDB entries and AlphaFold predictions for completeness
English-first queries -- use English terms in tool calls
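The resolution tiers in the first principle can be sketched as a small helper. This is a minimal illustration only: the function name `interpret_resolution` and the tier labels are assumptions made for this example, and the boundaries (4 Å and 8 Å) are taken directly from the principle above.

```python
def interpret_resolution(resolution_angstrom: float) -> str:
    """Map a reported resolution (in Å) to the interpretation tier listed above.

    Hypothetical helper for illustration; not part of any ToolUniverse tool.
    """
    if resolution_angstrom <= 0:
        raise ValueError("resolution must be a positive number of angstroms")
    if resolution_angstrom < 4.0:
        return "atomic modeling"
    if resolution_angstrom <= 8.0:
        return "domain fitting"
    return "shape-level"


# Report the tier next to each value so the resolution is never quoted bare.
for value in (2.8, 6.0, 12.0):
    print(f"{value} Å -> {interpret_resolution(value)}")
```

Lower numbers mean better resolution, so the comparison runs in the opposite direction to what "higher resolution" suggests in prose.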
Related Skills
EM resolution determines what you can see. TEM resolves individual protein complexes (~2 nm). Cryo-EM achieves near-atomic resolution (<4 Å) for large complexes. SEM shows surface topology. Choose the right EM modality for the question.
LOOK UP, DON'T GUESS
When uncertain about any scientific fact, SEARCH databases first rather than reasoning from memory. A database-verified answer is always more reliable than a guess.
COMPUTE, DON'T DESCRIBE
When analysis requires computation (statistics, data processing, scoring, enrichment), write and run Python code via Bash. Don't describe what you would do — execute it and report actual results. Use ToolUniverse tools to retrieve data, then Python (pandas, scipy, statsmodels, matplotlib) to analyze it.
When to Use
Typical triggers:
"Find cryo-EM structures of [protein/complex]"
"What EMDB maps are available for [target]?"
"Get raw micrograph data for [structure]"
"Find tomography datasets for [organelle/cell type]"
"What is the resolution of [EMDB entry]?"
"Cross-reference this EM map with PDB models"
"Find cryo-ET datasets for [sample]"
Not this skill: For X-ray crystallography or NMR structures, use PDB search tools directly. For protein structure prediction, use tooluniverse-protein-structure.
Core Databases
| Database | Content | Best For |
|---|---|---|
| EMDB | 3D EM density maps (>40K entries) | Finding processed maps, resolution data, fitting info |
Output: run details, tilt parameters, voxel spacing
Workflow:
Search CryoET Data Portal for the target organism/structure
Get dataset details including sample preparation and imaging parameters
Explore individual runs for tilt series specifications
Note voxel spacing and tomogram dimensions
Tomography vs single particle: Tomography preserves cellular context (in situ) but typically achieves lower resolution. Single particle gives higher resolution but requires purified samples.
Phase 5: Cross-Reference & Context
Objective: Connect EM data to broader structural biology context.
Tools:
alphafold_get_prediction -- get AlphaFold predicted structure
PubMed_search_articles -- find publications describing the EM work
Input: query (search term), optional limit
Output: articles with title, abstract, PMID
Workflow:
For proteins with EM structures, get AlphaFold predictions for comparison
Note regions where AlphaFold confidence is low (pLDDT < 70) -- these may be flexible and harder to resolve by EM
Search PubMed for methodological papers and biological insights from the EM studies
Cross-reference EMDB/PDB/EMPIAR accessions in publications
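The low-confidence check in the workflow above can be sketched as follows. This assumes the per-residue pLDDT scores have already been extracted into a plain Python list; how they are pulled out of the `alphafold_get_prediction` output is not specified here, and the function name `low_confidence_regions` is hypothetical.

```python
from typing import List, Tuple


def low_confidence_regions(
    plddt: List[float], threshold: float = 70.0, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) residue ranges with pLDDT below threshold."""
    regions: List[Tuple[int, int]] = []
    start = None  # first residue of the low-confidence run currently open, if any
    for index, score in enumerate(plddt, start=1):
        if score < threshold:
            if start is None:
                start = index
        elif start is not None:
            # The run ended at the previous residue; keep it if it is long enough.
            if index - start >= min_length:
                regions.append((start, index - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(plddt) - start + 1 >= min_length:
        regions.append((start, len(plddt)))
    return regions


# Illustrative scores only, not taken from a real prediction.
scores = [90.0, 65.0, 60.0, 88.0, 92.0, 50.0]
print(low_confidence_regions(scores))
```

Regions returned here are candidates for flexible segments; compare them against weak or missing density in the map rather than treating them as confirmed disorder.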
Phase 6: Interpretation & Recommendations
Don't just list maps — help the user choose the RIGHT map for their purpose.
Decision matrix: Which map should I use?
| Purpose | Best Resolution | Method | Priority Criteria |
|---|---|---|---|
| Atomic model building | < 3.5 Å | Single particle | Highest resolution with fitted PDB model |
| Drug binding site analysis | < 3.0 Å | Single particle | Must resolve side chains in binding pocket |
| Domain architecture | 4-8 Å | Single particle or subtomogram avg | Large complexes where domains need fitting |
| Conformational states | < 4.5 Å | Single particle (multiple classes) | Look for entries with multiple maps from same dataset |
| Cellular context | 15-40 Å | Cryo-ET | Tomographic datasets showing in-situ arrangement |
| Reprocessing | Any | Any | Must have EMPIAR raw data; prefer recent datasets (better detectors) |
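The resolution and method columns of the decision matrix can be applied mechanically as a first filter. The sketch below is one possible encoding, not a prescribed implementation: the `MapRecord` fields, the method strings, and the `MAP-A` style accessions are hypothetical placeholders, and the priority criteria column is left to manual judgment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class MapRecord:
    accession: str              # placeholder ID, not a real EMDB accession
    resolution: float           # reported resolution in Å
    method: str                 # e.g. "single particle", "cryo-ET"
    has_raw_data: bool = False  # True if EMPIAR raw data is available


# One predicate per row of the decision matrix (thresholds copied from the table).
RULES: Dict[str, Callable[[MapRecord], bool]] = {
    "atomic model building": lambda m: m.resolution < 3.5 and m.method == "single particle",
    "drug binding site analysis": lambda m: m.resolution < 3.0 and m.method == "single particle",
    "domain architecture": lambda m: 4.0 <= m.resolution <= 8.0
    and m.method in {"single particle", "subtomogram averaging"},
    "conformational states": lambda m: m.resolution < 4.5 and m.method == "single particle",
    "cellular context": lambda m: 15.0 <= m.resolution <= 40.0 and m.method == "cryo-ET",
    "reprocessing": lambda m: m.has_raw_data,
}


def candidate_maps(maps: List[MapRecord], purpose: str) -> List[MapRecord]:
    """Return maps matching the matrix row for `purpose`, best resolution first."""
    rule = RULES[purpose]
    return sorted((m for m in maps if rule(m)), key=lambda m: m.resolution)


maps = [
    MapRecord("MAP-A", 2.7, "single particle", has_raw_data=True),
    MapRecord("MAP-B", 3.3, "single particle"),
    MapRecord("MAP-C", 6.5, "subtomogram averaging"),
    MapRecord("MAP-D", 20.0, "cryo-ET"),
]
for purpose in RULES:
    print(purpose, "->", [m.accession for m in candidate_maps(maps, purpose)])
```

An empty result for a purpose is itself worth reporting: it tells the user that no deposited map meets the threshold for that kind of analysis.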
Quality assessment checklist:
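Is the reported resolution based on the "gold standard" FSC 0.143 cutoff? (Some older entries use the 0.5 cutoff, which yields a numerically larger, more conservative value for the same map, so resolutions reported under different cutoffs are not directly comparable.)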
Map sharpened appropriately? (over-sharpened maps can look better but contain artifacts)
Fitting statistics available? (cross-correlation > 0.7 is acceptable)
Multiple maps from same sample? (suggests conformational heterogeneity — important for drug design)
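The numeric parts of this checklist can be turned into explicit flags. This is a rough sketch under stated assumptions: the function name `quality_flags` and its parameters are invented for illustration, the 0.143 and 0.7 thresholds come from the checklist, and sharpening is left out because it needs visual inspection of the map.

```python
from typing import List, Optional


def quality_flags(
    fsc_cutoff: Optional[float],
    cross_correlation: Optional[float],
    maps_from_sample: int = 1,
) -> List[str]:
    """Return human-readable notes for checklist items that need attention."""
    flags: List[str] = []
    if fsc_cutoff is None:
        flags.append("FSC cutoff not reported")
    elif abs(fsc_cutoff - 0.143) > 1e-6:
        flags.append(f"resolution uses FSC {fsc_cutoff} cutoff, not 0.143")
    if cross_correlation is None:
        flags.append("no fitting statistics available")
    elif cross_correlation <= 0.7:
        flags.append(f"cross-correlation {cross_correlation} is not above 0.7")
    if maps_from_sample > 1:
        # Not a defect: a prompt to look at the other maps as well.
        flags.append("multiple maps from same sample: possible conformational heterogeneity")
    return flags


print(quality_flags(fsc_cutoff=0.143, cross_correlation=0.82))
print(quality_flags(fsc_cutoff=0.5, cross_correlation=0.6, maps_from_sample=3))
```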
Resolution trend analysis: If multiple maps exist over time, note the resolution trajectory. Improvement from 6 Å (2015) to 2.8 Å (2023) suggests the sample is amenable to high-resolution single particle analysis with modern hardware.
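A resolution trajectory can be computed from (year, resolution) pairs once they have been collected from the map entries. The sketch below uses the 2015 and 2023 values from the paragraph above plus one intermediate point that is purely hypothetical; the function names are assumptions for this example.

```python
from typing import Dict, List, Tuple


def best_resolution_by_year(entries: List[Tuple[int, float]]) -> Dict[int, float]:
    """Keep the best (numerically lowest) resolution in Å for each deposition year."""
    best: Dict[int, float] = {}
    for year, resolution in entries:
        if year not in best or resolution < best[year]:
            best[year] = resolution
    return dict(sorted(best.items()))


def resolution_improvement(entries: List[Tuple[int, float]]) -> float:
    """Å gained between the earliest and latest year (positive means improvement)."""
    if not entries:
        raise ValueError("need at least one (year, resolution) entry")
    by_year = best_resolution_by_year(entries)
    years = list(by_year)
    return by_year[years[0]] - by_year[years[-1]]


# (2019, 3.9) and (2019, 4.4) are hypothetical intermediate points for illustration.
entries = [(2015, 6.0), (2019, 4.4), (2019, 3.9), (2023, 2.8)]
for year, resolution in best_resolution_by_year(entries).items():
    print(f"{year}: {resolution} Å")
print(f"improvement: {resolution_improvement(entries):.1f} Å")
```

Before comparing values across years, confirm that they were reported under the same FSC cutoff, since mixed cutoffs can exaggerate or hide a trend.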