When to Use

You need to parse PDB or mmCIF files and access the structure hierarchy (model → chain → residue → atom).
You want to compute geometric measurements such as distances, bond angles, and dihedral angles between atoms/residues.
You need neighbor searches (e.g., find residues/atoms within a cutoff) for contact analysis or local environment inspection.
You want to perform structural comparison, including alignment/superposition and RMSD-style evaluation.
You need to extract, modify, and save structures (e.g., subset chains/residues and write back to PDB/mmCIF).
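
The superposition and subset-saving use cases listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive workflow: it builds two toy CA-only structures in memory instead of parsing files, and build_ca_trace, ChainSelect, and every coordinate are hypothetical names and values introduced only for this example. The Bio.PDB pieces it relies on are Superimposer (set_atoms, rms, apply) and PDBIO with a Select subclass.

```python
import io

import numpy as np
from Bio.PDB import PDBIO, Select, Superimposer
from Bio.PDB.Atom import Atom
from Bio.PDB.Chain import Chain
from Bio.PDB.Model import Model
from Bio.PDB.Residue import Residue
from Bio.PDB.Structure import Structure


def build_ca_trace(struct_id, coords_by_chain):
    """Hypothetical helper: one GLY residue with a single CA atom per coordinate."""
    structure = Structure(struct_id)
    model = Model(0)
    structure.add(model)
    serial = 1
    for chain_id, coords in coords_by_chain.items():
        chain = Chain(chain_id)
        model.add(chain)
        for resseq, xyz in enumerate(coords, start=1):
            residue = Residue((" ", resseq, " "), "GLY", " ")
            residue.add(
                Atom("CA", np.array(xyz, dtype=float), 0.0, 1.0, " ", " CA ", serial, element="C")
            )
            chain.add(residue)
            serial += 1
    return structure


class ChainSelect(Select):
    """Hypothetical selector: write only the chain with the given ID."""

    def __init__(self, chain_id):
        self.chain_id = chain_id

    def accept_chain(self, chain):
        return chain.id == self.chain_id


# Toy coordinates: the moving copy is the fixed trace rotated 90 degrees about z and shifted.
fixed_xyz = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0], [3.8, 3.8, 3.8]])
rot_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved_xyz = fixed_xyz @ rot_z90.T + np.array([10.0, -5.0, 2.0])

fixed = build_ca_trace("fixed", {"A": fixed_xyz})
moving = build_ca_trace("moving", {"A": moved_xyz, "B": moved_xyz + 20.0})

# Walk the hierarchy (structure -> model -> chain -> residue -> atom) to pair CA atoms.
fixed_ca = [res["CA"] for res in fixed[0]["A"]]
moving_ca = [res["CA"] for res in moving[0]["A"]]

# Fit the moving atoms onto the fixed atoms, then transform the whole moving structure.
sup = Superimposer()
sup.set_atoms(fixed_ca, moving_ca)
rmsd = sup.rms
sup.apply(list(moving.get_atoms()))

# Write only chain A of the superimposed structure to an in-memory PDB file.
writer = PDBIO()
writer.set_structure(moving)
buffer = io.StringIO()
writer.save(buffer, select=ChainSelect("A"))
pdb_text = buffer.getvalue()

print(f"RMSD after fit: {rmsd:.4f}")
```

With real data, the two atom lists would come from parsed structures and must have equal length and matching order; passing a file path instead of the StringIO buffer to save writes the subset to disk.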

Key Features

Structure parsing for PDB/mmCIF using Bio.PDB parsers.
Hierarchical traversal and selection of models, chains, residues, and atoms.
Geometry calculations: distance, angle, and dihedral computations using Bio.PDB utilities.
Neighbor search via spatial indexing (NeighborSearch) for efficient cutoff queries.
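
The geometry calculations mentioned above can be illustrated with a short sketch. It assumes the Vector class and the calc_angle and calc_dihedral functions from Bio.PDB.vectors; the four points are made-up coordinates chosen so the results are easy to check by hand. Both functions return radians.

```python
import math

from Bio.PDB.vectors import Vector, calc_angle, calc_dihedral

# Four illustrative points (hypothetical coordinates, in angstroms).
p1 = Vector(0.0, 1.0, 0.0)
p2 = Vector(0.0, 0.0, 0.0)
p3 = Vector(1.0, 0.0, 0.0)
p4 = Vector(1.0, 0.0, 1.0)

# Distance between two points: norm of the difference vector.
distance = (p1 - p2).norm()

# Angle p1-p2-p3 with the vertex at p2.
angle = calc_angle(p1, p2, p3)

# Dihedral defined by p1-p2-p3-p4, measured about the p2-p3 axis.
dihedral = calc_dihedral(p1, p2, p3, p4)

print(f"distance: {distance:.3f}")
print(f"angle: {math.degrees(angle):.1f} degrees")
print(f"dihedral magnitude: {abs(math.degrees(dihedral)):.1f} degrees")
```

For atoms taken from a parsed structure, atom.get_vector() returns a Vector suitable for these functions, and subtracting two Atom objects (atom_a - atom_b) gives their distance directly.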

Example Usage

    import json
    from pathlib import Path

    from Bio.PDB import PDBParser, MMCIFParser, NeighborSearch


    def load_structure(input_path: str, fmt: str):
        """Parse a PDB or mmCIF file and return a Bio.PDB Structure."""
        if fmt.lower() in ("pdb", ".pdb"):
            parser = PDBParser(QUIET=True)
        elif fmt.lower() in ("cif", "mmcif", ".cif", ".mmcif"):
            parser = MMCIFParser(QUIET=True)
        else:
            raise ValueError(f"Unsupported format: {fmt}")
        return parser.get_structure("structure", input_path)


    def main():
        config_path = Path("config/task_config.json")
        with config_path.open("r", encoding="utf-8") as f:
            cfg = json.load(f)

        structure = load_structure(cfg["input_path"], cfg["format"])

        # Use the first model by default
        model = next(structure.get_models())
        chain = model[cfg["chain_id"]]

        # Collect atoms for neighbor search. Restrict to the selected model so
        # multi-model files (e.g., NMR ensembles) do not report the same
        # residue once per model.
        all_atoms = list(model.get_atoms())
        ns = NeighborSearch(all_atoms)

        # Pick a reference atom (first residue in chain that has the requested atom)
        ref_atom = None
        for residue in chain.get_residues():
            if cfg["atom_name"] in residue:
                ref_atom = residue[cfg["atom_name"]]
                break
        if ref_atom is None:
            raise RuntimeError(
                f"No atom '{cfg['atom_name']}' found in chain {cfg['chain_id']}"
            )

        cutoff = float(cfg["distance_cutoff"])
        neighbors = ns.search(ref_atom.coord, cutoff, level="R")  # residues within cutoff

        results = []
        for res in neighbors:
            # Skip hetero/water if desired; here we keep everything and report identifiers
            res_id = res.get_id()  # (hetflag, resseq, icode)
            results.append(
                {
                    "chain_id": res.get_parent().id,
                    "resname": res.get_resname(),
                    "resseq": int(res_id[1]),
                    "icode": (res_id[2] or "").strip(),
                }
            )

        out_path = Path(cfg["output_path"])
        out_path.parent.mkdir(parents=True, exist_ok=True)
        with out_path.open("w", encoding="utf-8") as f:
            json.dump(
                {
                    "input_path": cfg["input_path"],
                    "reference": {
                        "chain_id": cfg["chain_id"],
                        "atom_name": cfg["atom_name"],
                        "cutoff": cutoff,
                    },
                    "neighbor_residues": results,
                },
                f,
                ensure_ascii=False,
                indent=2,
            )


    if __name__ == "__main__":
        main()
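
The script above reads its settings from config/task_config.json. A minimal sketch of producing and checking such a file follows. The key names are the ones the script reads; the values (file paths, chain ID, atom name, cutoff) are placeholders, and write_task_config and validate_task_config are hypothetical helpers that the script itself does not define.

```python
import json
from pathlib import Path

# Keys read by the example script; everything else here is illustrative.
REQUIRED_KEYS = {
    "input_path",
    "format",
    "chain_id",
    "atom_name",
    "distance_cutoff",
    "output_path",
}
SUPPORTED_FORMATS = {"pdb", ".pdb", "cif", "mmcif", ".cif", ".mmcif"}


def validate_task_config(cfg):
    """Hypothetical check: raise ValueError if the config cannot drive the script."""
    missing = REQUIRED_KEYS - cfg.keys()
    if missing:
        raise ValueError(f"Missing config keys: {sorted(missing)}")
    if str(cfg["format"]).lower() not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {cfg['format']}")
    if float(cfg["distance_cutoff"]) <= 0:
        raise ValueError("distance_cutoff must be positive")


def write_task_config(config_path, cfg):
    """Hypothetical helper: validate the config, then write it as JSON."""
    validate_task_config(cfg)
    path = Path(config_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2)
    return path


# Placeholder values; replace with real paths and identifiers.
example_cfg = {
    "input_path": "data/example.pdb",
    "format": "pdb",
    "chain_id": "A",
    "atom_name": "CA",
    "distance_cutoff": 8.0,
    "output_path": "output/neighbors.json",
}
```

Calling write_task_config("config/task_config.json", example_cfg) before running the script would create the file at the path the script expects.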
