Name: PubChem Compound Search
Author: jaechang-hits

Overview

PubChem is the world's largest freely available chemical database, with 110M+ compounds. This skill covers searching compounds by name, structure, or identifier; retrieving molecular properties; performing similarity and substructure searches; and accessing bioactivity data through PubChemPy (a Python wrapper) or the PUG-REST API (direct HTTP).

When to Use

Looking up a compound by name, CAS number, or SMILES to get its PubChem CID and properties
Retrieving molecular properties (molecular weight, LogP, TPSA, H-bond counts) for known compounds
Finding structurally similar compounds via Tanimoto similarity search
Searching for compounds containing a specific substructure (pharmacophore screening)
Converting between chemical identifier formats (name ↔ CID ↔ SMILES ↔ InChI)
Accessing bioactivity screening data (assay results, active/inactive status)
Batch property comparison across a set of drug candidates
For local molecular computation (fingerprints, descriptors, 3D conformers), use rdkit instead
For querying multiple databases (UniProt, KEGG, ChEMBL) in one workflow, use a multi-database query tool instead
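
As a minimal sketch of the name-to-properties lookup listed above, the following talks to PUG-REST directly using only the standard library. The helper names (`property_url`, `parse_properties`, `fetch_properties`) are illustrative and not part of any library. The URL layout assumes the PUG-REST pattern `compound/<namespace>/<identifier>/property/<list>/JSON`, and the response is assumed to carry its rows under `PropertyTable` / `Properties`.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def property_url(identifier, namespace="name",
                 properties=("MolecularWeight", "XLogP", "TPSA")):
    """Build a PUG-REST property URL (hypothetical helper)."""
    # Percent-encode the identifier so spaces and SMILES characters are URL-safe.
    ident = quote(str(identifier), safe="")
    return f"{BASE}/compound/{namespace}/{ident}/property/{','.join(properties)}/JSON"


def parse_properties(payload):
    """Return the list of per-compound property dicts, or [] if none are present."""
    return payload.get("PropertyTable", {}).get("Properties", [])


def fetch_properties(identifier, namespace="name", **kwargs):
    """Perform the HTTP request; requires network access."""
    with urlopen(property_url(identifier, namespace, **kwargs), timeout=30) as resp:
        return parse_properties(json.load(resp))
```

Calling `fetch_properties("aspirin")` would return one dict per matching compound, each including a `CID` key alongside the requested properties. Keeping URL construction and parsing separate from the network call makes both easy to check offline.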

Key Parameters

Parameter	Function	Default	Range / Options	Effect
`namespace`	`get_compounds`	`"cid"`	`"name"`, `"cid"`, `"smiles"`, `"inchi"`, `"formula"`	Identifier type for search
`searchtype`	`get_compounds`	`None`	`"similarity"`, `"substructure"`	Type of structure search
`Threshold`	similarity search	`90`	`0`-`100`	Tanimoto similarity cutoff (%)
`MaxRecords`	structure search	`None`	`1`-`10000`	Maximum results returned
`properties`	`get_properties`	required	See API reference	Which molecular properties to retrieve
`record_type`	`download`	`"2d"`	`"2d"`, `"3d"`	Structure dimensionality
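
The sketch below shows one way the `searchtype`, `Threshold`, and `MaxRecords` rows above could be combined into a single call. `structure_search_kwargs` and `search_by_structure` are hypothetical helpers. The range checks mirror the table, and the code assumes PubChemPy forwards extra keyword arguments of `get_compounds` to PUG-REST as request parameters.

```python
def structure_search_kwargs(searchtype, threshold=90, max_records=None):
    """Validate structure-search options and return keyword arguments (hypothetical helper)."""
    if searchtype not in ("similarity", "substructure"):
        raise ValueError(f"unsupported searchtype: {searchtype!r}")
    kwargs = {"searchtype": searchtype}
    if searchtype == "similarity":
        # Threshold is the Tanimoto cutoff in percent and only applies to similarity searches.
        if not 0 <= threshold <= 100:
            raise ValueError("Threshold must be between 0 and 100")
        kwargs["Threshold"] = threshold
    if max_records is not None:
        if not 1 <= max_records <= 10000:
            raise ValueError("MaxRecords must be between 1 and 10000")
        kwargs["MaxRecords"] = max_records
    return kwargs


def search_by_structure(smiles, searchtype, threshold=90, max_records=None):
    """Run the search through PubChemPy; requires the package and network access."""
    import pubchempy as pcp  # third-party, imported lazily so validation works without it

    options = structure_search_kwargs(searchtype, threshold, max_records)
    return pcp.get_compounds(smiles, "smiles", **options)
```

For example, `search_by_structure("CC(=O)OC1=CC=CC=C1C(=O)O", "similarity", threshold=85, max_records=20)` would request at most 20 compounds at or above 85% Tanimoto similarity to aspirin.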

Troubleshooting

Problem	Cause	Solution
`IndexError: list index out of range`	No compounds found for query	Check spelling; try alternative names or CID
Request timeout (>30s)	Large similarity/substructure search	Reduce `MaxRecords`; PubChemPy handles async polling automatically
Empty property values (`None`)	Property not available for this compound	Check if property exists before use: `if comp.xlogp is not None`
`HTTP 503 Service Unavailable`	Rate limit exceeded	Add `time.sleep(0.25)` between requests; max 5 req/sec
`BadRequestError`	Invalid SMILES or identifier	Validate SMILES syntax; use canonical SMILES from RDKit
Formula search returns too many hits	Common formula shared by many isomers	Use SMILES or InChI for more specific searches
Bioactivity API returns empty	Compound has no bioassay data	Not all compounds have been tested; check PubChem web interface
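
Two of the fixes above, guarding against empty result lists and spacing out requests, can be sketched with small wrappers. `first_or_none` and `call_with_throttle` are hypothetical helpers. The 503 check assumes `fetch` raises `urllib.error.HTTPError` (as a direct `urllib` request would); PubChemPy raises its own exception classes, so the `except` clause would need adapting for it.

```python
import time
from urllib.error import HTTPError


def first_or_none(results):
    """Return the first hit, or None instead of raising IndexError on an empty list."""
    return results[0] if results else None


def call_with_throttle(fetch, *args, min_interval=0.25, retries=3, sleep=time.sleep):
    """Call fetch(*args), pausing before each attempt and retrying on HTTP 503."""
    for attempt in range(retries):
        sleep(min_interval)  # a 0.25 s pause keeps the rate under 5 requests per second
        try:
            return fetch(*args)
        except HTTPError as err:
            # Re-raise anything other than 503, and the final 503 once retries run out.
            if err.code != 503 or attempt == retries - 1:
                raise
            sleep(min_interval * 2 ** (attempt + 1))  # back off longer after each 503
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `time.sleep` applies.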
