Use when working with SMILES, SMARTS, SMIRKS, molecular fingerprints, or cheminformatics fundamentals. Covers the complete Daylight theory: molecular graph representation, SMILES specification, SMARTS query language, SMIRKS reaction transforms, and fingerprint-based similarity. Based on the Daylight Theory Manual.
The Daylight Theory Manual is the canonical reference for the molecular languages and descriptors underlying modern cheminformatics: SMILES, SMARTS, SMIRKS, and fingerprints. These are not Daylight-proprietary; they are de facto industry standards implemented in RDKit, Open Babel, CDK, and most other major cheminformatics toolkits.
| Topic | Reference |
|---|---|
| Molecular graph model, aromaticity, chirality, SSSR, reaction representation | references/molecules.md |
| SMILES syntax: atoms, bonds, branches, rings, stereochemistry, reactions | references/smiles.md |
| SMARTS query language: primitives, operators, recursive SMARTS, reaction queries | references/smarts.md |
| SMIRKS reaction transforms: atom maps, grammar, stereochemistry | references/smirks.md |
| Fingerprints: structural keys, path-based FP, folding, Tanimoto, Tversky, all similarity measures | references/fingerprints.md |
| Chemical database concepts: hash tables, identifiers, in-memory search, pools, hitlists | references/cheminformatics-databases.md |
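
As a rough illustration of the fingerprint topics listed above (folding, Tanimoto, Tversky), the following is a minimal pure-Python sketch. It assumes a fingerprint is represented as a set of on-bit indices; the function names `tanimoto`, `tversky`, and `fold`, the example bit sets, and the choice to return 0.0 for two empty fingerprints are illustrative assumptions, not part of any toolkit's API.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient: bits in common / bits set in either fingerprint."""
    union = len(a | b)
    # Assumption: two empty fingerprints score 0.0 (a convention chosen here).
    return len(a & b) / union if union else 0.0


def tversky(a: set[int], b: set[int], alpha: float, beta: float) -> float:
    """Tversky index: c / (alpha * only_a + beta * only_b + c)."""
    common = len(a & b)
    denom = alpha * len(a - b) + beta * len(b - a) + common
    return common / denom if denom else 0.0


def fold(bits: set[int], size: int) -> set[int]:
    """Fold a fingerprint of `size` bits in half by OR-ing the upper half onto the lower."""
    half = size // 2
    return {i % half for i in bits}


# Hypothetical 16-bit fingerprints, written as sets of on-bit positions.
fp_a = {1, 5, 9, 12}
fp_b = {1, 5, 10, 12, 14}

print(tanimoto(fp_a, fp_b))           # 3 common bits / 6 bits in the union
print(tversky(fp_a, fp_b, 1.0, 1.0))  # alpha = beta = 1 reduces to Tanimoto
print(tversky(fp_a, fp_b, 0.5, 0.5))  # alpha = beta = 0.5 reduces to Dice
print(sorted(fold(fp_a, 16)))         # bits 9 and 12 land on 1 and 4
```

Setting `alpha` and `beta` asymmetrically (for example 1.0 and 0.0) turns the Tversky index into a substructure-like measure that asks how much of one fingerprint is contained in the other.
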
| Language | Purpose | Example |
|---|---|---|
| SMILES | Encode a specific molecule | CC(=O)Oc1ccccc1C(=O)O |
| SMARTS | Describe a molecular pattern | [OH]c1ccccc1 (phenol) |
| SMIRKS | Encode a reaction transform | [C:1][Br:2]>>[C:1][I:2] |
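
The three examples in the table can be exercised with RDKit, which is one of the toolkits named earlier. This is a hedged sketch rather than a reference workflow: it assumes RDKit is installed (the import is guarded so the script still runs without it), `demo` is a hypothetical helper name, and the test molecules `Oc1ccccc1`, `CCO`, and `CCBr` are illustrative inputs that do not come from the manual. Note that RDKit's reaction handling may differ in detail from Daylight's SMIRKS semantics.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import AllChem
except ImportError:  # RDKit is an optional third-party dependency
    Chem = None


def demo():
    """Return a dict of results, or None when RDKit is not installed."""
    if Chem is None:
        return None

    # SMILES: each string encodes one specific molecule.
    aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
    phenol = Chem.MolFromSmiles("Oc1ccccc1")

    # SMARTS: a pattern that is matched against molecules.
    phenol_query = Chem.MolFromSmarts("[OH]c1ccccc1")

    # SMIRKS-style transform: atom maps tie reactant atoms to product atoms.
    rxn = AllChem.ReactionFromSmarts("[C:1][Br:2]>>[C:1][I:2]")
    product_sets = rxn.RunReactants((Chem.MolFromSmiles("CCBr"),))

    return {
        "phenol_matches": phenol.HasSubstructMatch(phenol_query),
        # Aspirin has no hydroxyl directly on the ring, so the pattern fails.
        "aspirin_matches": aspirin.HasSubstructMatch(phenol_query),
        # "O" parsed as SMILES is water: one oxygen plus two hydrogens.
        "water_atoms_with_h": Chem.AddHs(Chem.MolFromSmiles("O")).GetNumAtoms(),
        # "O" parsed as SMARTS matches any aliphatic oxygen, e.g. in ethanol.
        "ethanol_matches_O": Chem.MolFromSmiles("CCO").HasSubstructMatch(
            Chem.MolFromSmarts("O")
        ),
        "products": [Chem.MolToSmiles(p) for ps in product_sets for p in ps],
    }


if __name__ == "__main__":
    print(demo())
```

The same string can therefore mean different things depending on which parser reads it, which is why the sketch keeps `MolFromSmiles` and `MolFromSmarts` calls visibly separate.
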
O in SMARTS matches any aliphatic oxygen; in SMILES it means water.

- `rdkit`: implementation of these concepts in Python
- `scientific-skills:matchms`: spectrum similarity (same mathematical ideas)