Scripts for Wasserstein deconvolution of 1H NMR mixture spectra against reference spectra, reaction product prediction, time-series kinetics, and spectral plotting.
The agent should use this skill's scripts when:

- A workflow (reaction-to-nmr-quantification.md or nmr-reaction-kinetics.md) calls for deconvolution, product prediction, kinetics analysis, or spectral plotting.

For end-to-end workflows that chain this skill with other skills, see .agents/workflows/reaction-to-nmr-quantification.md and .agents/workflows/nmr-reaction-kinetics.md.
- chem-plot-digitizer skill for that step.
- chem-nmr-predict skill for that step.
- drug-db-pubchem skill for that step.

| Script | Purpose | Key Inputs | Key Outputs |
|---|---|---|---|
| predict_products.py | Predict reaction products via ReactionT5 (HuggingFace API) | --reactant_smiles, --reagent_smiles | JSON with predicted product SMILES |
| deconvolve.py | Wasserstein deconvolution of mixture against references | mixture file + reference files + --protons | proportions, Wasserstein distance, plot |
| kinetics.py | Time-series deconvolution across multiple time points | --refs, --timepoints, --times | kinetics.csv + kinetics_plot.png |
| plot.py | Overlay or stack NMR spectra for visual comparison | spectrum files + --labels | plot image |
| spectra.py | I/O utilities (imported by other scripts, not called directly) | -- | -- |
The agent should use this script to predict reaction products from reactant and reagent SMILES via the ReactionT5 model.
# Env: nmr-agent
export HF_TOKEN=<token>
python .agents/skills/chem-nmr-analysis/scripts/predict_products.py \
--reactant_smiles "C1CCC(=O)C1" \
--reagent_smiles "[BH3-]" \
--output <research_dir>/predicted_products.json
The agent should use this script to determine mole fractions of known components in a mixture spectrum via Wasserstein-distance deconvolution.
# Env: nmr-agent
python .agents/skills/chem-nmr-analysis/scripts/deconvolve.py \
mixture.csv ref_borneol.xy ref_isoborneol.xy \
--protons 18 18 \
--names "borneol" "isoborneol" \
--baseline-correct \
--plot <research_dir>/deconvolution_result.png \
--json
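To give some intuition for the fit-quality metric, the following minimal sketch computes the one-dimensional Wasserstein (earth mover's) distance between two spectra sampled on the same ascending ppm grid, after normalizing each to unit total intensity. It is illustrative only: it does not reproduce the algorithm in deconvolve.py (the --kappa denoising penalty in particular is not modeled), and the function name `wasserstein_1d` is hypothetical.

```python
from itertools import accumulate


def wasserstein_1d(ppm, intensity_a, intensity_b):
    """W1 distance between two non-negative spectra on a shared, ascending ppm grid."""
    total_a, total_b = sum(intensity_a), sum(intensity_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("each spectrum needs positive total intensity")
    # Normalize so each spectrum acts as a probability distribution over ppm.
    cdf_a = list(accumulate(v / total_a for v in intensity_a))
    cdf_b = list(accumulate(v / total_b for v in intensity_b))
    # In one dimension, W1 is the area between the two cumulative distributions.
    distance = 0.0
    for i in range(len(ppm) - 1):
        distance += abs(cdf_a[i] - cdf_b[i]) * (ppm[i + 1] - ppm[i])
    return distance


# Toy data (not real spectra): one peak, and the same peak moved by 0.1 ppm.
ppm = [1.0, 1.1, 1.2, 1.3]
peak = [0.0, 1.0, 0.0, 0.0]
shifted = [0.0, 0.0, 1.0, 0.0]
print(wasserstein_1d(ppm, peak, peak))
print(wasserstein_1d(ppm, peak, shifted))
```

In this toy case, identical spectra give a distance of zero, and moving a single peak by 0.1 ppm gives a distance of approximately 0.1. This is why a ppm offset between the mixture and a reference inflates the metric even when the composition is right.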
The agent should use this script when the user has crude NMR spectra recorded at multiple time points during a reaction.
# Env: nmr-agent
python .agents/skills/chem-nmr-analysis/scripts/kinetics.py \
--refs ref1.xy ref2.xy \
--timepoints t0.csv t10.csv t20.csv \
--times 0 10 20 \
--time_unit min \
--protons 18 18 \
--names "reactant" "product" \
--baseline_correct \
--output_dir <research_dir>/kinetics/
The agent should use this script to overlay or stack spectra for visual inspection before or after deconvolution.
# Env: nmr-agent
python .agents/skills/chem-nmr-analysis/scripts/plot.py \
mixture.csv ref_borneol.xy ref_isoborneol.xy \
--labels "Mixture" "borneol" "isoborneol" \
--title "Mixture vs References" \
--output <research_dir>/spectra_overview.png
| Argument | Required | Description |
|---|---|---|
| --protons | Yes | Number of 1H protons per molecule for each reference component. Critical for converting area fractions to mole fractions. The agent must look this up from the molecular formula or count from the SMILES. |
| --names | No | Human-readable labels matching the order of reference files. The agent should always provide these for interpretable output. |
| --baseline-correct | No | Shifts each spectrum so minimum intensity = 0. The agent should use this for digitized spectra or SPINUS-predicted spectra. |
| --kappa | No | Denoising penalty (default 0.25). The agent should not change this unless instructed. |
| --plot | No | Output plot path. The agent should always generate a plot. |
| --json | No | Emit machine-readable JSON output. The agent should always use this. |
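The role of --protons can be illustrated with a short sketch. A component's 1H signal area scales with its amount multiplied by its number of protons per molecule, so area fractions are divided by proton counts and renormalized to obtain mole fractions. How deconvolve.py performs this step internally is an assumption here; the function name and the numbers are hypothetical.

```python
def area_to_mole_fractions(area_fractions, protons):
    """Convert 1H signal-area fractions to mole fractions."""
    if len(area_fractions) != len(protons):
        raise ValueError("need one proton count per component")
    # Signal area is proportional to (moles x protons per molecule),
    # so dividing by the proton count leaves a value proportional to moles.
    per_molecule = [area / n for area, n in zip(area_fractions, protons)]
    total = sum(per_molecule)
    if total <= 0:
        raise ValueError("area fractions must sum to a positive value")
    return [value / total for value in per_molecule]


# Hypothetical example: equal signal areas, 10 H versus 20 H per molecule.
print(area_to_mole_fractions([0.5, 0.5], [10, 20]))
```

With equal areas, the component with half as many protons is present at twice the molar amount, so the result is roughly two thirds to one third. When all components have the same proton count (as in the borneol/isoborneol example, 18 and 18), area fractions and mole fractions coincide.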
The deconvolution output contains proportions and a Wasserstein distance (WD) indicating fit quality.
If/Then rules for Wasserstein distance:
If proportions do not sum to ~1.0 -- the agent should note that the "noise" fraction represents unmatched signal and explain what it might be.
Verification: After deconvolution, the agent must inspect the deconvolution plot, check the residual panel for large residuals, and verify that proportions are chemically reasonable. If results contradict known chemistry, the agent should flag this to the user rather than silently accepting.
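The numeric part of these checks could be sketched as below. The sketch takes plain values rather than parsing the script's JSON, because the JSON schema is not documented in this file. The 0.15 limit is the "very high" Wasserstein distance threshold from the troubleshooting table; the 0.05 tolerance on the sum and the function name are assumptions. It does not replace inspecting the plot and residual panel.

```python
def review_deconvolution(proportions, wasserstein_distance,
                         wd_limit=0.15, sum_tolerance=0.05):
    """Return a list of warnings for one deconvolution result."""
    warnings = []
    total = sum(proportions.values())
    # Proportions far from 1.0 mean part of the signal was not matched.
    if abs(total - 1.0) > sum_tolerance:
        warnings.append(f"proportions sum to {total:.2f}, not ~1.0")
    # A high Wasserstein distance means the fit should not be trusted.
    if wasserstein_distance > wd_limit:
        warnings.append(
            f"Wasserstein distance {wasserstein_distance:.2f} exceeds {wd_limit}"
        )
    return warnings


# Hypothetical values for illustration only.
print(review_deconvolution({"borneol": 0.60, "isoborneol": 0.38}, 0.04))
print(review_deconvolution({"borneol": 0.50, "isoborneol": 0.30}, 0.20))
```

An empty list means neither numeric rule fired; any warning should be reported to the user together with the likely causes rather than silently dropped.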
All spectrum files must be two-column numeric data (ppm, intensity):
- .csv -- comma-delimited (auto-detected)
- .xy -- tab-delimited (auto-detected)
- .tsv -- tab-delimited

If the user provides a Mnova export -- the agent should add --mnova flag to deconvolve.py.
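A quick way to confirm that a file meets the two-column requirement is sketched below. This is not the implementation in spectra.py, which is not shown in this document; `parse_spectrum` and `baseline_shift` are hypothetical helpers. The second function mirrors the documented behaviour of --baseline-correct (shift so the minimum intensity is 0).

```python
import re


def parse_spectrum(text):
    """Parse two-column (ppm, intensity) text with comma, tab, or space delimiters."""
    ppm, intensity = [], []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(fields)}"
            )
        # float() raises ValueError on non-numeric content such as a header row.
        ppm.append(float(fields[0]))
        intensity.append(float(fields[1]))
    return ppm, intensity


def baseline_shift(intensity):
    """Shift intensities so that the minimum value becomes 0."""
    lowest = min(intensity)
    return [value - lowest for value in intensity]


ppm, intensity = parse_spectrum("1.00,2.0\n1.10\t3.5\n")
print(ppm, baseline_shift(intensity))
```

A file that raises here (extra columns, a text header, non-numeric cells) is likely to need cleaning before it is passed to the scripts.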
All scripts in this skill use the nmr-agent conda environment:
mamba activate nmr-agent
Install: conda-envs/nmr-agent/install.sh
Required packages: numpy, scipy (>= 1.7), matplotlib, rdkit, requests, nmrsim, scikit-learn.
HF_TOKEN (for ReactionT5 product prediction): the agent should check whether HF_TOKEN is set before attempting product prediction. If it is not set, the agent should ask the user either to provide the token or to supply the product SMILES directly.
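The token check can be done before predict_products.py is invoked, for example with a small helper like the following. The helper name is hypothetical and the message text is only a suggestion.

```python
import os


def hf_token_available(environ=os.environ):
    """Return True when HF_TOKEN is set to a non-empty value."""
    return bool(environ.get("HF_TOKEN", "").strip())


if not hf_token_available():
    print("HF_TOKEN is not set: ask the user for a token or for product SMILES.")
```

Passing the environment as a parameter keeps the check easy to exercise without modifying the real process environment.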
| Failure | Symptom | Agent Action |
|---|---|---|
| SPINUS returns no atoms | chem-nmr-predict prints FAILED for a compound | The SMILES may be invalid or the molecule too large. The agent should verify the SMILES and retry, or ask the user for a measured reference spectrum. |
| ReactionT5 returns no products | predict_products.py returns empty products list | The agent should use its own chemistry knowledge to suggest products and ask the user to confirm. |
| Wasserstein distance very high (> 0.15) | Deconvolution result unreliable | Missing component, ppm offset, or baseline issue. The agent should investigate and not report proportions as reliable. |
| Proportions are all near zero except one | One component dominates | May be correct (e.g., >95% product), or may indicate missing starting material reference. The agent should check. |
| nmrsim simulation fails | Warning in chem-nmr-predict output | Falls back to stick spectrum (shifts only, no multiplet structure). The agent should note reduced accuracy of that reference. |
| Kinetics curves are non-monotonic | Composition jumps up and down over time | Likely a mislabeled time point, phasing issue, or missing component. The agent should investigate individual spectra. |
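The non-monotonic case in the last row can be screened for numerically. The sketch below counts direction changes in one component's proportion over time, ignoring changes smaller than a tolerance. It works on a plain list because the column layout of kinetics.csv is not documented here; the function name, the 0.02 tolerance, and the example values are assumptions.

```python
def direction_changes(values, tolerance=0.02):
    """Count sign flips between successive changes larger than the tolerance."""
    signs = []
    for earlier, later in zip(values, values[1:]):
        delta = later - earlier
        if abs(delta) > tolerance:
            signs.append(1 if delta > 0 else -1)
    # A flip is any pair of consecutive significant changes with opposite sign.
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical product proportions at five time points.
product = [0.05, 0.30, 0.22, 0.61, 0.80]
print(direction_changes(product))
```

A count of zero is consistent with a steadily rising or falling curve; a non-zero count points at the time points worth inspecting individually, here the third one.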
Author(s): Jesus Diaz Sanchez, Magdalena Lederbauer

Contact(s): GitHub @jdsanc, GitHub @mlederbauer