Abstract
Density Functional Theory (DFT) gauge-including atomic orbital (GIAO) calculations remain the de facto reference for predicting NMR chemical shifts in organic molecules, but their O(N⁴)–O(N⁵) scaling makes them prohibitive for the 10⁴–10⁶-molecule libraries that drive modern medicinal chemistry. We introduce Larmor, a family of E(3)-equivariant graph neural networks that predict 1H, 13C, 15N, and 19F isotropic shielding constants directly from 3D molecular conformers. Trained on 1.2M curated experimental shifts spanning 480k unique structures from nmrshiftdb2, BMRB, and CSC databases, Larmor-100M (100M parameters) achieves 0.131 ppm 1H MAE and 1.04 ppm 13C MAE on the held-out nmrshiftdb2 test split, surpassing DFT B3LYP/PBE0 + GIAO and all prior ML methods. A single inference takes 18 ms on a T4 GPU and 120 ms on CPU, enabling real-time spectral verification during structural elucidation.
1. Introduction
NMR spectroscopy is the single most important structural verification tool in synthetic organic chemistry. A typical drug-discovery cycle involves dozens of synthesized analogues, each requiring 1H and 13C NMR confirmation that the intended product was made. Computational prediction of chemical shifts has therefore been a long-standing target - both as a structure-elucidation aid and as a virtual screening tool to rule out candidates whose predicted spectra would be inconsistent with desired motifs.
The reigning standard, DFT-based GIAO calculations at the B3LYP or PBE0 level, achieves typical 1H MAEs of 0.18–0.25 ppm and 13C MAEs of 1.5–2.5 ppm after empirical scaling correction. Crucially, a single GIAO calculation on a 30-heavy-atom drug-like molecule takes 3–30 minutes per conformer at modest basis sets, rising to hours at triple-zeta. At that rate, screening 100,000 candidate structures would take roughly 5,000–50,000 hours of compute even with a single conformer each, which is impractical for routine use.
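The "empirical scaling correction" mentioned above is commonly implemented as a linear regression between computed shieldings and experimental shifts. The following is a minimal sketch of that idea, assuming a simple least-squares fit; the function names and the toy numbers are illustrative and are not the pipeline or data behind the DFT figures quoted here.

```python
import numpy as np

def fit_linear_scaling(sigma_calc, delta_exp):
    """Least-squares fit of sigma_calc = slope * delta_exp + intercept."""
    slope, intercept = np.polyfit(delta_exp, sigma_calc, deg=1)
    return slope, intercept

def shielding_to_shift(sigma_calc, slope, intercept):
    """Invert the fit to turn a computed shielding into a scaled shift."""
    return (np.asarray(sigma_calc) - intercept) / slope

# Toy calibration set (synthetic values for illustration only, not real data).
delta_exp = np.array([0.9, 2.1, 3.7, 7.3])      # experimental shifts, ppm
sigma_calc = 31.0 - 1.05 * delta_exp            # synthetic "computed" shieldings, ppm

slope, intercept = fit_linear_scaling(sigma_calc, delta_exp)
delta_scaled = shielding_to_shift(sigma_calc, slope, intercept)
mae = float(np.mean(np.abs(delta_scaled - delta_exp)))
```

Because the toy shieldings are an exact linear function of the shifts, the fit recovers the slope and intercept used to generate them; with real data the residual after scaling is what the MAE figures summarize.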
Recent ML approaches - SchNet+NMR (2020), CASCADE (2022), NMRgnn (2024), DetaNet (2024) - have closed much of the accuracy gap. The remaining limitations are: (i) all are trained on calculated DFT shifts rather than experimental, baking in DFT's systematic errors; (ii) most rely on invariant graph features and miss geometry-dependent through-space contributions; (iii) none condition on solvent, despite chemical shifts varying by 0.05–0.30 ppm with solvent polarity.
Larmor addresses all three. We train on a curated set of 1.2M experimental shifts from nmrshiftdb2, BMRB, and Caltech/Stanford internal collections; we use full E(3)-equivariant attention over geometric features; and we condition on solvent identity, pH, and reference compound via learned embeddings.
2. Method
2.1 Architecture
Larmor takes as input a 3D molecular conformer (atoms with positions in ℝ³) and predicts a per-atom isotropic shielding constant σ_iso, which is then converted to a chemical shift δ via a learned reference correction. The backbone is a stack of N E(3)-equivariant attention blocks operating jointly on scalar and equivariant features:
$$h_i^{(\ell+1)} = h_i^{(\ell)} + \sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(\ell)} \, \phi_m\!\left(h_j^{(\ell)}, \lVert r_{ij} \rVert, \hat{r}_{ij}\right)$$
where $r_{ij} = r_j - r_i$ is the displacement vector, the attention weights $\alpha_{ij}$ are computed from invariant features only (preserving equivariance), and $\phi_m$ is a tensor-product update that mixes scalar and L=1 vector channels via Clebsch–Gordan coefficients. Output shielding is read from the L=0 (scalar) channel and is invariant by construction to molecular rotation.
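To make the update rule concrete, here is a heavily simplified NumPy sketch of one such block, restricted to scalar (L=0) and vector (L=1) channels. It is not the Larmor implementation: the attention logits, the exponential radial envelope, the weight matrices `w_s` and `w_v`, and the cutoff are all assumptions chosen for illustration. For L ≤ 1 the Clebsch–Gordan contractions reduce to a dot product (vector × vector → scalar) and a scalar gate on a vector (scalar × vector → vector), which is what the sketch uses.

```python
import numpy as np

def equivariant_block(h, v, pos, w_s, w_v, cutoff=4.0):
    """One simplified update. h: (n, d) scalars, v: (n, d, 3) vectors, pos: (n, 3)."""
    n, d = h.shape
    r = pos[None, :, :] - pos[:, None, :]              # r[i, j] = r_j - r_i
    dist = np.linalg.norm(r, axis=-1)                  # invariant distances
    mask = (dist < cutoff) & ~np.eye(n, dtype=bool)    # neighbourhood N(i)
    r_hat = r / np.where(dist > 0, dist, 1.0)[..., None]

    # Attention weights from invariant quantities only (scalars and distances).
    logits = np.where(mask, h @ h.T / np.sqrt(d) - dist, -1e9)
    logits = logits - logits.max(axis=1, keepdims=True)
    alpha = np.exp(logits) * mask
    alpha = alpha / np.maximum(alpha.sum(axis=1, keepdims=True), 1e-12)

    radial = np.exp(-dist)[..., None]                  # (n, n, 1) distance envelope
    # L=1 x L=1 -> L=0: dot product of neighbour vectors with the unit displacement.
    proj = np.einsum("jdc,ijc->ijd", v, r_hat)
    msg_s = radial * (np.tanh(h @ w_s)[None, :, :] + proj)
    # L=0 x L=1 -> L=1: scalars gate the unit displacement; vectors pass through.
    gate = (h @ w_v)[None, :, :, None]
    msg_v = radial[..., None] * (gate * r_hat[:, :, None, :] + v[None])

    h_new = h + np.einsum("ij,ijd->id", alpha, msg_s)
    v_new = v + np.einsum("ij,ijdc->idc", alpha, msg_v)
    return h_new, v_new

# Check the symmetry claim on random inputs (hypothetical sizes).
rng = np.random.default_rng(0)
n, d = 6, 4
h = rng.normal(size=(n, d))
v = rng.normal(size=(n, d, 3))
pos = rng.normal(size=(n, 3))
w_s, w_v = rng.normal(size=(d, d)), rng.normal(size=(d, d))

rot, _ = np.linalg.qr(rng.normal(size=(3, 3)))         # random orthogonal matrix
h1, v1 = equivariant_block(h, v, pos, w_s, w_v)
h2, v2 = equivariant_block(h, v @ rot.T, pos @ rot.T, w_s, w_v)
scalars_invariant = np.allclose(h1, h2)
vectors_equivariant = np.allclose(v1 @ rot.T, v2)
```

Applying an orthogonal transform to the inputs leaves the scalar channel unchanged and transforms the vector channel in the same way, which is the property that lets a shielding read from the scalar channel be rotation-invariant.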
2.2 Training data
The training corpus combines:
- nmrshiftdb2 (open): 47,235 compounds, 412,891 shifts
- BMRB ALATIS (open): 71,800 compounds, 587,302 shifts
- CSC small-molecule subset (licensed): 89,400 compounds, 158,200 shifts
- Internal Rasyn collection: 18,300 medicinal-chemistry compounds with multi-solvent shifts (44,600 shifts)
All shifts are referenced to TMS for 1H/13C, NH3 for 15N, and CFCl3 for 19F. Conformers are generated with RDKit ETKDG-v3, and predictions for the five lowest-energy conformers are averaged at inference.
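The conformer-averaging step can be sketched as follows. This assumes a plain unweighted mean over the five lowest-energy conformers (the text does not say whether any Boltzmann weighting is applied) and takes conformer energies and per-conformer predictions as given; all numbers are hypothetical.

```python
import numpy as np

def ensemble_average(energies, per_conformer_shifts, k=5):
    """Average per-atom predictions over the k lowest-energy conformers."""
    energies = np.asarray(energies, dtype=float)
    shifts = np.asarray(per_conformer_shifts, dtype=float)   # (n_conf, n_atoms)
    lowest = np.argsort(energies)[:k]
    return shifts[lowest].mean(axis=0)

# Seven hypothetical conformers, two atoms each.
energies = [3.2, 0.0, 1.1, 5.7, 0.4, 2.0, 9.9]               # relative energies
shifts = np.array([[7.0 + 0.1 * c, 1.0 + 0.01 * c] for c in range(7)])
avg = ensemble_average(energies, shifts)
```

Here the two highest-energy conformers (indices 3 and 6) are dropped before averaging.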
2.3 Model variants
| Variant | Layers | d_model | Params | Tier |
|---|---|---|---|---|
| Larmor-50M | 10 | 512 | 52.3M | Free / API |
| Larmor-100M | 14 | 768 | 102.7M | Pro |
3. Results
We evaluate on the standard nmrshiftdb2 held-out test split (4,712 compounds, 38,991 1H shifts, 19,403 13C shifts) using a strict 80/10/10 scaffold-based split to prevent leakage. All ML baselines were re-trained on our identical training set for a fair comparison; DFT numbers come from B3LYP/6-31G(d,p) on the same molecules with linear-scaling correction.
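A scaffold-based split keeps every molecule that shares a scaffold in the same partition, so the test set contains no scaffold seen in training. The sketch below shows one common greedy way to do this, assuming scaffold keys (for example Bemis–Murcko scaffold SMILES) have already been computed; the function name, the greedy strategy, and the toy identifiers are assumptions, not necessarily the procedure used for the split described above.

```python
from collections import defaultdict

def scaffold_split(scaffold_of, fractions=(0.8, 0.1, 0.1)):
    """Assign whole scaffold groups to train/val/test, largest groups first."""
    groups = defaultdict(list)
    for mol_id, scaffold in scaffold_of.items():
        groups[scaffold].append(mol_id)

    total = len(scaffold_of)
    targets = [f * total for f in fractions]
    splits = ([], [], [])
    # Each group goes to the split that is currently furthest below its target.
    for scaffold in sorted(groups, key=lambda s: (-len(groups[s]), s)):
        deficits = [targets[i] - len(splits[i]) for i in range(3)]
        splits[deficits.index(max(deficits))].extend(groups[scaffold])
    return splits

# Hypothetical molecule IDs mapped to hypothetical scaffold keys.
scaffold_of = {f"mol{i}": f"scaf{i % 7}" for i in range(20)}
train, val, test = scaffold_split(scaffold_of)
```

With few, large scaffold groups the realized fractions can deviate noticeably from 80/10/10; the guarantee is only that no scaffold straddles two partitions.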
4. Speed
Wall-clock per-molecule inference time, measured on a 30-heavy-atom drug-like molecule (averaged over 1,000 trials). Times include conformer generation. DFT timings are for a single conformer at the indicated basis set on an Intel Xeon Gold 6248R workstation; ML timings are on CPU (i9-12900K) and a single T4 GPU.
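As a rough back-of-the-envelope illustration of what these timings mean for library screening, the snippet below uses the 18 ms (T4) and 120 ms (CPU) per-molecule figures quoted in the abstract and the 3–30 minute per-conformer DFT range from the introduction. It assumes strictly sequential processing and, for DFT, a single conformer per molecule.

```python
def screening_hours(n_molecules, seconds_per_molecule):
    """Total wall-clock hours to process a library sequentially."""
    return n_molecules * seconds_per_molecule / 3600.0

library = 100_000
gpu_hours = screening_hours(library, 0.018)         # 18 ms per molecule (T4 GPU)
cpu_hours = screening_hours(library, 0.120)         # 120 ms per molecule (CPU)
dft_hours = (screening_hours(library, 3 * 60),      # 3 min per conformer
             screening_hours(library, 30 * 60))     # 30 min per conformer
```

Under these assumptions a 100,000-molecule library takes about half an hour on the GPU and a little over three hours on CPU, against 5,000–50,000 hours for DFT.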
5. Ablation Studies
We measure the contribution of each architectural component by progressively adding features to a plain invariant-GNN baseline. Each row reports 1H MAE on the validation set after retraining for 50 epochs with identical hyperparameters except for the indicated change.
| Configuration | 1H MAE (ppm) | Δ vs. previous row (ppm) |
|---|---|---|
| Invariant GNN baseline (no geometry) | 0.298 | - |
| + E(3)-equivariant attention | 0.211 | −0.087 |
| + Distance-decay edge features | 0.182 | −0.029 |
| + Solvent conditioning token | 0.166 | −0.016 |
| + Hybridization-aware atom embeddings | 0.148 | −0.018 |
| + Multi-task pretraining (1H+13C+15N+19F) | 0.139 | −0.009 |
| + Conformer ensemble (top-5 ETKDG) | 0.131 | −0.008 |
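The per-row deltas and the overall improvement can be recomputed directly from the MAE column of the table above; the short check below is only arithmetic on those reported values.

```python
# 1H MAE values (ppm), copied row by row from the ablation table.
maes = [0.298, 0.211, 0.182, 0.166, 0.148, 0.139, 0.131]

# Change relative to the previous row, rounded to the table's precision.
deltas = [round(b - a, 3) for a, b in zip(maes, maes[1:])]

# Fraction of the baseline error removed by the full configuration.
total_reduction = (maes[0] - maes[-1]) / maes[0]
```

The full stack removes roughly 56% of the baseline error, with just over half of that absolute gain coming from the switch to E(3)-equivariant attention.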
6. Error Analysis
We bucketed the 1,727 test shifts by chemical environment and measured MAE within each bucket to identify systematic failure modes.
Two failure modes dominate the long tail. Strained rings exhibit large shielding swings due to anomalous geometric effects on σ_aniso that are underrepresented in training data. Acidic/exchangeable protons (OH, NH) genuinely have no fixed chemical shift - their position depends on temperature, concentration, and trace water - and so a 0.35 ppm MAE is close to the experimental reproducibility floor.
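The bucketed analysis amounts to a group-by over per-shift absolute errors. A minimal sketch, assuming each test shift has already been labelled with an environment; the bucket names and values below are hypothetical and are not taken from the evaluation.

```python
from collections import defaultdict

def mae_by_bucket(records):
    """records: iterable of (bucket, predicted_ppm, observed_ppm)."""
    errors = defaultdict(list)
    for bucket, pred, obs in records:
        errors[bucket].append(abs(pred - obs))
    return {b: sum(e) / len(e) for b, e in errors.items()}

records = [  # hypothetical values for illustration only
    ("aromatic CH", 7.30, 7.26),
    ("aromatic CH", 7.10, 7.18),
    ("exchangeable OH/NH", 4.90, 5.30),
    ("exchangeable OH/NH", 2.10, 1.90),
]
per_bucket = mae_by_bucket(records)
```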
7. Applications in the Rasyn Platform
Larmor powers three production features inside Rasyn:
Spectra Back-Calculator
Given a measured 1H NMR peak list and a candidate structure, Larmor predicts the expected spectrum, runs Hungarian alignment against the observation, and emits a structural-consistency verdict in under 2 seconds.
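The alignment step can be illustrated with SciPy's `linear_sum_assignment`, which solves the same assignment problem as the Hungarian algorithm. This is a sketch under stated assumptions: the cost is the absolute shift difference, the verdict is a simple threshold on the mean deviation, and the 0.15 ppm threshold and peak values are hypothetical rather than taken from the product.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_peaks(predicted, observed):
    """Minimum-cost one-to-one matching of predicted to observed shifts (ppm)."""
    predicted, observed = np.asarray(predicted), np.asarray(observed)
    cost = np.abs(predicted[:, None] - observed[None, :])
    rows, cols = linear_sum_assignment(cost)
    pairs = [(int(i), int(j)) for i, j in zip(rows, cols)]
    return pairs, float(cost[rows, cols].mean())

def verdict(mean_abs_dev, threshold=0.15):
    # threshold is a hypothetical cut-off, not a value from the text
    return "consistent" if mean_abs_dev <= threshold else "inconsistent"

pairs, mad = align_peaks([7.26, 3.71, 1.25], [1.22, 7.30, 3.65])
result = verdict(mad)
```

Each pair `(i, j)` matches predicted peak `i` to observed peak `j`; a production version would also need to handle unequal peak counts, multiplicities, and overlapping signals.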
Retrosynthesis pre-screening
Each candidate route from RetroTransformer v2 is scored by predicting the NMR of all intermediate products and flagging steps where measurement and prediction disagree by more than 3σ. This catches isomerization and rearrangement byproducts upstream of the LC-MS step.
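The 3σ flagging rule can be sketched as a per-step threshold test. The text does not define σ, so the sketch assumes a single per-nucleus prediction uncertainty `sigma_ppm` and peak lists that have already been aligned; the step names and values are hypothetical.

```python
def flag_steps(steps, sigma_ppm, n_sigma=3.0):
    """Return names of steps with any |measured - predicted| > n_sigma * sigma."""
    flagged = []
    for name, predicted, measured in steps:
        worst = max(abs(m - p) for p, m in zip(predicted, measured))
        if worst > n_sigma * sigma_ppm:
            flagged.append(name)
    return flagged

steps = [  # hypothetical, already-aligned 1H peak lists in ppm
    ("step 1", [7.26, 3.71], [7.30, 3.65]),
    ("step 2", [5.80, 2.10], [6.45, 2.12]),
]
flagged = flag_steps(steps, sigma_ppm=0.15)
```

With σ = 0.15 ppm the threshold is 0.45 ppm, so only the second step, whose worst peak is off by 0.65 ppm, is flagged.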
Structure elucidation copilot
Given an unknown spectrum and a SMILES candidate set, Larmor ranks candidates by the likelihood that their predicted spectrum matches the observation under realistic noise. It is now used by 7 medicinal chemistry teams as a sanity check before committing to a structural assignment.
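One simple way to realize such a ranking is an independent Gaussian noise model over matched peaks. The sketch below assumes that model, a hypothetical noise width of 0.15 ppm, and equal-length peak lists paired in sorted order; the actual noise model used by the copilot is not described in the text, and the candidate keys stand in for SMILES strings.

```python
import math

def log_likelihood(predicted, observed, noise_ppm=0.15):
    """Gaussian log-likelihood of observed peaks given predicted ones.

    Peaks are paired in sorted order, which is optimal for equal-length
    one-dimensional lists under a squared-error cost.
    """
    if len(predicted) != len(observed):
        return -math.inf
    norm = math.log(noise_ppm * math.sqrt(2 * math.pi))
    ll = 0.0
    for p, o in zip(sorted(predicted), sorted(observed)):
        ll += -0.5 * ((o - p) / noise_ppm) ** 2 - norm
    return ll

def rank_candidates(predictions, observed):
    """predictions: {candidate: predicted peak list}. Best candidate first."""
    return sorted(predictions,
                  key=lambda c: log_likelihood(predictions[c], observed),
                  reverse=True)

observed = [1.22, 3.65, 7.30]
predictions = {            # hypothetical predicted shifts per candidate
    "candidate_A": [1.25, 3.71, 7.26],
    "candidate_B": [1.10, 2.30, 3.66],
}
ranking = rank_candidates(predictions, observed)
```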
8. Conclusion
Larmor closes the practical gap between DFT-quality NMR shielding prediction and the throughput requirements of modern medicinal chemistry. By training on experimental shifts with explicit solvent conditioning and using full E(3)-equivariant attention over 3D geometry, Larmor-100M achieves 0.131 ppm 1H MAE and 1.04 ppm 13C MAE on nmrshiftdb2 - surpassing both DFT B3LYP/PBE0 + GIAO and all prior ML methods at ~10⁴× the speed. The remaining error budget is dominated by strained-ring through-space anisotropy and exchangeable-proton dynamics - both targets for our v2 release with explicit conformer ensemble averaging and chemical-exchange modelling.
Larmor is exposed inside Rasyn as the spectra back-calculator, the retrosynthesis pre-screening filter, and the structure-elucidation copilot. The 50M variant is available on the free API tier; the 100M variant is included in Pro.