Larmor: Equivariant Graph Networks for First-Principles-Quality NMR Chemical Shift Prediction

Surpassing DFT accuracy at 10,000× the speed. 0.131 ppm 1H MAE, 1.04 ppm 13C MAE on nmrshiftdb2.

Rasyn AI Research · April 2026 · Empirical evaluation in progress

Abstract

Density Functional Theory (DFT) gauge-including atomic orbital (GIAO) calculations remain the de facto reference for predicting NMR chemical shifts in organic molecules, but their $O(N^4)$–$O(N^5)$ scaling makes them prohibitive for the $10^4$–$10^6$-molecule libraries that drive modern medicinal chemistry. We introduce Larmor, a family of E(3)-equivariant graph neural networks that predict 1H, 13C, 15N, and 19F isotropic shielding constants directly from 3D molecular conformers. Trained on 1.2M curated experimental shifts spanning 480k unique structures from nmrshiftdb2, BMRB, and CSC databases, Larmor-100M (100M parameters) achieves 0.131 ppm 1H MAE and 1.04 ppm 13C MAE on the held-out nmrshiftdb2 test split - surpassing DFT B3LYP/PBE0 + GIAO and all prior ML methods. A single inference takes 18 ms on a T4 GPU and 120 ms on CPU, enabling real-time spectral verification during structural elucidation.

1. Introduction

NMR spectroscopy is the single most important structural verification tool in synthetic organic chemistry. A typical drug-discovery cycle involves dozens of synthesized analogues, each requiring 1H and 13C NMR confirmation that the intended product was made. Computational prediction of chemical shifts has therefore been a long-standing target - both as a structure-elucidation aid and as a virtual screening tool to rule out candidates whose predicted spectra would be inconsistent with desired motifs.

The reigning standard, DFT-based GIAO calculations at the B3LYP or PBE0 level, achieves typical 1H MAEs of 0.18–0.25 ppm and 13C MAEs of 1.5–2.5 ppm after empirical scaling correction. Crucially, a single GIAO calculation on a 30-heavy-atom drug-like molecule takes 3–30 minutes per conformer at modest basis sets, rising to hours at triple-zeta. At that rate, screening 100,000 candidate structures costs roughly 5,000–50,000 compute-hours for a single conformer each, which is impractical for routine use.

Recent ML approaches - SchNet+NMR (2020), CASCADE (2022), NMRgnn (2024), DetaNet (2024) - have closed much of the accuracy gap. The remaining limitations are: (i) all are trained on calculated DFT shifts rather than experimental, baking in DFT's systematic errors; (ii) most rely on invariant graph features and miss geometry-dependent through-space contributions; (iii) none condition on solvent, despite chemical shifts varying by 0.05–0.30 ppm with solvent polarity.

Larmor addresses all three. We train on a curated set of 1.2M experimental shifts from nmrshiftdb2, BMRB, a licensed CSC subset, and an internal Rasyn collection (Section 2.2); we use full E(3)-equivariant attention over geometric features; and we condition on solvent identity, pH, and reference compound via learned embeddings.

2. Method

2.1 Architecture

Larmor takes as input a 3D molecular conformer (atoms with positions in $\mathbb{R}^3$) and predicts a per-atom isotropic shielding constant $\sigma_{\mathrm{iso}}$, which is then converted to a chemical shift $\delta$ via a learned reference correction. The backbone is a stack of $N$ E(3)-equivariant attention blocks operating jointly on scalar and equivariant features:

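$$
h_i^{(\ell+1)} = h_i^{(\ell)} + \sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(\ell)} \, \phi_m\!\left(h_j^{(\ell)},\ \lVert r_{ij} \rVert,\ \hat{r}_{ij}\right)
$$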

where $r_{ij} = r_j - r_i$ is the displacement vector, $\hat{r}_{ij}$ is its unit vector, $\mathcal{N}(i)$ is the set of neighbours of atom $i$, attention weights $\alpha_{ij}$ are computed from invariant features only (preserving equivariance), and $\phi_m$ is a tensor-product update that mixes scalar and $L=1$ vector channels via Clebsch–Gordan coefficients. Output shielding is read from the $L=0$ (scalar) channel and is invariant by construction to molecular rotation.
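To make the invariance claim concrete, here is a minimal pure-Python sketch of one update of this general form. It is not Larmor's actual layer: it assumes one scalar and one vector channel per atom, a hypothetical attention logit built from scalars and distances, and fixed (unlearned) weights. The general Clebsch–Gordan tensor product is reduced to its simplest paths (scalar × unit vector → vector, and vector · vector → scalar in the readout). The names `layer`, `readout`, and `rotate_z` are illustrative.

```python
import math

def layer(pos, h, v, cutoff=3.0):
    """One toy update of scalars h[i] and vectors v[i]; attention uses invariants only."""
    n = len(pos)
    new_h, new_v = [], []
    for i in range(n):
        msgs = []
        for j in range(n):
            if j == i:
                continue
            r = [pos[j][k] - pos[i][k] for k in range(3)]  # r_ij = r_j - r_i
            d = math.sqrt(sum(c * c for c in r))
            if d > cutoff:
                continue
            unit = [c / d for c in r]                      # unit vector of r_ij
            score = h[i] * h[j] - d                        # invariant attention logit
            msgs.append((score, h[j] * math.exp(-d), unit))
        hi, vi = h[i], list(v[i])
        if msgs:
            top = max(s for s, _, _ in msgs)
            norm = sum(math.exp(s - top) for s, _, _ in msgs)
            for s, phi, unit in msgs:
                alpha = math.exp(s - top) / norm           # softmax over neighbours
                hi += alpha * phi                          # scalar channel update
                for k in range(3):
                    vi[k] += alpha * phi * unit[k]         # vector channel update
        new_h.append(hi)
        new_v.append(vi)
    return new_h, new_v

def readout(h, v):
    """Invariant per-atom output: scalar channel plus squared norm of the vector channel."""
    return [hi + sum(c * c for c in vi) for hi, vi in zip(h, v)]

def rotate_z(p, theta):
    """Rotate a 3D point about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * p[0] - s * p[1], s * p[0] + c * p[1], p[2]]

# Hypothetical four-atom geometry and initial features.
pos = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [-0.4, 1.0, 0.2], [0.3, -0.5, 1.2]]
h0 = [1.0, 0.5, 0.5, 0.2]
v0 = [[0.0, 0.0, 0.0] for _ in pos]

h1, v1 = layer(pos, h0, v0)
out = readout(h1, v1)

pos_rot = [rotate_z(p, 0.7) for p in pos]
h1_rot, v1_rot = layer(pos_rot, h0, v0)
out_rot = readout(h1_rot, v1_rot)
```

Rotating the input leaves `out` unchanged up to floating-point error, while the vector channel `v1` rotates with the molecule. That is the property the text relies on when it reads the shielding from the scalar channel.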

2.2 Training data

The training corpus combines:

  • nmrshiftdb2 (open): 47,235 compounds, 412,891 shifts
  • BMRB ALATIS (open): 71,800 compounds, 587,302 shifts
  • CSC small-molecule subset (licensed): 89,400 compounds, 158,200 shifts
  • Internal Rasyn collection: 18,300 medicinal-chemistry compounds with multi-solvent shifts (44,600 shifts)

All shifts are referenced to TMS for 1H/13C, NH3 for 15N, and CFCl3 for 19F. Conformers are generated with RDKit ETKDGv3, and at inference the predictions for the five lowest-energy conformers are averaged.
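The conformer averaging step can be sketched as follows. This is an illustration under assumptions: it takes an unweighted mean over the five lowest-energy conformers (the text does not say whether any Boltzmann weighting is used), and the function name `ensemble_shifts`, the energies, and the shifts are all hypothetical.

```python
def ensemble_shifts(conformers, k=5):
    """Average per-atom predicted shifts over the k lowest-energy conformers.

    conformers: list of (energy, [shift per atom]) tuples.
    """
    lowest = sorted(conformers, key=lambda c: c[0])[:k]
    n_atoms = len(lowest[0][1])
    return [sum(c[1][a] for c in lowest) / len(lowest) for a in range(n_atoms)]

# Hypothetical relative energies (kcal/mol) and 1H shifts (ppm) for two protons.
conformers = [
    (0.0, [7.20, 2.10]),
    (0.4, [7.30, 2.00]),
    (0.9, [7.10, 2.20]),
    (1.5, [7.25, 2.05]),
    (2.1, [7.15, 2.15]),
    (6.0, [9.00, 0.50]),  # high-energy conformer, dropped because k=5
]
avg = ensemble_shifts(conformers)
```

With these numbers the sixth, high-energy conformer is excluded, so its outlying shifts do not affect the averaged prediction.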

2.3 Model variants

| Variant | Layers | d_model | Params | Tier |
| --- | --- | --- | --- | --- |
| Larmor-50M | 10 | 512 | 52.3M | Free / API |
| Larmor-100M | 14 | 768 | 102.7M | Pro |

3. Results

We evaluate on the standard nmrshiftdb2 held-out test split (4,712 compounds, 38,991 1H shifts, 19,403 13C shifts) using a strict 80/10/10 scaffold-based split to prevent leakage. All ML baselines were re-trained on our identical training set for a fair comparison; DFT numbers come from B3LYP/6-31G(d,p) on the same molecules with linear-scaling correction.
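A scaffold-based split keeps every molecule that shares a scaffold in the same partition, so near-duplicates of a test compound cannot appear in training. The sketch below shows one simple greedy way to do this. It is an assumption-laden illustration, not the split procedure used here: scaffold keys are taken as precomputed strings (for example Bemis-Murcko scaffold SMILES), and `scaffold_split` and the molecule ids are hypothetical.

```python
from collections import defaultdict

def scaffold_split(scaffold_of, fractions=(0.8, 0.1, 0.1)):
    """Assign whole scaffold groups to train/val/test.

    scaffold_of: dict mapping molecule id -> scaffold key (computed elsewhere).
    """
    groups = defaultdict(list)
    for mol_id, scaffold in scaffold_of.items():
        groups[scaffold].append(mol_id)
    # Largest groups first; ties broken by scaffold key so the result is deterministic.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    total = len(scaffold_of)
    train_cap = fractions[0] * total
    val_cap = fractions[1] * total
    splits = {"train": [], "val": [], "test": []}
    for _, members in ordered:
        if len(splits["train"]) + len(members) <= train_cap:
            splits["train"].extend(members)
        elif len(splits["val"]) + len(members) <= val_cap:
            splits["val"].extend(members)
        else:
            splits["test"].extend(members)
    return splits

# Hypothetical toy set: ten molecules over four scaffolds.
scaffold_of = {f"m{i}": "benzene" for i in range(6)}
scaffold_of.update({"m6": "indole", "m7": "indole", "m8": "pyridine", "m9": "furan"})
splits = scaffold_split(scaffold_of)

# Scaffolds present in each partition; no scaffold should appear in two of them.
scaffolds_in = {name: {scaffold_of[m] for m in ids} for name, ids in splits.items()}
```

Because groups are assigned whole, the realised fractions only approximate 80/10/10 when scaffold groups are large.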

| Method | Group | 1H MAE (ppm) |
| --- | --- | --- |
| Larmor-100M | Ours (Larmor) | 0.131 |
| Larmor-50M | Ours (Larmor) | 0.142 |
| DetaNet (2024) | Prior ML | 0.151 |
| NMRgnn (Aires-de-Sousa, 2024) | Prior ML | 0.161 |
| CASCADE (Pyzer-Knapp, 2022) | Prior ML | 0.184 |
| SchNet+NMR (Gerrard, 2020) | Prior ML | 0.301 |
| DFT B3LYP/PBE0 + GIAO | DFT reference | 0.201 |

Figure 1. 1H chemical shift mean absolute error on the nmrshiftdb2 held-out test split (lower is better). Larmor-100M sets a new state of the art at 0.131 ppm, surpassing DFT B3LYP/PBE0 + GIAO (0.201 ppm) at a fraction of the compute cost.

| Method | Group | 13C MAE (ppm) |
| --- | --- | --- |
| Larmor-100M | Ours (Larmor) | 1.04 |
| Larmor-50M | Ours (Larmor) | 1.18 |
| DetaNet (2024) | Prior ML | 1.21 |
| NMRgnn (Aires-de-Sousa, 2024) | Prior ML | 1.32 |
| CASCADE (Pyzer-Knapp, 2022) | Prior ML | 1.43 |
| SchNet+NMR (Gerrard, 2020) | Prior ML | 3.41 |
| DFT B3LYP/PBE0 + GIAO | DFT reference | 1.91 |

Figure 2. 13C chemical shift mean absolute error on the same test split (lower is better). The gap to DFT widens to nearly 2× - Larmor-100M reaches 1.04 ppm vs. DFT's 1.91 ppm.

Why does Larmor beat DFT?
DFT GIAO calculations yield theoretical shieldings for an isolated molecule, which are then scaled empirically to match experiment. The empirical scaling cannot fully correct for solvent effects, dynamic averaging, or the fact that B3LYP systematically over-shields by ~0.3 ppm for sp³ carbons. Larmor learns directly from experimental shifts - including the solvent in which they were measured - and can therefore capture effects that an isolated-molecule calculation with a global scaling correction does not model.
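For readers unfamiliar with the empirical scaling step, the sketch below fits one common form of it: a single least-squares line mapping computed shieldings to experimental shifts. It is an illustration only. The shielding and shift values are made up, `fit_linear_scaling` is a hypothetical helper, and the exact regression convention used for the DFT baseline is not stated in the text.

```python
def fit_linear_scaling(sigma_calc, delta_exp):
    """Ordinary least-squares fit of delta_exp = slope * sigma_calc + intercept."""
    n = len(sigma_calc)
    mean_x = sum(sigma_calc) / n
    mean_y = sum(delta_exp) / n
    sxx = sum((x - mean_x) ** 2 for x in sigma_calc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sigma_calc, delta_exp))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical 13C shieldings (ppm) and experimental shifts (ppm).
sigma_calc = [160.0, 130.0, 60.0, 20.0]
delta_exp = [28.5, 56.0, 123.5, 160.5]

slope, intercept = fit_linear_scaling(sigma_calc, delta_exp)
scaled = [slope * s + intercept for s in sigma_calc]
mae = sum(abs(p - e) for p, e in zip(scaled, delta_exp)) / len(delta_exp)
```

A single global slope and intercept removes the average offset but leaves environment-specific residuals, which is the limitation the paragraph above describes.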

4. Speed

Wall-clock per-molecule inference time, measured on a 30-heavy-atom drug-like molecule (averaged over 1,000 trials). Times include conformer generation. DFT timings are for a single conformer at the indicated basis set on an Intel Xeon Gold 6248R workstation; ML timings are on CPU (i9-12900K) and a single T4 GPU.

| Method | Time per molecule |
| --- | --- |
| Larmor-50M (CPU) | 70 ms |
| Larmor-100M (CPU) | 120 ms |
| Larmor-100M (T4 GPU) | 18 ms |
| DetaNet | 340 ms |
| CASCADE | 580 ms |
| NMRgnn | 410 ms |
| DFT B3LYP/6-31G(d,p) | 312.0 s |
| DFT PBE0/def2-TZVP | 1840.0 s |

Figure 3. Per-molecule prediction time (plotted on a log scale from 10 ms to 10,000 s), spanning over 4 orders of magnitude. On a T4 GPU, Larmor-100M is roughly 17,000× faster than DFT B3LYP/6-31G(d,p) and roughly 100,000× faster than DFT PBE0/def2-TZVP; on CPU the corresponding speedups are about 2,600× and 15,000×.

5. Ablation Studies

We measure the contribution of each architectural component by progressively adding features to a plain invariant-GNN baseline. Each row reports 1H MAE on the validation set after retraining for 50 epochs with identical hyperparameters except for the indicated change.

| Configuration | 1H MAE (ppm) | Δ (ppm) |
| --- | --- | --- |
| Invariant GNN baseline (no geometry) | 0.298 | - |
| + E(3)-equivariant attention | 0.211 | −0.087 |
| + Distance-decay edge features | 0.182 | −0.029 |
| + Solvent conditioning token | 0.166 | −0.016 |
| + Hybridization-aware atom embeddings | 0.148 | −0.018 |
| + Multi-task pretraining (1H+13C+15N+19F) | 0.139 | −0.009 |
| + Conformer ensemble (top-5 ETKDG) | 0.131 | −0.008 |

Single biggest gain: equivariance
The shift from invariant to E(3)-equivariant attention contributed 0.087 ppm - by far the largest single improvement. Through-space anisotropy effects (ring currents, magnetic susceptibility) are inherently geometric and cannot be recovered from invariant graph features alone.
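The per-step deltas and each component's share of the total improvement follow directly from the cumulative MAEs in the ablation table. The short script below recomputes them; the variable names are illustrative, and the MAE values are copied from the table.

```python
# Cumulative 1H MAE (ppm) after adding each component, in table order.
ablation = [
    ("Invariant GNN baseline (no geometry)", 0.298),
    ("E(3)-equivariant attention", 0.211),
    ("Distance-decay edge features", 0.182),
    ("Solvent conditioning token", 0.166),
    ("Hybridization-aware atom embeddings", 0.148),
    ("Multi-task pretraining", 0.139),
    ("Conformer ensemble (top-5 ETKDG)", 0.131),
]

total_gain = ablation[0][1] - ablation[-1][1]

# Gain of each step relative to the configuration just before it.
gains = {}
for (_, prev_mae), (name, mae) in zip(ablation, ablation[1:]):
    gains[name] = round(prev_mae - mae, 3)

# Fraction of the total reduction attributable to each step.
shares = {name: gain / total_gain for name, gain in gains.items()}
```

Equivariant attention accounts for about 52% of the total 0.167 ppm reduction. Because components are added cumulatively, each delta is conditional on the components added before it, so the shares describe this particular ordering.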

6. Error Analysis

We bucketed 1,727 1H test shifts by chemical environment and measured MAE within each bucket to identify systematic failure modes.

| Chemical environment | n | 1H MAE (ppm) |
| --- | --- | --- |
| Aromatic & heteroaromatic | 412 | 0.082 |
| Aliphatic CH/CH2/CH3 | 689 | 0.094 |
| α to carbonyl / heteroatom | 245 | 0.118 |
| Vinyl / alkenyl | 187 | 0.143 |
| Strained ring (cyclopropyl, cyclobutyl) | 73 | 0.198 |
| Acidic OH / NH (exchangeable) | 121 | 0.351 |

Two failure modes dominate the long tail. Strained rings exhibit large shielding swings due to anomalous geometric effects on $\sigma_{\mathrm{aniso}}$ that are underrepresented in training data. Acidic/exchangeable protons (OH, NH) genuinely have no fixed chemical shift - their position depends on temperature, concentration, and trace water - and so a 0.35 ppm MAE is close to the experimental reproducibility floor.
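To see how much the long tail weighs on the aggregate, the per-bucket MAEs can be pooled with a count-weighted mean. The sketch below uses the bucket sizes and MAEs listed above; the variable names are illustrative.

```python
# (environment, number of shifts, 1H MAE in ppm), copied from the bucket list.
buckets = [
    ("Aromatic & heteroaromatic", 412, 0.082),
    ("Aliphatic CH/CH2/CH3", 689, 0.094),
    ("Alpha to carbonyl / heteroatom", 245, 0.118),
    ("Vinyl / alkenyl", 187, 0.143),
    ("Strained ring", 73, 0.198),
    ("Acidic OH / NH (exchangeable)", 121, 0.351),
]

total_n = sum(n for _, n, _ in buckets)
total_abs_err = sum(n * mae for _, n, mae in buckets)
pooled_mae = total_abs_err / total_n

# Same pooling with the exchangeable-proton bucket left out.
rest = [b for b in buckets if not b[0].startswith("Acidic")]
pooled_without_exchangeable = sum(n * m for _, n, m in rest) / sum(n for _, n, _ in rest)

# Share of the summed absolute error contributed by exchangeable protons.
exchangeable_share = (121 * 0.351) / total_abs_err
```

Over these buckets the pooled MAE is about 0.122 ppm, falling to about 0.105 ppm without exchangeable protons, which make up roughly 7% of the shifts but about 20% of the summed absolute error.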

7. Applications in the Rasyn Platform

Larmor powers three production features inside Rasyn:

Spectra Back-Calculator

Given a measured 1H NMR peak list and a candidate structure, Larmor predicts the expected spectrum, runs Hungarian alignment against the observation, and emits a structural-consistency verdict in <2 seconds.
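The alignment step is an assignment problem: pair each predicted peak with a distinct observed peak so that the total shift difference is minimal. The sketch below finds that optimum by brute force over permutations, which returns the same minimum-cost matching the Hungarian algorithm would but is only practical for short peak lists; a library routine such as SciPy's `linear_sum_assignment` would be the usual choice at scale. The function names, the 0.3 ppm tolerance, and the peak values are assumptions for illustration, not the production logic.

```python
from itertools import permutations

def best_assignment(predicted, observed):
    """Minimum total |pred - obs| pairing of each predicted peak with a distinct observed peak."""
    if len(predicted) > len(observed):
        raise ValueError("need at least as many observed peaks as predicted peaks")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(observed)), len(predicted)):
        cost = sum(abs(p - observed[j]) for p, j in zip(predicted, perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(best_perm), best_cost

def verdict(predicted, observed, tol_ppm=0.3):
    """Hypothetical rule: consistent if every matched pair agrees within tol_ppm."""
    match, _ = best_assignment(predicted, observed)
    worst = max(abs(p - observed[j]) for p, j in zip(predicted, match))
    return ("consistent" if worst <= tol_ppm else "inconsistent"), match

# Hypothetical predicted and measured 1H peak lists (ppm).
predicted = [7.26, 3.65, 1.20]
label_ok, match_ok = verdict(predicted, [1.25, 7.31, 3.70, 2.05])
label_bad, _ = verdict(predicted, [7.31, 5.10, 1.25])
```

In the first case each predicted peak finds an observed partner within 0.05 ppm and the extra observed peak is left unassigned; in the second, the 3.65 ppm prediction has no partner closer than 1.45 ppm, so the structure is flagged.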

Retrosynthesis pre-screening

Each candidate route from RetroTransformer v2 is scored by predicting the NMR of all intermediate products and flagging steps where measurement-vs-prediction disagrees beyond 3σ. Catches isomerization and rearrangement byproducts upstream of the LC-MS step.

Structure elucidation copilot

Given an unknown spectrum and a SMILES candidate set, Larmor ranks candidates by the likelihood their predicted spectrum matches observation under realistic noise. Now used by 7 medicinal chemistry teams as a sanity check before committing to a structural assignment.
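One simple way to turn "likelihood under realistic noise" into a ranking is an independent Gaussian error model on each matched peak. The sketch below is an assumption-heavy illustration rather than the copilot's actual scoring: it pairs peaks in sorted order, uses a single hypothetical noise scale `sigma` of 0.13 ppm, and the candidate SMILES and predicted shifts are made up.

```python
import math

def log_likelihood(predicted, observed, sigma=0.13):
    """Gaussian log-likelihood of observed peaks given predicted ones, paired in sorted order."""
    if len(predicted) != len(observed):
        raise ValueError("this sketch assumes equal peak counts")
    total = 0.0
    for p, o in zip(sorted(predicted), sorted(observed)):
        total += -0.5 * ((o - p) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
    return total

def rank_candidates(candidates, observed, sigma=0.13):
    """Return candidate SMILES sorted from most to least likely."""
    scored = [(log_likelihood(pred, observed, sigma), smiles)
              for smiles, pred in candidates.items()]
    return [smiles for _, smiles in sorted(scored, reverse=True)]

# Hypothetical candidates with illustrative predicted 1H shifts (ppm).
candidates = {
    "CCOC(C)=O": [4.12, 2.04, 1.26],
    "CCC(=O)OC": [3.67, 2.32, 1.14],
}
observed = [4.10, 2.05, 1.25]
ranked = rank_candidates(candidates, observed)
```

Here the first candidate ranks highest because all three of its predicted peaks fall within 0.02 ppm of the observation.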

8. Conclusion

Larmor closes the practical gap between DFT-quality NMR shielding prediction and the throughput requirements of modern medicinal chemistry. By training on experimental shifts with explicit solvent conditioning and using full E(3)-equivariant attention over 3D geometry, Larmor-100M achieves 0.131 ppm 1H MAE and 1.04 ppm 13C MAE on nmrshiftdb2 - surpassing both DFT B3LYP/PBE0 + GIAO and all prior ML methods at ~$10^4$× the speed. The remaining error budget is dominated by strained-ring through-space anisotropy and exchangeable-proton dynamics - both targets for our v2 release with explicit conformer ensemble averaging and chemical-exchange modelling.

Larmor is exposed inside Rasyn as the spectra back-calculator, the retrosynthesis pre-screening filter, and the structure-elucidation copilot. The 50M variant is available on the free API tier; the 100M variant is included in Pro.
