zeitge/loom

LOOM

LOOM is a desktop NMR analysis workstation written in Rust. It takes raw Bruker TopSpin (or JDX/JCAMP-DX) 2D spectra, filters them with an adaptive SVD-based phase-space filter, picks peaks, and assembles a molecular graph via a CASE (Computer-Assisted Structure Elucidation) pipeline. The result is a live molecular profile — atomic formula, DBE, functional groups, and a table of J-coupled multiplets — that updates as you load data.


Screenshots

¹H-1D view — multiplet deconvolution with ACS-format output and J-coupling extraction:

[Screenshot: ¹H-1D view, ibuprofen]

HMBC view, showing the filtered 2D spectrum with the assembled molecular profile (C₁₃H₁₈O₂; formula DBE 5 matches structure DBE 5):

[Screenshot: HMBC view, ibuprofen]


Features

| Category | Details |
| --- | --- |
| Spectral filter | Adaptive Von Neumann phase-space filter with Gavish–Donoho rank selection; 20–50× faster than a naive SVD sweep via adaptive early-exit and parallel row processing |
| 2D experiments | HSQC / HMQC, COSY / TOCSY, HMBC, NOESY / ROESY; each handled as a typed workspace |
| 1D experiments | ¹H-1D multiplet deconvolution (s / d / t / q / dd / m with J-coupling extraction); ¹³C-1D singlet picking |
| Peak picking | Blob detection on the filtered matrix; volume integration; proton count from companion ¹H-1D cross-projection |
| CASE engine | ¹³C-1D anchors the carbon skeleton; HSQC attaches proton shifts; HMBC adds long-range edges; COSY confirms short-range bonds; NOESY adds spatial restraints; all sources contribute independent confidence scores per node |
| Solvent exclusion | Automated detection of CDCl₃, DMSO-d₆, D₂O, MeOD, acetone-d₆, CD₂Cl₂ and suppression of their residual signals from peak lists, formula, and proton counts |
| Heteroatom inference | Infers O, N, S from exchangeable protons, carbonyl classification, and α-heteroatom carbons; refines the molecular formula accordingly |
| Formula & DBE | Unicode molecular formula (C₁₃H₁₈O₂); degree of unsaturation computed from both formula and observed structure; colour-coded agreement badge |
| ASV | Automated Structure Verification: paste a SMILES string, score it against the experimental NMR graph |
| Export | Filtered spectrum written to TopSpin pdata/999 (axes preserved, i32 LE binary); peak list as CSV |
| File formats | Bruker 2rr / 1r + procs / proc2s; JDX / JCAMP-DX (1D, 2D, peak table variants) |
| Auto-scan | One-click project-folder scan: pulse-programme heuristics assign roles automatically (HSQC, HMBC, COSY, NOESY, ¹H-1D, ¹³C-1D) and populate a sequential import queue |
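
To give a feel for the Gavish–Donoho rank selection named in the table, here is a minimal sketch of the published hard-threshold rule for an unknown noise level: keep the singular values above τ = ω(β) · median(σ), with ω(β) ≈ 0.56β³ − 0.95β² + 1.82β + 1.43 and β the matrix aspect ratio. The function names are hypothetical and the code is not taken from filter.rs; LOOM's adaptive filter may apply the rule differently.

```rust
/// Approximate Gavish–Donoho coefficient ω(β) for an unknown noise level.
/// β is the aspect ratio (shorter side / longer side), so 0 < β <= 1.
fn omega(beta: f64) -> f64 {
    0.56 * beta.powi(3) - 0.95 * beta.powi(2) + 1.82 * beta + 1.43
}

/// Median of a non-empty slice; the input is copied and sorted.
fn median(values: &[f64]) -> f64 {
    let mut v = values.to_vec();
    v.sort_by(|a, b| a.partial_cmp(b).expect("singular values must not be NaN"));
    let n = v.len();
    if n % 2 == 1 {
        v[n / 2]
    } else {
        0.5 * (v[n / 2 - 1] + v[n / 2])
    }
}

/// Number of singular values above the hard threshold τ = ω(β) · median(σ).
/// `singular_values` may be in any order; `rows` and `cols` give the matrix shape.
fn gavish_donoho_rank(singular_values: &[f64], rows: usize, cols: usize) -> usize {
    if singular_values.is_empty() || rows == 0 || cols == 0 {
        return 0;
    }
    let (short, long) = if rows <= cols { (rows, cols) } else { (cols, rows) };
    let beta = short as f64 / long as f64;
    let tau = omega(beta) * median(singular_values);
    singular_values.iter().filter(|&&s| s > tau).count()
}
```

For a square matrix ω(1) = 2.86, so a spectrum whose singular values are 100, 50, 1.2, 1.0, 0.9, 0.8, 0.7 has a median of 1.0, a threshold of 2.86, and a retained rank of 2.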

Building

Requires Rust stable 1.75+ and a desktop session (X11 or Wayland on Linux, native on macOS / Windows).

git clone <repo-url>
cd loom
cargo build --release
./target/release/loom

The window opens at 1280 × 800 px and is freely resizable.


Quick-start workflow

  1. Load → Project folder (auto-scan) — point to the root of a Bruker experiment series. LOOM scans every numbered subfolder, identifies experiment types from PULPROG, and presents a checklist. Click Start sequential import to queue them all.

  2. Alternatively, load individual spectra via Load → HSQC / COSY / HMBC / 1H-1D / 13C-1D.

  3. Use the Crystallization slider (right panel, 0–100 %) to control how aggressively the filter separates signal from noise. The default is 50 %; raise it for a cleaner filtered spectrum, or lower it to preserve weak cross-peaks.

  4. The View tabs in the menu bar switch between loaded experiments; the right panel updates to show the relevant analysis for the active view.

  5. Export to TopSpin writes pdata/999 next to the original pdata folder so TopSpin can display the filtered spectrum as a separate dataset.
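
The role assignment in step 1 can be pictured as a prefix match on the PULPROG name. The sketch below is an illustration only: the `Role` enum, the function names, and the prefix lists are assumptions based on common Bruker pulse-programme naming (for example hsqcedetgpsisp2.3, hmbcgplpndqf, zgpg30), not LOOM's actual heuristics. It also assumes the observe nucleus (NUC1) is available to separate ¹H-1D from ¹³C-1D.

```rust
/// Hypothetical experiment roles, mirroring the list in the auto-scan description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Role {
    Hsqc,
    Hmbc,
    Cosy,
    Noesy,
    Proton1D,
    Carbon1D,
    Unknown,
}

/// Strip whitespace and the angle brackets Bruker parameter files put around strings.
fn clean(value: &str) -> &str {
    value.trim().trim_matches(|c: char| c == '<' || c == '>')
}

/// True if `name` starts with any of the given prefixes.
fn has_prefix(name: &str, prefixes: &[&str]) -> bool {
    prefixes.iter().any(|p| name.starts_with(*p))
}

/// Guess the role from the pulse-programme name and the observe nucleus.
/// Prefix matching only: unusual or user-renamed programmes fall through to Unknown.
fn classify_pulprog(pulprog: &str, nucleus: &str) -> Role {
    let name = clean(pulprog).to_ascii_lowercase();
    let nuc = clean(nucleus);
    if has_prefix(&name, &["hsqc", "hmqc"]) {
        Role::Hsqc
    } else if has_prefix(&name, &["hmbc"]) {
        Role::Hmbc
    } else if has_prefix(&name, &["cosy", "mlev", "dipsi"]) {
        Role::Cosy // TOCSY variants are grouped with COSY, as in the feature table
    } else if has_prefix(&name, &["noesy", "roesy"]) {
        Role::Noesy
    } else if has_prefix(&name, &["zg"]) {
        match nuc {
            "1H" => Role::Proton1D,
            "13C" => Role::Carbon1D,
            _ => Role::Unknown,
        }
    } else {
        Role::Unknown
    }
}
```

A heuristic of this kind is cheap and transparent, but it misfires on names that reuse a 2D prefix for a 1D experiment, which is one reason a checklist for the user to confirm is useful before the import is queued.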


Reading the molecular profile

🧬 Molecular profile
C₁₃H₁₈O₂                      ← molecular formula
✓ DBE: 5 formula · 5 structure ← green = consistent
• 8 C protonated · 5 C quaternary
• 1 carbonyl (C=O)
• 1 aromatic ring(s)

Correlations (COSY · short)
  7.20 ↔ 7.10 (Δ= 0.10 ppm)

Multiplet deconvolution
  7.20 (d, J = 7.9 Hz, 2H)
  7.11 (d, J = 7.9 Hz, 2H)
  3.63 (q, J = 7.1 Hz, 1H)
  …

The DBE badge is the primary diagnostic. A red badge means either that the formula is missing a heteroatom (typically O or N) or that a ring or double bond is not captured in the observed structure. Either way, it signals that more data is needed: for example, HMBC with a longer mixing time to see through heteroatoms, or DEPT to confirm quaternary carbons.
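
The comparison behind the badge can be sketched with the standard degree-of-unsaturation formula, DBE = C − (H + X)/2 + N/2 + 1, checked against the rings and multiple bonds counted in the assembled graph. The type and function names below are hypothetical, and the 0.25 tolerance is an arbitrary choice for the illustration rather than LOOM's rule.

```rust
/// Hypothetical badge state: green when the two DBE values agree, red otherwise.
#[derive(Debug, PartialEq, Eq)]
enum DbeBadge {
    Consistent,
    Mismatch,
}

/// DBE from the molecular formula. Divalent atoms (O, S) do not contribute;
/// halogens count like hydrogen, and each nitrogen adds one half.
fn dbe_from_formula(c: u32, h: u32, n: u32, halogens: u32) -> f64 {
    c as f64 - (h + halogens) as f64 / 2.0 + n as f64 / 2.0 + 1.0
}

/// DBE from the observed structure: each ring and each double bond counts
/// once, each triple bond twice. Aromatic rings are counted in Kekulé form.
fn dbe_from_structure(rings: u32, double_bonds: u32, triple_bonds: u32) -> u32 {
    rings + double_bonds + 2 * triple_bonds
}

/// Compare the two values; the tolerance is an assumption of this sketch.
fn dbe_badge(formula_dbe: f64, structure_dbe: u32) -> DbeBadge {
    if (formula_dbe - structure_dbe as f64).abs() < 0.25 {
        DbeBadge::Consistent
    } else {
        DbeBadge::Mismatch
    }
}
```

For the profile shown above, C₁₃H₁₈O₂ gives 13 − 9 + 1 = 5 from the formula, and one aromatic ring (one ring plus three C=C) with one carbonyl gives 1 + 4 = 5 from the structure, so the badge is green. If the carbonyl were missing from the graph, the structure count would drop to 4 and the badge would turn red.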


Architecture

src/
├── main.rs            entry point, eframe bootstrap
├── types.rs           shared data structures (single source of truth)
├── filter.rs          LOOM filter — adaptive SVD, parallel via rayon
├── analysis.rs        peak detection, multiplet clustering, integration
├── multiplet_engine.rs 1H spin-system classification (s/d/t/q/dd/m)
├── case.rs            CASE engine v5 — molecular graph assembly
├── heteroatom.rs      heteroatom inference and formula refinement
├── solvent.rs         solvent detection and masking
├── validation.rs      cross-validation and consistency warnings
├── asv_engine.rs      Automated Structure Verification (SMILES scoring)
├── smiles.rs          SMILES parser → graph
├── bruker.rs          Bruker TopSpin I/O (pdata read/write)
├── jdx.rs             JDX / JCAMP-DX reader
├── app.rs             application state + background worker thread
└── ui.rs              egui rendering (presentation only)

tests/
├── ibuprofen_pipeline.rs   end-to-end integration test (synthetic ibuprofen dataset)
└── benchmark.rs            filter performance regression test

Data flow: bruker / jdx → Spectrum2D / Spectrum1D → worker thread (filter + analysis) → FilterResultMessage → LoomApp state → rebuild_graph_if_ready (CASE + heteroatom + validation) → ui renders.

The worker thread holds an Arc<Spectrum2D> and sends results back over a crossbeam channel; rapid slider changes coalesce — only the last pending job per role is executed.
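
The coalescing behaviour can be sketched as follows. To stay dependency-free, this illustration uses std::sync::mpsc in place of crossbeam-channel, a plain Vec<f64> in place of Spectrum2D, and a trivial sum in place of the filter. The `Role`, `Job`, and `JobResult` types and both function names are hypothetical, not LOOM's.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Role {
    Hsqc,
    Hmbc,
}

/// One filter request: which view it belongs to, the shared data, the slider value.
struct Job {
    role: Role,
    spectrum: Arc<Vec<f64>>,
    strength: u8,
}

struct JobResult {
    role: Role,
    strength: u8,
    sum: f64, // stand-in for the filtered spectrum
}

/// Drain everything already queued and keep only the newest job per role.
fn coalesce(first: Job, rx: &Receiver<Job>) -> Vec<Job> {
    let mut latest: HashMap<Role, Job> = HashMap::new();
    latest.insert(first.role, first);
    while let Ok(job) = rx.try_recv() {
        latest.insert(job.role, job); // a newer job replaces the older one
    }
    latest.into_values().collect()
}

/// Worker loop: block for a job, coalesce the backlog, process, send results back.
fn spawn_worker(rx: Receiver<Job>, tx: Sender<JobResult>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        while let Ok(first) = rx.recv() {
            for job in coalesce(first, &rx) {
                let sum: f64 = job.spectrum.iter().sum(); // placeholder for the filter
                let result = JobResult { role: job.role, strength: job.strength, sum };
                if tx.send(result).is_err() {
                    return; // the UI side has gone away
                }
            }
        }
    })
}
```

Sharing the spectrum through an Arc means that queuing a job copies a pointer rather than the matrix, and dropping superseded jobs keeps the slider responsive even when a single filter pass is slow.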


Running the tests

cargo test

The integration suite synthesises a complete ibuprofen dataset (¹H-1D, ¹³C-1D, HSQC, COSY, HMBC with CDCl₃/DMSO-d₆ residuals) and verifies that the pipeline recovers the correct formula (C₁₃H₁₈O₂), DBE (5), and carbon count across a range of edge cases.


Dependencies

| Crate | Purpose |
| --- | --- |
| eframe / egui | immediate-mode GUI |
| rayon | data-parallel row processing in the filter |
| faer | SVD and matrix operations |
| byteorder | Bruker binary file I/O |
| crossbeam-channel | worker thread communication |
| rfd | native file dialogs |

Known limitations

The pipeline is validated against ibuprofen (C₁₃H₁₈O₂, DBE=5) and a handful of other small molecules. It does not reliably handle all structures yet — the specific failure modes are documented below.

Formula and atom counting

  • Heteroatom inference (O, N, S) is heuristic: it works by recognising exchangeable protons, carbonyl chemical shifts, and α-heteroatom carbons. For molecules where the heteroatom signal is ambiguous or absent (e.g. ethers with no nearby ¹H, or tertiary amines), the formula will be under-counted.
  • Without a ¹³C-1D experiment, all carbons must be inferred from HSQC alone. Quaternary carbons (C=O, aromatic ipso, quaternary sp³) are invisible to HSQC, so the carbon count will be low and the formula will be wrong.
  • Proton counting depends on ¹H-1D integral ratios. Strongly overlapping multiplets (common in natural products and larger molecules) can confuse the integral normalisation, producing incorrect nH assignments.
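
The overlap problem in the last bullet is easy to see in a toy version of integral-ratio counting. The sketch below assumes the smallest positive integral corresponds to one proton and rounds every other integral relative to it. The function name, the reference choice, and the integrals in the example are invented for illustration; this is not LOOM's normalisation.

```rust
/// Assign integer proton counts from raw integrals, taking the smallest
/// positive integral as the 1H reference (an assumption of this sketch).
fn proton_counts(integrals: &[f64]) -> Vec<u32> {
    let reference = integrals
        .iter()
        .copied()
        .filter(|&a| a > 0.0)
        .fold(f64::INFINITY, f64::min);
    if !reference.is_finite() {
        // No positive integral to normalise against.
        return vec![0; integrals.len()];
    }
    integrals
        .iter()
        .map(|&a| (a / reference).round().max(0.0) as u32)
        .collect()
}
```

With well-separated signals, integrals of 2.04, 1.97, 1.00, 2.02, 0.98, 3.05, and 5.96 round to 2, 2, 1, 2, 1, 3, and 6 protons. If two 2H multiplets overlap and are integrated together, the same procedure reports a single 4H signal, and if the smallest signal is itself an overlap, every count is scaled wrongly.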

Peak picking and multiplicity

  • The multiplet classifier uses fixed intensity-ratio tolerances (±0.35 of the ideal Pascal's-triangle ratio). Broad or slightly asymmetric peaks — caused by conformational exchange, solvent effects, or imperfect phasing — may be classified as m (multiplet) even when the coupling pattern is clear by eye.
  • The t1-noise column filter removes HSQC and HMBC peaks that appear across more than ~5 % of ¹H columns. In crowded spectra this can occasionally suppress a real correlation if it happens to align with a noise ridge.
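
A fixed-tolerance classifier of the kind described in the first bullet might look like the sketch below. It assumes line intensities are normalised to the mean of the two outer lines and that the ±0.35 tolerance applies as an absolute deviation from each ideal ratio; both are guesses at the convention, and the names are hypothetical. The J value follows from the line spacing in ppm multiplied by the spectrometer frequency in MHz.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Multiplicity {
    S,
    D,
    T,
    Q,
    Dd,
    M,
}

/// Tolerance on each normalised line intensity (value quoted in the README).
const RATIO_TOLERANCE: f64 = 0.35;

/// True if the observed lines match the ideal ratios within the tolerance,
/// after normalising to the mean of the two outer lines.
fn fits_pattern(intensities: &[f64], ideal: &[f64]) -> bool {
    if intensities.is_empty() || intensities.len() != ideal.len() {
        return false;
    }
    let outer = 0.5 * (intensities[0] + intensities[intensities.len() - 1]);
    if outer <= 0.0 {
        return false;
    }
    intensities
        .iter()
        .zip(ideal)
        .all(|(&obs, &want)| (obs / outer - want).abs() <= RATIO_TOLERANCE)
}

/// Classify a group of resolved lines by count and intensity pattern.
fn classify(intensities: &[f64]) -> Multiplicity {
    match intensities.len() {
        1 => Multiplicity::S,
        2 if fits_pattern(intensities, &[1.0, 1.0]) => Multiplicity::D,
        3 if fits_pattern(intensities, &[1.0, 2.0, 1.0]) => Multiplicity::T,
        4 if fits_pattern(intensities, &[1.0, 3.0, 3.0, 1.0]) => Multiplicity::Q,
        4 if fits_pattern(intensities, &[1.0, 1.0, 1.0, 1.0]) => Multiplicity::Dd,
        _ => Multiplicity::M, // anything outside the tolerances
    }
}

/// Coupling constant in Hz from two line positions (ppm) and the field (MHz).
fn j_coupling_hz(ppm_a: f64, ppm_b: f64, spectrometer_mhz: f64) -> f64 {
    (ppm_a - ppm_b).abs() * spectrometer_mhz
}
```

Under these assumptions a slightly distorted triplet such as 1.0 : 2.2 : 0.9 still classifies as t, while a three-line pattern of 1.0 : 3.0 : 1.0 falls outside the tolerance and is reported as m, which is the fallback behaviour the bullet describes.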

Molecular size and complexity

  • Tested primarily on molecules with MW < ~400 Da. Larger molecules with many overlapping signals in the aliphatic region (0–2 ppm) tend to produce inflated peak counts and unreliable fragment assembly.
  • The aromatic-ring detector flags a fragment as aromatic when it contains at least one five- to seven-membered ring with aromatic δC values. Fused polycyclic systems and heteroaromatics (pyridine, furan, imidazole) are therefore often reported as a single aromatic fragment rather than as individual rings.

Structure verification (ASV)

  • The ASV engine scores a candidate SMILES against chemical-shift windows; it does not enumerate isomers or perform full COSY/HMBC connectivity matching. A high score means the chemical shifts are plausible, not that the structure is correct.
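
To make the caveat concrete, here is a minimal sketch of shift-window scoring. It assumes the candidate SMILES has already been reduced to a list of expected ¹³C windows, one per carbon environment, and scores the fraction of windows that can be matched to a distinct observed shift. The `ShiftWindow` type, the function name, and the greedy matching are assumptions for illustration, not the asv_engine.rs implementation.

```rust
/// Expected chemical-shift range (ppm) for one carbon environment of the candidate.
struct ShiftWindow {
    lo: f64,
    hi: f64,
}

/// Fraction of expected windows matched by a distinct observed shift (0.0 to 1.0).
/// Matching is greedy and ignores connectivity entirely.
fn asv_score(windows: &[ShiftWindow], observed: &[f64]) -> f64 {
    if windows.is_empty() {
        return 0.0;
    }
    let mut used = vec![false; observed.len()];
    let mut hits = 0usize;
    for w in windows {
        // First observed shift inside the window that is not already claimed.
        let hit = (0..observed.len())
            .find(|&i| !used[i] && observed[i] >= w.lo && observed[i] <= w.hi);
        if let Some(i) = hit {
            used[i] = true;
            hits += 1;
        }
    }
    hits as f64 / windows.len() as f64
}
```

Because the score only asks whether each window is occupied, two constitutional isomers with the same set of carbon environments receive the same score. That is exactly why a high value indicates plausible shifts rather than a verified structure.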

To-do

  • DEPT/DEPT-135 support — definitive CH / CH₂ / CH₃ / C assignment without relying on HSQC volume ratios
  • Robust proton counter for overlapping multiplets (numerical deconvolution instead of integral ratio comparison)
  • Improved heteroatom inference for ethers, tertiary amines, and halides
  • Full COSY/HMBC connectivity matching in ASV (graph isomorphism, not just shift window scoring)
  • Automatic chemical shift referencing (TMS / DSS / solvent residual anchoring)
  • macOS and Windows build verification (currently only tested on Linux/X11)

License

GPLv3
