Artificial intelligence evaluates chemical spectra in minutes
Researchers from the University of Jena, Helmholtz-Zentrum Berlin, and Zakodium develop AI system for chemical structure analysis
Determining which substance has actually been produced in a test tube or flask is one of the central tasks of chemistry. Particularly in the case of complex or novel compounds, however, this can be extremely time-consuming, even for experienced specialists. A research team from Friedrich Schiller University Jena, Helmholtz-Zentrum Berlin for Materials and Energy, the Helmholtz Institute for Polymers in Energy Applications Jena and the Swiss software company Zakodium Sàrl has now developed an artificial intelligence (AI) system that proposes suitable molecular structures from the raw data of spectroscopic measurements and assesses their plausibility. The system is openly accessible and has been presented in the journal Nature Communications.
Why structure elucidation is so challenging
“Anyone who synthesises a molecule must also prove its chemical structure,” says Dr Kevin Jablonka of the University of Jena. He adds: “To do this, chemists typically use analytical techniques such as nuclear magnetic resonance (NMR) spectroscopy, infrared spectroscopy and mass spectrometry. Each of these methods provides clues about the structure, but often only to a limited extent. The many individual measurement signals therefore form a kind of chemical puzzle that must be solved correctly.”
Structure analysis is often particularly challenging for novel molecules that have never been described before, especially because measurement data obtained in practice are frequently less than ideal. “Impurities in a substance can generate their own signals or overlap with the signals of the target compound,” explains Jablonka. “This is where our system has a particular strength: for proton NMR spectra, which are measured routinely, it can cope with impurities present in real samples.”
How SECS works
“The new system, SECS, combines two artificial intelligence approaches,” explains Adrian Mirza, first author of the study. “First, the model learns to translate spectra and molecular structures into a shared mathematical representation. An evolutionary algorithm then refines the results by modifying candidate molecules step by step, adding or removing atoms and bonds and repeatedly testing whether the outcome provides a better fit to the measurement data.”
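The refinement loop Mirza describes can be illustrated with a deliberately simplified sketch. This is not the SECS implementation: here a "molecule" is only a tuple of atom counts, the `embed` function is a hypothetical stand-in for the learned encoder, and the score is a plain negative squared distance. All names and parameters are assumptions chosen for illustration.

```python
import random

ELEMENTS = ("C", "H", "N", "O")  # positions in a candidate tuple (illustrative only)


def embed(candidate):
    # Toy stand-in for a learned encoder: the atom counts themselves.
    return [float(n) for n in candidate]


def score(candidate, spectrum_embedding):
    # Higher is better: negative squared distance in the shared space.
    return -sum((a - b) ** 2 for a, b in zip(embed(candidate), spectrum_embedding))


def mutate(candidate, rng):
    # Add or remove a single atom of a randomly chosen element.
    counts = list(candidate)
    i = rng.randrange(len(counts))
    counts[i] = max(0, counts[i] + rng.choice((-1, 1)))
    return tuple(counts)


def evolve(spectrum_embedding, start, generations=200, population_size=20, seed=0):
    rng = random.Random(seed)
    population = [start]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        # Elitist selection: parents compete with children,
        # so the best score in the population never gets worse.
        pool = set(population) | set(children)
        population = sorted(
            pool, key=lambda c: score(c, spectrum_embedding), reverse=True
        )[:population_size]
    return population


# Hypothetical target with ethanol-like counts (C2H6O); in practice the
# embedding would come from a measured spectrum, not from a known structure.
target = (2, 6, 0, 1)
start = (1, 1, 1, 1)
ranked = evolve(embed(target), start=start)
best = ranked[0]
```

A real system would mutate molecular graphs (atoms and bonds) rather than atom counts and would score candidates with a trained model, but the propose, score and select cycle has the same shape.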
The system ultimately produces a ranked list of possible structures, together with similarity scores based on the chemical context.
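How such a ranked list might be assembled can be sketched as follows, assuming that spectrum and candidate structures have already been mapped to vectors in a shared space. The use of cosine similarity, the candidate names and the vectors are all illustrative assumptions, not details taken from the publication.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two vectors; 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(spectrum_embedding, candidate_embeddings):
    # Score every candidate against the spectrum and sort best-first.
    scored = [
        (name, cosine_similarity(spectrum_embedding, emb))
        for name, emb in candidate_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings, made up for this example.
spectrum = [1.0, 0.0, 1.0]
candidates = {
    "candidate_a": [1.0, 0.0, 1.0],
    "candidate_b": [1.0, 1.0, 0.0],
    "candidate_c": [0.0, 1.0, 0.0],
}
ranking = rank_candidates(spectrum, candidates)
```

In this toy case `candidate_a` points in the same direction as the spectrum vector and ranks first, while `candidate_c` is orthogonal to it and ranks last.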
Comparable with experienced specialists
“In a benchmark test involving different spectroscopic methods, SECS identified the correct molecular structure as its top-ranked prediction in more than 80 per cent of cases,” says Jablonka, describing the system’s performance. It also proved capable of matching human experts in direct comparison. “In a pilot study, we asked chemists to solve 20 challenging NMR structure-elucidation problems,” Jablonka explains. The result: the AI achieved a level of performance comparable to that of the participating specialists.
“However, we do not see SECS as a replacement for human expertise,” Mirza emphasises. “The system can provide a highly useful second opinion.” If the proposed structures are plausible and receive high scores, this reinforces confidence in the interpretation. “If, on the other hand, the suggested structures differ substantially from the expected molecular structure, it may be worth taking a closer look,” Jablonka adds.
An open tool for research
The source code, model data and a test version of the application are publicly available. According to the researchers, the current web version is primarily designed for the direct evaluation of one-dimensional proton NMR raw data. Support for additional spectral types and more complex raw datasets is planned for future releases.
Original publication
Adrian Mirza, Luc Patiny, Kevin Maik Jablonka; "End-to-end multimodal structure elucidation from raw spectra combining contrastive learning and evolutionary algorithms"; Nature Communications, Volume 17, 2026-6-5