Language model-guided anticipation and discovery of unknown metabolites

Date
Nov 12, 2024, 12:00 pm - 1:30 pm
Location
Bendheim House 103

Speaker

Details

Event Description

Lunch is available beginning at 12 PM

Speaker to begin promptly at 12:30 PM

Abstract: Despite decades of study, large parts of the human metabolome remain unexplored. Mass spectrometry-based metabolomics routinely detects thousands of unidentified small molecules within human tissues and biofluids, but structure elucidation of novel metabolites remains a low-throughput endeavour. Here, we present an approach that leverages chemical language models to discover previously uncharacterized metabolites. We introduce DeepMet, a language model that learns the latent biosynthetic logic embedded within the chemical structures of known metabolites and exploits this understanding to anticipate the existence of as-of-yet undiscovered metabolites. Prospective synthesis of metabolites predicted to exist by DeepMet directs their targeted discovery. Integrating DeepMet with tandem mass spectrometry (MS/MS) data enables automated metabolite discovery within complex tissues. We demonstrate the potential for language models to accelerate the mapping of the metabolome by harnessing DeepMet to discover several dozen mammalian metabolites.

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

Sponsor
Center for Statistics and Machine Learning