DataX - Andrzej Żurański: using machine learning to study chemical reactions

Written by
Sharon Adarlo
May 29, 2020

Though he only joined Princeton University as a data scientist in November, Andrzej Żurański is already involved in some exciting research on campus: applying machine learning techniques to explore the mysteries underpinning chemical reactions. His contributions are aimed at discovering efficient methods for the synthesis of new, innovative compounds that have the potential to be used in medicine and industrial applications. 

“In our projects, we want to push machine learning to its limits and perhaps help inform the chemistry world on what might be happening in these reactions,” said Żurański. “It’s interesting, gratifying work.”

Żurański, who is originally from Poland, grew up training to be a classical pianist and considered a professional career in music, but decided on a last-minute whim to go into STEM. He earned his bachelor’s and master’s degrees in applied physics from Jagiellonian University in Krakow, Poland. Founded in 1364, Jagiellonian is the oldest institution of higher education in Poland.

Żurański left for Princeton in 2008 to study for a doctoral degree in elementary particle physics. As a graduate student, he worked on CERN’s Large Hadron Collider, the largest and most powerful particle accelerator in the world. It was also during this period that he began exploring machine learning and data analysis, he said.

After earning his Ph.D. in 2014, Żurański joined Princeton Consultants, an information technology and management consulting company. There, Żurański performed data analysis and built statistical models for transportation industry clients until his recent hire as a data scientist under Princeton’s DataX initiative. 

Żurański’s addition to Princeton’s research community is supported by the Schmidt DataX Fund, which aims to spread and deepen the use of artificial intelligence and machine learning across campus to speed scientific discovery. Princeton announced the fund, a gift of Schmidt Futures, in February 2019, with the Center for Statistics and Machine Learning (CSML) playing a role in shepherding the initiative. The fund will ultimately support six Schmidt Data Scientists, Żurański among them.

The mission of the data scientists is to lead the development and improvement of data analysis methods that enable more impactful research in three areas on campus: the Princeton Catalysis Initiative, the biomedical data science initiative, and the Center for Information Technology Policy (CITP).

Żurański is part of the Princeton Catalysis Initiative and is assigned to the lab of Abigail Doyle, the A. Barton Hepburn Professor of Chemistry. Doyle’s lab focuses on developing new ways to create useful molecules and on using conventional, inexpensive catalysts for reactions.

“Princeton has been an innovator in synthetic chemistry research and in the development of new methodologies in machine learning and statistics,” said Peter J. Ramadge, CSML director. “The addition of Żurański, who has both academic and industrial expertise, helps strengthen the link between these traditions.”

“The students see me as a resource for technical knowledge or experience with programming and machine learning,” said Żurański about his role in Doyle’s lab.

In addition to supporting data scientists like Żurański, DataX is funding a variety of research initiatives through the Center for Statistics and Machine Learning. These initiatives include support for new research projects, developing a graduate-level course in data science and machine learning, supporting research workshops, and training researchers.

Under the Princeton Catalysis Initiative, Żurański is helping model the reactivity of a Suzuki reaction, a chemical reaction involving organic compounds and a metal catalyst, palladium, which is one of the rarest elements on Earth. Żurański said the team is building a machine learning model to help predict ideal conditions for a Suzuki reaction.

Żurański is also analyzing various synthetic molecules and computing their properties and descriptors for inclusion in a database. This effort, Żurański said, would help streamline experiments to make new compounds, possibly reducing lab time and expense.

“The mechanisms underneath these reactions are not very well understood,” said Żurański. “People like to hypothesize how things happen and create experiments to see if it’s true, but that process can be very labor-intensive.”

Machine learning tools can model those reactions and predict optimal conditions and results, he said. They can guide chemists in optimizing these reactions and elucidating the mechanisms behind them, while cutting down the number of dead-end experiments.

So far, Żurański has been relishing his time working on these innovative projects in the lab with researchers and students, and he is excited about pushing forward machine learning’s possibilities in synthetic chemistry at Princeton.

“I was missing an academic environment,” said Żurański. “Here, on campus, we are working on cutting-edge research and trying our best to make breakthroughs that we can share with others in the field. There is similar high-quality analysis happening all over the campus, which makes being here both fun and exciting.”