DataX – Julian Gold: making the leap from pure mathematics to computational biology

Jan. 5, 2024

Julian Gold had spent most of his academic research career in the world of pure mathematics. 

As a researcher, Gold used probability theory, an area of mathematics concerned with studying chance events or random phenomena. In particular, he focused on understanding the properties of mathematical objects that are characterized by disorder and randomness.

Though intrigued by the problems in his field, over time Gold found himself drawn to the connections between probability theory and data science, machine learning, and biology.

Now, Gold is a data scientist at Princeton University, where he is applying his background in pure mathematics and probability to computational biology and bioinformatics – fields that use computational methods to analyze enormous and complex data sets. Through Princeton's Schmidt DataX Initiative, he is part of a team developing tools for understanding growing tissue.

“Going from probability theory to computational biology may seem like a major leap,” said Gold, who started at Princeton in the spring semester of 2023. “Probability theory is a toolkit for studying randomness, and for handling incomplete information. The biological processes that give rise to our data are intrinsically random, and even our most careful measurements produce very sparse data.”

From Math to Mapping Growing Cells

As part of DataX, Gold works closely with Ben Raphael, a professor of computer science and a leading researcher in computational biology. Together with Ph.D. candidates Peter Halmos and Xinhao Liu, they develop methods for integrating spatial transcriptomics data. Spatially resolved transcriptomics (SRT) is a new technology that allows researchers to chart gene expression on a per-cell basis while also recording the spatial locations of those cells within a given tissue slice.

Most recently, Gold’s team introduced a tool that takes tissue slices and aligns them in a time series by mapping ancestor cells to their descendant cells. “We can use our tool to quantify the growth rates of different cell types,” said Gold. “We can even use it to infer these cell types if they’re unknown.” Such tools could be applied to healthy tissue to better understand development, or to a tumor sample to inform treatment. Earlier projects from Raphael’s lab include algorithms that scan cancer genomes to find aberrations.

Inferred patterns of tissue growth in a mouse kidney. Image courtesy of Julian Gold and Raphael Lab using data courtesy of the Ding Lab at Washington University School of Medicine in St. Louis, Division of Oncology, and the McDonnell Genome Institute.

Princeton announced the creation of DataX in February 2019 with the aim of spreading and deepening the reach of data science and machine learning across campus. Its mandate includes hiring data scientists to participate in three research areas, including biomedical data science, the area in which Gold works. The Department of Computer Science, the Lewis-Sigler Institute for Integrative Genomics (LSI), the Princeton Neuroscience Institute, and several engineering departments take part in DataX’s biomedical science initiative.

Before coming to Princeton, Gold earned his bachelor’s degree in mathematics from the University of California, Davis. He then moved to the University of California, Los Angeles (UCLA), where he completed his doctoral training in mathematics under Prof. Marek Biskup. His thesis concerns statistical mechanics and random networks. Even in Gold’s thesis work, there was a hint of what was to come. “Those models exhibited spatial correlations akin to those we wish to detect in spatial transcriptomics data,” Gold explained.

After obtaining his Ph.D., Gold was hired as an NSF postdoctoral research fellow at Northwestern University, where he was mentored by Antonio Auffinger, a mathematics professor and an investigator at Northwestern’s NSF-Simons Center for Quantitative Biology. Working under Auffinger, Gold studied spin glasses, mathematical objects that traditionally model non-magnetic alloys interspersed with a small fraction of magnetic atoms. Gold said spin glasses share two important features with genomic data: they are high-dimensional and they are complex.

In addition to research, Gold taught classes for Northwestern undergraduate and graduate students, co-organized the Northwestern Probability Seminar, and developed an introductory math course for Northwestern’s Prison Education Program.

“I am very enthusiastic about my role at Princeton,” he said. “This is a place where, historically, connections between people in seemingly disparate fields have driven transformative research. The position enables me to collaborate with and learn from great people, all in the process of contributing to a dynamic and competitive field.”