Data Scientists in Biomedical Data Science - Schmidt Data X Project

D-19-COS-00006 | Computer Science | Specialist 

Schmidt Data X Project. Princeton seeks six Data Scientists, in three-year term positions, to create and improve data-analysis software to operate at large scale, leading to faster discovery, wider impact, and greater continuity.  The Data Scientists will be part of a larger research team composed of faculty, post-doctoral researchers, and graduate students (two Data Scientists per area). The Data Scientists will report to lead faculty members for each project.

The three targeted areas of research (Catalysis Initiative, Biomedical Data Science Initiative, Center for Information Technology Policy) cut across a range of departments and interdisciplinary institutes on campus, maximizing the reach of the Data Scientists.

Biomedical Data Science: The Biomedical Data Science Initiative is spearheaded by the Department of Computer Science, with strong connections to the Lewis-Sigler Institute for Integrative Genomics, the Princeton Neuroscience Institute, and several other engineering departments. Analyzing biomedical data at scale remains challenging, requiring new algorithms, software-analysis pipelines, standard interfaces, protection of sensitive user data, and better ways to leverage public and private cloud computing resources.

Specific responsibilities of the two Data Scientists in this group include a combination of the following:

  • Develop novel algorithms, machine learning approaches, and statistical techniques, and apply these methods to large repositories of biomedical datasets.
  • Develop robust, scalable and user-friendly software packages implementing novel methodology for use by external biomedical researchers.
  • Implement a shared infrastructure to facilitate seamless access to massive datasets, including analysis pipelines to standardize and normalize large public/private data sets.
  • Develop interfaces and virtual machines to access and analyze data on public and private cloud computing platforms.

While the specific responsibilities will vary by research project, all Data Scientists will create opportunities to educate, train, convene, and support a broad community of researchers on campus in how to best leverage data science in their research and teaching.  They will also contribute to new graduate-level courses on data science as well as mini courses, workshops, and office hours.  In all three areas, the Data Scientists must demonstrate expertise in researching, designing, and implementing algorithms and techniques to exploit the connections between data analysis/machine learning and the fundamental research questions explored by each group.


  • PhD required in computer science, data/computational science, or a related disciplinary field, or an equivalent combination of educational training and relevant experience; advanced degree in a disciplinary field (chemistry, life sciences, engineering, social sciences) strongly preferred
  • 5-9+ years working in a data analysis/scientific computing role required
  • Knowledge of mathematical modeling and computational methods
  • Demonstrated experience applying AI and ML concepts and tools to research questions and projects, including modeling and simulation work
  • Strong coding and algorithm prototyping skills, as well as the ability to explain and document this work in accessible ways; expert knowledge of a general-purpose, dynamically typed, object-oriented language such as Ruby or Python
  • Proficiency in SQL, database design, and building data-driven web applications
  • Experience excelling in a highly collaborative, multi-disciplinary research environment
  • Experience determining strategy and executing interdisciplinary projects strongly preferred
  • Experience as a Principal Investigator preferred
  • Demonstrated innovative technical achievements and/or extensive managerial experience preferred

Immediate openings for qualified candidates.  Use this link to apply:

