As an increasing number of research groups across the disciplines find themselves relying on data-driven predictive insights only possible through machine learning applications, a nagging problem has arisen.
In many cases, when graduate students or post-docs develop code, they are focused on getting their own research finished and published as quickly as possible. When they inevitably depart, their understanding of the software leaves with them. This cycle sets the stage for technical debt, the term used to describe how a programmer’s design short-cuts and lack of documentation can cause problems (especially a lack of modularity and maintainability) to crop up down the road.
The solution? “We feel that continuity is key,” explains Ian Cosden, Manager of Princeton’s Research Software Engineering group. “We realized that having a dedicated software engineer in a permanent position ensures that an in-depth understanding of important code bases will not be lost when the initial developer leaves.”
Enter Vineet Bansal. A programmer with fifteen years of experience in all phases of the software development process, Bansal has joined Cosden’s group in a new position jointly sponsored by OIT’s Research Computing department, the Princeton Institute for Computational Science and Engineering (PICSciE), and the Center for Statistics and Machine Learning (CSML).
“Researchers at Princeton University work at the cutting edge in their fields, and I simply couldn't pass up the opportunity to play my part in this amazing journey of scientific discovery,” Bansal explains. “I'm excited to potentially collaborate on a number of projects. For example, Barbara Engelhardt’s group has written algorithms that search for connections between variations in genomic data and higher-level human traits such as height, with the long-range goal of identifying the specific markers controlling the mechanisms of disease. The better the code, the faster it runs, and the greater the chances are of finding the needle in the haystack, so to speak.”
Regardless of the particular research group’s focus, says Bansal, his aim will be to develop new codebases that are well documented and reliable, and that scale well to the massive computing resources available on campus. “This is something that also applies to existing software; making it robust and amenable to widespread adoption is a top priority.”
Says Ian Cosden, “Vineet’s involvement in programming endeavors will dramatically increase each project’s institutional memory and bring software industry ‘best practices’ to research groups.” The end result of having more robust, maintainable, and extensible software, he adds, is that researchers will have access to computational tools that can be cornerstones of their work for years to come.
The variety of projects was a huge draw for him when considering the position, notes Bansal. “Princeton Research Computing provides and nurtures a great environment where software developers and analysts work collaboratively to bring the most advanced computing power to the Princeton community. CSML is a melting pot for researchers across disciplines who are pushing the boundaries of both the theory and applications of machine learning across a broad range of areas. I could be working with a neuroscientist one semester and a geologist the next. What's not to love?”
An ardent tennis player who loves world travel (“New Zealand is next!”), Bansal says one thought guides him when he writes software: “I’m constantly thinking,” he explains, “about how someone is going to use it five years in the future.”