It was a productive winter break for attendees of Princeton University’s virtual two-week computing bootcamp, which had over 300 registrants and covered a wide range of subjects, from foundational programming-language skills to high performance computing.
This was the first time the Princeton Research Computing Bootcamp, in its third year, was expanded to two weeks after growing in popularity since its inception in 2018 as a four-day event. The bootcamp, which was held from January 19 to 29, drew mostly graduate students and postdocs but also a few undergraduates and faculty members. Participants hailed from 37 departments, including Princeton Plasma Physics Laboratory and the Institute for Advanced Study.
Princeton Institute for Computational Science & Engineering (PICSciE), OIT Research Computing, and the Center for Statistics and Machine Learning (CSML) sponsored the bootcamp. The first bootcamp had about 80 participants while last year’s event had 130 registrants.
“I see the bootcamp as a critical part of the education and training of our graduate students and postdocs,” said Jeroen Tromp, director of PICSciE and Blair Professor of Geology and Professor of Geosciences and Applied and Computational Mathematics. “This is where they get exposed to best practices in research computing. What they learn and become aware of during this bootcamp will be of tremendous long-term value for the rest of their careers as researchers. I think most students sense the importance of acquiring these skills, which is why the camp is so popular.”
The first week was spent on the essentials of research computing with short courses on topics such as Linux, cloud basics, programming languages such as Python, data visualization, and research software engineering. The second week focused on high performance computing with courses on parallel programming and floating point arithmetic. The week ended with an all-day session on GPU programming on the NVIDIA platform on Thursday and then an all-day session on the fundamentals of deep learning on Friday.
Some bootcamp sessions were delivered by experts from Intel and NVIDIA, but most were volunteers from the broader Princeton community. Local instructors included a mix of faculty members; personnel from PPPL, PICSciE and CSML; data scientists from Princeton’s DataX initiative, parts of which are overseen by CSML; and even one undergraduate MATLAB Student Ambassador.
“We're trying to meet the researchers basically where they are,” said Gabe Perez-Giz, a research software and computing training specialist at PICSciE, on why the bootcamp has become popular. “We saw a pretty diverse range of people, who had varying levels of skill and experience, because we've expanded the events and started to offer some sessions at different levels, like more entry level courses and then more intermediate level courses. We're really seeing a gamut of people from across campus from different departments and from different stages of their careers.”
Adelle Wright, a postdoctoral fellow at Princeton Plasma Physics Laboratory, said she found the bootcamp helpful. Wright’s research involves performing simulations and developing new theories to understand the macroscopic behavior of plasmas in magnetic confinement fusion.
“Advanced numerical simulations and high performance computing are essential components of modern fusion science and plasma physics research,” said Wright. “So, apart from the mechanical skills, understanding the design philosophies and strategies which underpin best practice in modern day computing and software development was among the most valuable take-aways from the bootcamp.”
Zachary Atkins, a graduate student in the physics department, also found the bootcamp helpful, and eye-opening because it showed him more of the resources available on campus. Atkins’ research focuses on cosmology experiments, mainly data analysis for the Atacama Cosmology Telescope.
“I'd previously gained most of my programming and cluster-use experience informally. Besides a couple computer-science courses in undergrad, I've had to pick up my skill set on the fly. I was looking for a dedicated space that would more formally teach me the real, down-to-earth skills and best practices for being a computational researcher,” said Atkins.
“It was exciting to learn more about and experience the amazing computational support staff at the University. It's really a privilege to have these dedicated people and resources available to us,” added Atkins.
Due to the virtual format, it was hard to say which parts of the bootcamp the students were most engaged with, according to a few of the instructors.
Daisy Huang, a CSML lecturer who taught two bootcamp courses, one on data visualization and the other on reproducible reports with R Markdown, said she missed the in-person interactions that a traditional classroom offers, but a plus was that “the audience can see my computer screen more easily--this was particularly helpful for coding workshops.”
Instructors tried to keep the courses lively with various tactics. Vineet Bansal, who’s jointly appointed as senior research software engineer to PICSciE and CSML, said he broke up his class at regular intervals with short exercises as well as quizzes. Bansal taught an introductory class on NumPy, the underlying numerical library used extensively in Python.
Carolina Roe-Raymond, visualization analyst at PICSciE, taught a course on how to make effective plots. She said participants seemed most engaged by the graph types most closely associated with the ability to interpret data accurately.
Jonathan Halverson, research software and programming analyst at PICSciE, taught an introductory class on software testing. “They were most engaged over how the quality of their production software improves by writing tests. Testing gives the developer more confidence and it promotes writing robust and more readable code,” he said. “There was also excitement about pushing code changes to a GitHub repository and having the tests run automatically so that it is known that the new code did not introduce bugs.”
Brian Arnold and Andrzej Zuranski, data scientists with DataX, taught an introductory course on data analysis using the R programming language. Their students ran the gamut of disciplines, from sociology to biological engineering.
“It’s not surprising to see such diversity because anybody doing research, even bench scientists, will want to analyze their data at some point,” said Arnold.
“Because it’s an introductory course, we felt they overcame the challenge of how to start programming and how to make code that does something useful,” said Zuranski. “We did not go into advanced subjects but they saw how they could do simple data analysis with just a few simple steps.”
As for the future, the organizers are planning to do a post-mortem on the bootcamp’s scheduling and to look at the possibilities for future iterations of the workshops.
“There's definitely been interest from some departments to have a 'lite' version of bootcamp in August,” said Perez-Giz. “Going forward, we may separate the ‘fundamentals’ and ‘high performance computing’ weeks into two separate events at different times of the year.”