Princeton Data Science Club: DataDebut (Part One)

Monday, Apr 26, 2021
by Sharon Adarlo

With the rise of machine learning and data science as essential tools across the Princeton University campus, a group of students has gathered to revive the Princeton Data Science (PDS) club as a way to provide more opportunities to learn about this discipline.

The club, sponsored by the Center for Statistics and Machine Learning (CSML), has been around for about four years. According to club organizers, it recently reorganized to better meet students' needs for exposure, education, and networking opportunities in data science.

"Our mission is to promote data science on campus in all ways possible. We want to introduce students to the exciting world of data science and show how they can start applying these techniques to problems they care about," said Arjun Mani, PDS president.

The club has 16 officers, more than 400 students who subscribe to its active listserv, and three programming branches: DataDebut, Data@, and DataDev.

  • DataDebut organizes student-led workshops and professor talks to introduce methods and applications of data science.  
  • Data@ invites guest speakers from industry and outside research groups to share how they use data science in their work. 
  • DataDev organizes, trains, and hosts student teams to compete in data science and machine learning competitions and work on collaborative projects.

In this first of three short articles about PDS, we feature DataDebut. In subsequent articles, we will feature Data@ and DataDev.

Nab Kar - a senior in the Department of Operations Research and Financial Engineering (ORFE), PDS treasurer, and team lead for DataDebut - said the club offers introductory data science workshops to students at least once a month. The workshops do not require a technical background. Anyone with an interest in data science is welcome to attend.

"They have been fun to host, and the workshops have had a terrific turnout," said Kar. "It's been great to engage with students and teach them something I'm passionate about."

The workshops offer an introduction to data science and give students an opportunity to work on real-world, hot-button issues. Past workshops have involved modeling the spread of COVID-19, analyzing the California wildfires, and studying gerrymandering.

The second initiative within DataDebut is a speaker series featuring Princeton professors, said Laura Fang, a sophomore in ORFE and a DataDebut executive team member.

"The main goal of this series is to introduce students to the different ways of applying data science in various atypical fields," said Fang. "When people think of data science, they usually think of computer science, math, or physics. And one of the goals we had for this initiative was to show people that data science can be applicable in many other ways."

A past talk featured Michael Oppenheimer, the Albert G. Milbank Professor of Geosciences and International Affairs and the High Meadows Environmental Institute. Oppenheimer is known for his research on the potential for "dangerous" outcomes of increasing greenhouse gas levels. He explores the effects of global warming on ice sheets and sea level, the risk from coastal storms, and human migration patterns.

"Professor Oppenheimer gave a great talk on how he uses data science and machine learning to examine the effects of extreme heat, cyclones, and migration patterns," said Fang.

Future speakers include Uri Hasson, professor of psychology and neuroscience. Hasson uses data science to analyze brain patterns.

"We hope that more people will join us, learn more about data science, and see how it impacts our daily lives," said Kar. "We want a greater cross-section of students to be informed about data science."

For more information on Princeton Data Science, go to the club website.