Friday, May 31, 2019
by Sharon Adarlo
The “uhs” and “ums” that pepper conversational speech can seem rather unimportant. Transcribers don’t often write down these vocal interruptions when transcribing interviews or preparing quotes for the written page. But those filler words can carry a lot of meaning, according to Marie-Fee Breyer, a Princeton University senior.
These words and other hesitation sounds can convey context, emotion and meaning, said Breyer. Recording them can increase the effectiveness of automatic speech recognition programs and create “richer speech transcripts, as fillers can carry communicational value.”
Breyer, a computer science major, set out to create a neural network to recognize filler words, which she presented along with more than 60 students at the Center for Statistics and Machine Learning’s (CSML) annual poster session on May 14th at the Carl A. Fields Center. From Breyer’s inventive filler word detection program to projects exploring 3D printing, eviction rates across the country and the anonymization of data sets, the poster session was an exciting display of the creativity, rigor and diversity of research that the center has fostered.
The students’ independent projects on display were completed for CSML’s Undergraduate Certificate Program in Statistics and Machine Learning. A total of 62 students participated, coming from a wide cross-section of campus: computer science, operations research and financial engineering, sociology, politics and economics, to name a few.
Faculty members named Gene X. Li, class of 2019, the winner of the poster session for his project, “Learning Linear Dynamical Systems with Sparsity Structure.” Li majored in electrical engineering and his adviser is Yuxin Chen, an assistant professor of electrical engineering. An article featuring Li and his project is forthcoming.
The undergraduate CSML certificate program was started in 2013 and is open to all Princeton students who have an interest in the theory and applications of data science, statistics and machine learning. Last year, 60 students presented at the year-end poster session. The year before that, 35 students presented.
“It was a very dynamic event, very well-attended, lots of energy, lots of great work,” said Peter J. Ramadge, the CSML director, the Gordon Y.S. Wu Professor of Engineering and professor of electrical engineering. “There are more students and there’s a broader range of topics compared to last year. The students are using the tools they have learned during CSML-related courses and asking very interesting questions.”
Ryan P. Adams, CSML undergraduate director and computer science professor, also applauded the students’ efforts.
“It’s a very fun event. The poster session highlighted how students who enroll in the certificate can do their independent project, from studying social networks to fundamental research on machine learning,” he said. “There is a lot of creativity around the research topics and the ways people have found data sets.”
Breyer herself had to generate her own data sets for her project by culling interviews from NBC News and MSNBC.
“This project was great for me because it gave me the opportunity to work on the full lifetime of a project, from starting with no data to the final results,” she said. “It’s really cool to see an application used in real life situations. And it’s great to see that the skills I have learned the past four years be put to use.”
Nicholas Johnson, an operations research and financial engineering major in the class of 2020, set out to anonymize data sets while still utilizing them for downstream research, a common problem companies have encountered.
“This project is my first independent project and it has definitely been the most rewarding thing I have done,” said Johnson. “I definitely like the fact we have CSML on campus because it gives a framework to all the machine learning courses that are being taught across the various departments. Having CSML has been helpful too because they encourage student independent work.”
The CSML certificate has been directly helpful to Rachana Balasubramanian, a senior majoring in computer science. After graduation, she is taking a job as a machine learning developer at the bank Capital One. Her independent project used machine learning techniques to parse Yelp reviews and figure out which ones are the most helpful.
“It was a fun project and I learned so much,” she said. “I don’t think I would have taken as much machine learning classes if not for CSML. The certificate program definitely prepared me for the future.”