We recently caught up with CSML Independent Work Prize co-winners Advait Chauhan (’17 Computer Science) and Chitra Marti (’17 Economics) to chat about life after graduation and how their undergraduate certificate in statistics and machine learning has shaped their paths.
Tell us what you’re each doing in the months since graduating.
CM: I’m working as a research analyst at the Federal Reserve Bank of San Francisco. Economics research relies heavily on statistics, and it’s also moving toward incorporating more machine learning techniques. I use my stats knowledge on a daily basis; I’ve recently worked on using machine learning techniques to understand the composition of the labor force and on projecting impulse responses from vector autoregression models. I hope to eventually pursue a Ph.D. in Economics.
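For readers unfamiliar with the technique, here is a minimal sketch, not drawn from Chitra’s actual work, of fitting a vector autoregression and computing impulse responses with statsmodels; the data and series names are purely illustrative.

```python
# Minimal illustration on synthetic data: fit a VAR and compute impulse responses.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(0)
# Two made-up series; in practice these would be real labor-market indicators.
data = pd.DataFrame(
    rng.normal(size=(200, 2)),
    columns=["employment_growth", "wage_growth"],
)

model = VAR(data)
results = model.fit(maxlags=4, ic="aic")  # pick the lag order by AIC
irf = results.irf(10)                     # impulse responses over 10 periods
print(irf.irfs.shape)                     # (periods + 1, n_vars, n_vars)
```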
AC: I’m working as a growth software engineer at Facebook. Generally, a large part of growth involves experimenting with various initiatives and evaluating their impact on target metrics, the latter of which relies on statistics. Machine learning is used throughout to optimize this process, for example to help determine which growth initiatives each user should receive.
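As a rough illustration of the statistics behind evaluating such experiments, and not Facebook’s actual tooling, a two-proportion z-test on made-up counts might look like this:

```python
# Hypothetical A/B readout: did the treatment move a conversion-style metric?
from statsmodels.stats.proportion import proportions_ztest

successes = [5200, 5560]     # conversions in control vs. treatment (made up)
samples = [100000, 100000]   # users exposed in each arm (made up)

z_stat, p_value = proportions_ztest(count=successes, nobs=samples)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```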
Congrats on winning the CSML Independent Work Prize earlier this year. Why did you choose your particular research topics?
CM: I basically picked my data before I picked my topic! A friend and I had entered the Yelp Dataset Challenge where Yelp publishes a huge dataset and asks people to submit interesting findings. We ended up taking a social scientific perspective, looking at how users might be motivated or not to write Yelp reviews based on what others had written before them. That idea blossomed into my independent work. Yelp has a literal 1-percent elite group of highly active users; I wanted to learn whether these “Yelp Elite” reviewers have an outsized impact on other users. It turned out that a category of users right below the top—a group I called the “Almost-Elite”—were actually more influential than the Elite.
AC: Abstractive summarization is essentially the task of paraphrasing a sentence: converting it into a new, typically shorter sentence that doesn’t necessarily contain the same words as the original. I noticed that attention-based sequence-to-sequence models, such as the one that recently achieved groundbreaking results in translation at Google Translate, seemed naturally well suited to abstractive summarization but hadn’t been as thoroughly explored in this domain, so I saw an exciting opportunity.
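As a rough, self-contained sketch of the attention idea his project built on, and not his actual model, the snippet below computes a context vector for a single decoder step from made-up encoder states:

```python
# Minimal sketch of dot-product attention for one decoder step.
import torch
import torch.nn.functional as F

def attention_context(decoder_state, encoder_states):
    """decoder_state: (hidden,); encoder_states: (src_len, hidden)."""
    scores = encoder_states @ decoder_state   # similarity with each source position
    weights = F.softmax(scores, dim=0)        # attention distribution over source tokens
    context = weights @ encoder_states        # weighted sum of encoder states
    return context, weights

# Toy example: random "encoder outputs" for a 6-token source sentence.
enc = torch.randn(6, 128)
dec = torch.randn(128)
context, weights = attention_context(dec, enc)
print(weights.sum())  # the weights sum to 1
```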
What role did your advisor play in your successful research project?
CM: My advisor, Ilyana Kuziemko of the Department of Economics, helped me refine my question into one that was feasible yet interesting; I had several disjointed ideas about how to proceed, but she showed me how an event-time analysis suited my questions best. She guided me through all the econometric analysis, and she also helped me think about control variables and how to construct my panel so I wasn’t double-counting reviews.
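For readers unfamiliar with the method, an event-time (event-study) analysis regresses an outcome on indicators for periods relative to an event; a minimal sketch on made-up data, not her actual Yelp panel, could look like this:

```python
# Minimal event-time sketch: regress an outcome on dummies for periods
# relative to a (hypothetical) event.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_users, periods = 300, 11
df = pd.DataFrame({
    "user": np.repeat(np.arange(n_users), periods),
    "event_time": np.tile(np.arange(-5, 6), n_users),  # periods relative to the event
})
# Fake outcome: a small jump after the event, plus noise.
df["reviews_written"] = 0.3 * (df["event_time"] >= 0) + rng.normal(size=len(df))

# Omit event_time == -1 as the reference period, as is conventional.
model = smf.ols("reviews_written ~ C(event_time, Treatment(reference=-1))", data=df).fit()
print(model.params.filter(like="event_time").round(3))
```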
AC: I was very fortunate to have Sandra Batista in the Computer Science Department advising me. We met every week to discuss my project, and she was very helpful in getting me past various roadblocks and pointing me to new resources. For example, when I mentioned that the evaluation metric used across the abstractive summarization literature was giving me trouble, she suggested I explore its weaknesses in more detail, a suggestion that shifted the focus of my research. She was supportive throughout the entire project, and I’m so grateful for that.
Now that you’re out of college, do you have any reflections on the ways in which your Princeton education shaped your relationship with what some are calling the big data technology revolution?
CM: My current job as a research associate basically requires me to keep up with new trends and research in economics. In general, economics has been moving toward incorporating more sophisticated statistics and machine learning techniques. I’m grateful that I was able to take so many different classes at Princeton related, tangentially or directly, to machine learning and big data. In particular, I’m glad I was able to take so many applied classes, where we could use what we were learning empirically.
AC: I got to take a lot of fundamental statistics, machine learning, and AI coursework at Princeton, a foundation I’m very happy to have. I learn a lot about what’s up-and-coming in technology and big data from friends and people around me; living in San Francisco is great for that!