As machine learning gains traction in research and industry, one subfield, deep learning, has emerged as a hot area of interest due to rapid development and research, according to Boris Hanin, an assistant professor in Princeton University’s Department of Operations Research and Financial Engineering.
Deep learning, to put it simply, is programming computers to learn how to make decisions using structures modeled on the human brain, and is “perhaps the most dynamic part of modern machine learning,” said Hanin. “The field is moving so fast that it can be hard to distinguish exceptional research amidst the glut of new scholarship out there,” he added.
To train the next generation of talent and amplify the best theoretical research in deep learning, Hanin and a group of senior academics and industry practitioners from major tech companies taught and presented seminars to a cohort of graduate students, postdoctoral fellows and professors at the “Deep Learning Theory Summer School” at Princeton University. The event, held from July 27 to August 4, was sponsored by the Center for Statistics and Machine Learning, DataX and the Department of Operations Research and Financial Engineering.
Interest in the summer school was unexpectedly high, with approximately 475 applications from professors, postdocs, doctoral students, master’s degree students, and undergraduates. This year, over 90 percent of the applicants were from math, physics, computer science, statistics, and electrical engineering programs, said Hanin. Approximately 150 applicants were accepted into the summer program.
Next year’s summer school is scheduled to be held in person, with applications opening in February or March 2022, said Hanin.
“We are pleased that the summer school was popular and proved useful to so many people,” said Peter Ramadge, CSML director. “In order to advance the field of deep learning, it’s essential that there is a sharing of knowledge and active collaboration in the data science community, and the summer school is an important step towards accomplishing that objective.”
Attendees of the seven-day summer school took three main courses: “Modern Machine Learning and Deep Learning Through the Prism of Interpolation,” “Deep Learning: a Statistical Viewpoint,” and “Effective Theory of Deep Learning: Beyond the Infinite-Width Limit.”
Each course had assigned pre-readings, such as scholarly papers and chapters from textbooks, and featured problem sets and TA sessions. The main feature of each course was a series of five one-hour lectures on specific topics.
Instructors came not only from academia but also from Facebook, Google and Salesforce.
Varun Khurana, a doctoral student at the University of California, San Diego (UCSD), enjoyed the summer school and said he appreciated the mix of theoretical and industry points of view.
“Although we had theory-based lectures, which cleared a lot of the misconceptions I had about the mathematical setup of deep learning, the summer school had a large number of industry professionals lecture about their findings and the different research directions of deep learning,” he said. “I got a much deeper understanding of the field and the myriad directions that I could possibly pursue.”
Misha Belkin, a professor at the Halıcıoğlu Data Science Institute at UCSD, taught the first course, on modern machine learning.
“My lecture was about building a theoretical foundation for a number of remarkable mathematical phenomena observed in deep learning practice,” said Belkin. “The approach I took was based on the idea of interpolating models, i.e. statistical models that fit training data exactly.”
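To make the idea concrete: an interpolating model is one whose training error is exactly zero. A minimal sketch (illustrative code with synthetic data, not material from the course) is a polynomial with as many coefficients as there are training points, which can pass through every point:

```python
import numpy as np

# Five noisy training points (synthetic data for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 5)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(5)

# A degree-4 polynomial has 5 coefficients, so it can pass
# through all 5 points exactly: an interpolating model.
coeffs = np.polyfit(x, y, deg=4)
train_pred = np.polyval(coeffs, x)

# Training error is zero up to floating-point round-off.
print(np.max(np.abs(train_pred - y)))
```

The surprise Belkin refers to is that, contrary to classical intuition about overfitting, models that interpolate noisy data in this way can still generalize well in practice.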
Belkin continued, “My goals were two-fold: I aimed to acquaint students with some recent progress toward mathematical foundations of deep learning and elucidate what is known and what is not. And I hoped to intrigue students and encourage them to engage with deep learning research.”
Ethan Dyer, a researcher at Google X, gave a lecture on neural scaling laws, a topic he and his co-authors explored earlier this year in the paper “Explaining Neural Scaling Laws.”
“The focus was understanding the properties of neural networks as one scales the model and dataset size,” Dyer said about his talk.
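Neural scaling laws describe the empirical observation that test loss often falls as a power law in model or dataset size, roughly L(N) ≈ a·N^(−b) + c. As a hedged sketch (synthetic numbers chosen for illustration, not results from Dyer’s paper), the exponent b can be read off as the slope of the curve on log-log axes:

```python
import numpy as np

# Synthetic losses following an assumed power law L(N) = a*N^-b + c
# (illustrative parameters, not results from the paper).
a, b, c = 5.0, 0.5, 0.1
N = np.array([1e3, 1e4, 1e5, 1e6, 1e7])  # model sizes
loss = a * N ** -b + c

# After subtracting the irreducible loss c, the log-log plot of
# loss vs. N is a straight line whose slope is -b.
slope, intercept = np.polyfit(np.log(N), np.log(loss - c), 1)
print(round(-slope, 2))  # recovers b = 0.5
```

In practice, fitting such curves to measured losses is what lets researchers extrapolate how much a larger model or dataset is likely to help.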
Though the summer school was jam-packed with information, Khurana said he greatly enjoyed the lectures and felt they enhanced his education.
“I wanted to get a good overview of where the state-of-the-art research is and what the most interesting and exciting questions to answer are,” he said. “Seeing all the information laid out for us in a short time period forced me to create a cohesive and overarching narrative about deep learning theory.”