Machine learning technologies are increasingly used to analyze complex, ambiguous situations such as the spread of diseases or the behavior of financial markets, but some of these algorithms falter when they encounter new data or an altered environment, or when hidden biases come to the surface.
In machine learning, the term “robustness” refers to how reliably an algorithm performs. Ideally, an algorithm’s performance should not suffer when it faces new data rather than its training data. To increase the robustness of, and trust in, these technologies, Mert Gurbuzbalaban, a visiting fellow at the Center for Statistics and Machine Learning (CSML) at Princeton University for the 2022-2023 academic year, is exploring robustness in machine learning as part of his overall research agenda.
“One of today's big challenges for machine learning is that our current algorithms are not robust,” said Gurbuzbalaban, an associate professor at Rutgers University’s Department of Management Science and Information Systems. “For example, if you build an algorithm to predict the financial markets, such as the stock price of a company, the model could be based on the last six months of data. But what happens if the pandemic ends, there is a new vaccine, or it’s summer and people go on vacation? The nature of the data changes. Then the current models we have can give you incorrect results.”
“Basically, if the nature of the data changes a little bit, a current machine learning algorithm can fail,” he said. Another example is MRI scanning. In this case, factors such as the type of equipment or the angle at which the device is used can impact an algorithm’s performance.
To address these performance issues, Gurbuzbalaban said, one needs distributionally robust machine learning, which, by design, is less fragile. Gurbuzbalaban has been looking at this technology and is discussing possible collaborative projects on the topic with several Princeton researchers.
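To make the idea concrete, here is a minimal sketch of one common formalization of distributional robustness. The data and loss values are synthetic, and the specific robust objective shown, Conditional Value-at-Risk (the average of the worst fraction of per-sample losses), is one standard choice from this literature, not necessarily the formulation Gurbuzbalaban uses. Ordinary training optimizes the average loss; a distributionally robust objective instead guards against unfavorable shifts in the data distribution.

```python
import numpy as np

# Hypothetical per-sample losses for a trained model.
rng = np.random.default_rng(0)
losses = rng.exponential(scale=1.0, size=1000)

def empirical_risk(losses):
    """Ordinary objective: average loss over the observed data."""
    return losses.mean()

def cvar_risk(losses, alpha=0.1):
    """Robust objective (CVaR): average of the worst alpha-fraction
    of losses, i.e. performance under an adverse reweighting of the data."""
    k = max(1, int(alpha * len(losses)))
    worst = np.sort(losses)[-k:]
    return worst.mean()

print(empirical_risk(losses))  # average-case performance
print(cvar_risk(losses))       # larger: performance under unfavorable shifts
```

A model chosen to minimize the robust objective typically trades a little average-case accuracy for far better behavior when the data distribution drifts, which is exactly the failure mode described above.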
During his time at CSML, Gurbuzbalaban also hosted two lunch and learn seminars. On November 21, 2022, he gave a talk titled “Robust and Risk-Averse Accelerated Gradient Methods.” For another lunch and learn seminar on November 28, he invited Yassine Laguel, a postdoctoral researcher at Rutgers and one of his advisees, to give a talk titled “Robustness for Models and Algorithms in Machine Learning.”
Robustness is just one aspect of Gurbuzbalaban’s research. His primary focus has been on “optimization and computational science driven by applications in large-scale information, decision and infrastructure systems. My work draws and extends ideas and tools from convex optimization, probability, and the control of systems,” Gurbuzbalaban said.
“I'm an optimizer. I am interested in developing optimization algorithms. But in terms of both theory and practice and as an application, I am interested in optimization with a focus on big data challenges,” Gurbuzbalaban elaborated.
“Traditionally, optimization algorithms are not designed for big problems. Maybe you have a few parameters - for example, optimizing a budget. But in modern data science, we have problems such as translating text to another language. This is a problem with many parameters and requires large-scale models. Traditional optimization methods do not scale well to problems of this size. So that leads to my research focus: developing scalable optimization models.”
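The scalability point can be illustrated with a small sketch. The example below is a toy least-squares problem with synthetic data, not drawn from Gurbuzbalaban’s work: where a traditional solver would touch every data point per step, stochastic gradient descent (SGD), a workhorse of scalable optimization, updates using a small random minibatch, so the cost of each step stays constant as the dataset grows.

```python
import numpy as np

# Toy least-squares problem: recover w_true from noisy observations y = X @ w_true + noise.
rng = np.random.default_rng(1)
n, d = 10_000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.01 * rng.normal(size=n)

# Minibatch SGD: each step uses only `batch` rows of X,
# so per-step cost does not grow with n.
w = np.zeros(d)
lr, batch = 0.1, 32
for _ in range(2000):
    idx = rng.integers(0, n, size=batch)              # sample a random minibatch
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch   # minibatch gradient estimate
    w -= lr * grad                                    # gradient step

print(np.linalg.norm(w - w_true))  # small: SGD recovers w_true despite never
                                   # computing a full-data gradient
```

The same structure is what makes training large models feasible: accuracy per step is traded for steps that are cheap enough to take millions of times.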
For his current stay at Princeton, Gurbuzbalaban has high hopes of collaborative research possibilities. Peter Ramadge, the CSML director, and the staff have been very helpful, he added.
“CSML is close to many different departments,” said Gurbuzbalaban. “The data science community is engaging, and there are many seminars you can attend. There is a lot of exciting research happening on campus, and that inspires me as an academic.”