Faculty Profile: Olga Russakovsky harnesses machine learning to detect human actions in videos

Monday, Jul 22, 2019
by Sharon Adarlo

Olga Russakovsky, Assistant Professor of Computer Science


Russakovsky is affiliated with Princeton University’s Center for Statistics and Machine Learning (CSML) and Princeton’s Center for Information Technology Policy.

She earned a bachelor’s degree in mathematics and, concurrently, a master’s degree in computer science from Stanford University in 2007. Her thesis was titled “Algorithms for Training Conditional Log-Linear Models.” She continued at Stanford and earned her doctoral degree in computer science in 2015 with a thesis titled “Scaling Up Object Detection.”

Before joining the computer science department at Princeton in 2017, Russakovsky was a postdoctoral research fellow at Carnegie Mellon University’s Robotics Institute and served as a research intern on NEC Labs America’s media analytics team. She has received MIT Technology Review’s 35 Innovators Under 35 award and was named one of Foreign Policy’s 100 Leading Global Thinkers. She has also been featured in more than a dozen articles, in publications ranging from The New York Times to Forbes.


Russakovsky’s research focuses on computer vision, a field that uses machine learning as its core technology.

“I am interested in developing artificially intelligent systems that are able to reason about the visual world,” she said.

In the realm of computer vision, people are most familiar with facial recognition, a classification protocol, but Russakovsky’s lab, the Princeton Visual AI Lab, has been working on something more advanced: having the computer recognize what kinds of actions people are performing in videos. In effect, the systems are reasoning about what’s going on, she said.

Since there have been significant advances in detecting and categorizing static images, this is the next frontier for computer vision, Russakovsky said.

As Russakovsky and her co-authors wrote in the 2017 paper “What Actions are Needed for Understanding Human Actions in Videos?”:

“When it comes to video understanding, we are still struggling to figure out basic questions such as: What is an activity and how should we represent it? Do activities have well-defined spatial and temporal extent? What role do goals and intentions play in defining and understanding activities?”

The authors of the paper posit that the creation of new video datasets is one of the keys to developing better digital recognition of human activity.

Russakovsky is also passionate about ethical issues in AI and promoting diversity in computer science.

“We want to make sure our systems are representative of all people,” she said, noting a case where that didn’t happen: a facial recognition technology that worked well on white males but failed when it came to black females.

Fixing these issues requires greater diversity in the ranks of people who make artificial intelligence systems. This matters, Russakovsky said, because algorithms are starting to make decisions in the real world, and if they don’t work for certain people, then the technology is unfair and flawed.

To make sure these biases are not baked in, a team needs to be inclusive, Russakovsky said.

“There is a talent crisis. We need more researchers. We won’t have enough people if we exclude certain groups. It would be a missed opportunity. Better solutions will arise with more brains, and a more diverse set of different brains.”