DataX – using data science to explore and study mass social interactions and online behavior

Written by
Sharon Adarlo
Nov. 18, 2020

Researchers commonly conduct field studies of mass social behavior in online networks such as Facebook or Reddit. These studies tend to yield qualitative conclusions. Until recently, however, replicating these interactions in the lab and at scale has been hard to do.

A team of researchers at Princeton University and their collaborators have developed a new system to study mass social behavior in a controlled, rigorous setting. The system uses a software program to automatically recruit people online and data science techniques to glean actionable intelligence from their interactions. In this way, researchers can learn how behavior in online social networks affects global debates such as the one over climate change.

"This is an exciting project because it allows us to simulate societal scale interactions at a larger scale than we could previously," said Tom Griffiths, Henry R. Luce Professor of Information Technology, Consciousness, and Culture of Psychology and Computer Science. "The phenomenon that we're interested in studying is how are norms created, how does information spread, how do you encourage people to act in ways that are better for the community?"

This project was one of the first nine funded last year by Princeton University's Schmidt DataX Fund, which aims to spread and deepen the use of modern tools such as artificial intelligence and machine learning across campus to accelerate scientific discovery. The new fund, made possible through a major gift from Schmidt Futures, was announced in February last year. 

"We are excited about this project because it exemplifies what the DataX Fund is trying to foster: projects that use modern data science to tackle problems that are difficult to study and to run experiments that are arduous to perform at a realistic scale in a laboratory setting," said Peter Ramadge, director of the Center for Statistics and Machine Learning, which oversees parts of the DataX Fund.

Besides Griffiths, the principal investigators are:

  • Alin Coman, associate professor of psychology and public affairs
  • Simon Levin, James S. McDonnell Distinguished University Professor in Ecology and Evolutionary Biology
  • Elizabeth Levy Paluck, professor of psychology and public affairs
  • Elke Weber, the Gerhard R. Andlinger Professor in Energy and the Environment and Professor of Psychology and Public Affairs

In partnership with researchers at Stevens Institute of Technology and Arizona State University, the team has developed a software system that automatically recruits paid participants on Amazon Mechanical Turk. Bill Thompson, a postdoctoral researcher in Griffiths' lab who is running the experiments, explains that each cohort of participants, which can range from 10 to 200 people, is funneled into an interactive virtual environment where participants perform tasks. The software then gathers data from their interactions.

"One experiment is a game that involves extracting resources from an environment that regenerates when harvested equitably, but can collapse if gathered too quickly," Thompson said.

This particular game, Thompson said, is a study of collective action problems such as climate change and of how people act over specific timescales. Because some of the rewards of acting on climate change can take a long time to appear, global warming and timescales are inextricably linked. Behavioral scientists and environmental communicators have been examining how humans make decisions that carry both short-term and long-term consequences, and the tension between those two timescales.
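The dynamic behind such a game can be illustrated with a minimal sketch. This is not the researchers' software: the function name `simulate_commons`, the logistic regrowth rule, and all parameter values are assumptions chosen only to show how a shared resource can persist under restrained harvesting and collapse under heavy harvesting.

```python
def simulate_commons(harvest_rates, stock=100.0, capacity=100.0,
                     growth=0.3, rounds=50):
    """Hypothetical common-pool resource model (illustrative only).

    harvest_rates: fraction of the current stock each player takes per round.
    Returns the stock level at the start and after every round.
    """
    history = [stock]
    for _ in range(rounds):
        # Players harvest first; together they cannot take more than exists.
        total_fraction = min(sum(harvest_rates), 1.0)
        stock -= stock * total_fraction
        # What remains regrows logistically toward the carrying capacity.
        stock += growth * stock * (1 - stock / capacity)
        stock = max(stock, 0.0)
        history.append(stock)
    return history


# Four players who each take 3% per round versus four who each take 15%.
restrained = simulate_commons([0.03] * 4)
greedy = simulate_commons([0.15] * 4)
print(f"restrained group, final stock: {restrained[-1]:.1f}")
print(f"greedy group, final stock: {greedy[-1]:.4f}")
```

In this toy model the resource survives only while the group's combined harvest fraction stays below growth / (1 + growth), about 23 percent per round with the assumed growth rate. The short-term payoff of taking more is immediate, while the cost, a depleted resource, shows up only over many rounds.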

At the moment, the researchers are in the study's data-gathering phase, Thompson and Griffiths said. It's too early to see results from these studies. Researchers hope these experiments yield insights on coordinating behavior, viral misinformation, complacency, and hostility in making beneficial energy and environmental decisions.

"We can structure these experiments in interesting ways where people are interacting with one another simultaneously or interacting with one another asynchronously," said Griffiths. "In this environment, they can be moving characters around, and we can look at how they make decisions about what they see on the screen. Throughout the experiment, we're gathering information from the participants on what they are doing."