DataX - Pranay Anchuri: using big data to tackle social science and technology 

Wednesday, May 27, 2020
by Sharon Adarlo

When Pranay Anchuri was a college sophomore studying computer science, he worked on a project developing a search engine to crawl through Wikipedia for information. It was not a novel idea, but it got Anchuri hooked on learning more about large-scale data sets, leading him to his eventual career as a data scientist.
“I became very interested in data mining. I wanted to focus on big data. I wanted to learn more and contribute to scientific research in that sector,” he said.  
Anchuri followed his curiosity from college in India to graduate school in America, to stints working on blockchain and cryptography at various locales across the world, and finally to his current position as a data scientist at Princeton University.
Anchuri’s position within the Princeton research community is supported by the Schmidt DataX Fund, which aims to spread and deepen the use of artificial intelligence and machine learning across campus to speed scientific discovery. Princeton announced this new fund, a gift of Schmidt Futures, in February 2019, with the Center for Statistics and Machine Learning (CSML) playing a role in shepherding the initiative. The fund will ultimately support six Schmidt Data Scientists, Anchuri being one of them.
The mission of the data scientists is to spearhead the development and improvement of data analyses to enable more impactful research in three areas on campus: the Princeton Catalysis Initiative, the biomedical data science initiative, and the Center for Information Technology Policy (CITP).
“We are pleased that Pranay is part of the data science community on campus,” said Peter J. Ramadge, CSML director. “He brings valuable industrial expertise and the agility to work on the interdisciplinary projects that DataX is funding.”

Anchuri is working with the researchers at CITP, which focuses on issues at the nexus of technology, engineering, public policy, and the social sciences.
“It’s been a pleasure working here,” said Anchuri. “I am working with passionate faculty members and students. The research we are working on is interesting and interdisciplinary. It’s the perfect opportunity for learning and making significant impact because the projects we are working on overlap different research areas and deal with real-world issues.”
For CITP, Anchuri is working on Princeton’s Fragile Families & Child Wellbeing Study, which tracks the life trajectory of nearly 5,000 children born in large U.S. cities between 1998 and 2000. Matthew Salganik, a sociology professor, interim CITP director, and affiliate CSML faculty member, is heading this research project.
The other two projects that Anchuri is part of at CITP involve blockchain with Arvind Narayanan, an associate professor, and research focusing on people’s interaction with news on the Internet with Jonathan Mayer, an assistant professor. Both professors are members of Princeton’s computer science department.  
Before coming to Princeton, Anchuri was a blockchain research scientist at the startup Axoni in New York City and a research staff member at NEC Laboratories America in Princeton.
He earned a bachelor’s degree in computer science with a focus on data mining from the International Institute of Information Technology in Hyderabad, India, and then received his doctoral degree in computer science from Rensselaer Polytechnic Institute in Troy, New York.
The focus of his doctoral research was approximate graph mining, specifically developing algorithms for finding frequent subgraphs in large networks.
These networks could, for example, represent friendships in a social network or interactions between proteins in an organism.
“You have a large network and there are repetitions of some pattern that may occur frequently in a network,” he said. “Those repetitions may correspond to a certain behavior in the network. For example, these network interactions can be proteins interacting with each other and can correspond with a biological process such as proteolysis, i.e., they act as enzymes that lead to the breakdown of other proteins into amino acids.”
While studying for his doctoral degree, Anchuri also served internships at IBM Thomas J. Watson Research Center in Yorktown Heights, New York; LinkedIn in Mountain View, California; Yahoo Research in Barcelona; and IBM Research in Dublin. He was also a research associate at Qatar Computing Research Institute in Doha, Qatar.
Anchuri said he is excited that his background in both industrial work on large-scale projects and academic research will help spur and increase the use of machine learning on campus.
“I am approaching my work by applying best practices to the projects,” he said. “I want to help amplify data-driven research on campus and make an impact.”