Daisy Yan Huang: boosting non-STEM majors, women

Wednesday, Mar 18, 2020
by Sharon Adarlo

Photo of Daisy Yan Huang teaching.

When Daisy Yan Huang teaches, she notices that some women in her classes are timid or hesitant to speak up, even though they are just as capable and smart as their male peers.

Huang, a lecturer at Princeton University’s Center for Statistics and Machine Learning (CSML), is determined to change that. Leading by example, she strives to instill confidence in her female students and mentors many of them long after they have left her class.

“I would like to help them see what they are capable of,” said Huang. “As a woman and a child of parents without any science background, I think my success was implausible without the encouragement and guidance of many of my professors. I would like to continue that tradition and help my students to recognize their potential.”

Huang exerts her influence through the two courses she teaches at CSML: SML 201 – Introduction to Data Science and SML 310 – Research Projects in Data Science.

“I enjoy interacting with my students,” she said. “I believe that statistical thinking is an essential skill and the fundamentals of statistics should be accessible to students at the freshman level.”

“We are pleased that Daisy is part of CSML,” said Peter Ramadge, CSML director. “She serves as an important role model to her students and is an essential part of our community. She’s teaching future data scientists and upcoming leaders who are entering the workforce with a vital skill.”

Daisy Yan Huang teaching a class.

Huang did not begin her schooling thinking she was going into STEM, let alone data science. She studied art and graphic design growing up. Her mother teaches Cantonese Chinese opera while her father composes and teaches. 

After she moved to America from China, she eventually landed on the path of studying mathematics at the University of California, Berkeley, ultimately earning a bachelor’s degree with a double major in applied mathematics and statistics. 

She then went on to earn her doctoral degree in statistics at Berkeley. Her dissertation focused on developing a robust statistical ranking method for high-volume data with small sample sizes. Such data occur naturally in genetic applications. The ranking method allows biologists to focus on a small subset of genes and study their roles in biological processes and their association with certain diseases. Ranking algorithms have broad applications; for example, they are used in generating Amazon product recommendations, Netflix movie recommendations and Google search results, said Huang.

After graduate school, Huang worked at Amazon as a machine learning scientist. She found her experience there to be valuable, adding richness to her academic background.

“The aspect of working at Amazon that I benefited most from was learning how to formulate an analysis to answer a business question and to guide business decisions,” she said. “This is not something that was taught in grad school.” 

An important observation she took away from her time at Amazon was that people often treat statistical models as a “black box” without checking the assumptions that affect the performance of the model. She hopes that can change, starting with lower-level university courses.

At the end of 2016, Huang moved from the West Coast to New Jersey for a job opportunity for her husband. Huang said she was mulling taking a job in industry once again, but the opportunity to be a lecturer on campus came up. Huang started off teaching data science in the molecular biology department in 2017 and then switched to CSML, where she has been ever since.

“I have always liked teaching, and teaching was my main motivation to go to grad school,” she said. “I was hoping that eventually I could go back to teaching once I got a better sense about how the tools that I learned at school were used in the real world.”

While she teaches her classes, Huang can’t help but see her younger self in some of her students.

“People who do not have a lot of exposure to data science often feel intimidated by the subject. Perhaps they had a bad experience with the subject before, or perhaps someone told them that data science is not for them because of their background,” said Huang.

“The bigger picture of making inferences and discoveries with data can get obscured by the languages of formulas and programming code. I want to convey the intuition and principles that underlie the techniques in data science. The idea of linear regression, drawing a line through data, is intuitive to most people without any statistical background. It is important not to let the mathematics intimidate students and lose sight of the core concepts,” Huang continued.

She hopes that students from various backgrounds who lack confidence in their abilities will eventually shed their insecurities and enjoy scientific exploration as much as she does.

“Growing up I was told numerous times that ‘girls were not good at science’ or ‘artistic people and science people don’t mix,’” she said. “However, if time after time you have done well, at some point you really need to stop worrying about what other people say and believe in yourself.”