The Center for Statistics and Machine Learning (CSML) is offering a new course, SML 301, in spring 2023. Daisy Yan Huang, CSML core faculty member and lecturer, will lead the course.
“We are pleased to broaden the course options for students interested in delving further into applications of data science, statistics, and machine learning,” said Peter Ramadge, CSML director. “This course should enhance students' ability to understand and apply modern data analysis methods in an academic and industry research setting.”
The course aims to foster the ability to plan and perform rigorous data analyses. Students are expected to learn the conceptual underpinnings of advanced modern methods and to use this knowledge to program appropriate analyses.
“After taking the CSML course, SML 201 - Introduction to Data Science, many people understand the basics of the discipline. The natural next step is to explore more advanced applications. In SML 301, this is where the intellectual excitement and fun take off. This course strongly emphasizes concepts and applications,” said Huang, who also teaches SML 201. “We will talk more about how to use the tools in addition to the intuition behind the methods. Like SML 201, this course is about both ‘understanding’ and ‘doing’.”
Huang added, “Many times, I feel students understand the concepts, but when they have a real-life problem to solve, they have trouble applying their understanding. This course will bridge that gap.”
The course will cover ethics and bias in dealing with data and research studies, experiment design, best practices for data science and machine learning methods, and the importance of reproducible research. It will also explore methods ranging from random forests to deep neural networks.
During the course, students will work on two design projects and a final research paper or project. Through these projects, they will learn how to deal with common issues in modern datasets and how to build and evaluate models.
“I want to make data science accessible to a larger population. This course could be for someone who has already taken some courses and knows the theory but wants more hands-on experience with real datasets. But it could also be for someone who is seeing these concepts for the first time and wants to learn how to use these tools correctly without going through the rigorous underlying math behind the methods,” said Huang.
After taking the class, students should be well-equipped to tackle independent research or to hone their skills further with the many advanced course options on campus.
“I am excited to teach this course,” said Huang. “It’s going to be a fun semester where students will make connections between concepts and applications.”
Currently, the cap for SML 301 is 30 students. Enrollment starts on November 29th.