Ming Yin

Email: mingyin [AT]

Teaching Assistants


Course Description

Data mining has emerged at the confluence of artificial intelligence, statistics, and databases as a technique for automatically discovering summary knowledge in large datasets. This course introduces students to the process and main techniques in data mining, including basic data visualization and exploratory analysis, predictive modeling, descriptive modeling, and pattern mining approaches. Data mining systems and applications will also be covered, along with selected topics in current research.

Learning Objectives

Upon completing the course, students should be able to:

Course Schedule

See the calendar page.


Final Project

The final project gives students hands-on experience in data mining and a chance to practice the techniques and algorithms they learn in this course in real-life data mining scenarios. Projects are open-ended. Students are asked to:

Students should complete the project in teams of 2-5 people. Tasks related to the final project include:

More detailed instructions on the final project will be provided in the project guidelines.


STAT 516 or an equivalent introductory statistics course; CS 381 or an equivalent course that covers basic programming skills (e.g., STAT 598G).


The primary text of the class is:

Late Policy

Assignments must be submitted by the due date listed. Each student gets three extension days, which can be applied without penalty to any combination of assignments during the semester, except for Assignment 1. Students must explicitly state in the assignment submission the number of extension days used; extension days cannot be rearranged after they are applied.

Beyond extension days, a late penalty of 10% per day will be applied to assignments that are submitted after the due date. However, assignments will NOT be accepted if they are more than 5 days late.

No extensions or late days are allowed for any project-related due date.
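
The arithmetic of the assignment late policy can be sketched in a few lines of Python. This is only an illustration, not an official grading tool, and it rests on several assumptions that the policy text does not settle: lateness is counted in whole days, the 10% penalty is taken from full credit (not from the earned score), and the 5-day cutoff is measured on the days that remain after extension days are subtracted. The function name `credit_percent` and its parameters are hypothetical. Project-related deadlines are not covered, since no extensions or late days apply to them.

```python
MAX_EXTENSION_DAYS = 3   # extension days per student per semester (from the policy)
PENALTY_PER_DAY = 10     # percentage points per late day (assumed: of full credit)
MAX_PENALIZED_DAYS = 5   # more than this many late days: not accepted


def credit_percent(days_late, extension_days_applied=0, is_assignment_1=False):
    """Return the percentage of credit retained; 0 means the work is not accepted.

    Assumption: the 5-day cutoff is checked after extension days are subtracted.
    """
    if days_late < 0 or extension_days_applied < 0:
        raise ValueError("day counts must be non-negative")
    if extension_days_applied > MAX_EXTENSION_DAYS:
        raise ValueError("at most three extension days are available")
    if is_assignment_1 and extension_days_applied > 0:
        raise ValueError("extension days cannot be applied to Assignment 1")

    # Extension days offset lateness without penalty.
    penalized_days = max(0, days_late - extension_days_applied)
    if penalized_days > MAX_PENALIZED_DAYS:
        return 0
    return 100 - PENALTY_PER_DAY * penalized_days
```

For example, under these assumptions an assignment submitted four days late with one extension day applied would retain `credit_percent(4, 1)` percent of its credit, which is 70. The sketch checks a single submission only; it does not track how many of the three extension days a student has already used on other assignments.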