Data Mining (CSCI-B 565 Fall 2023; 3 CR)

    Instructor: Yuzhen Ye (yye@indiana.edu, Luddy 2046)
    AIs: TBD
    Class meets: 1:15-2:30P MW, IF1106
    Office hours and locations: see Canvas.

Syllabus

  • Description: Data mining is a dynamic field with wide applications in many scientific and industrial areas, including finance, media and entertainment, life sciences, social sciences, and medicine. The course objective is to study the algorithmic and practical aspects of discovering patterns and relationships in data, which can be used to provide insights and make predictions. This course introduces basic and some advanced concepts of data mining and provides hands-on experience in data analysis, association rule mining, link analysis, clustering, and prediction. Human factors, security, and social issues in data mining will also be discussed. Finally, students will explore ChatGPT and see how they can use it as a learning and working partner/assistant.
  • Important dates:
    • Midterm: Week 10, Wednesday class (Oct 25)
    • Project presentation: Week 15
    • Project report due: Week 16, Monday, Dec 11, midnight
  • Textbook:
  • References: Kaggle; Hugging Face; Top 10 algorithms in data mining; KDD 2022 papers (KDD2023); DS Interview Questions.
  • Learning outcomes: after taking this course, students shall be able to:
    • Explain fundamental concepts of data mining;
    • Use common data mining techniques and packages (in Python);
    • Analyze real-world datasets and identify appropriate data mining techniques to apply to them;
    • Write a program (in Python or other languages) to implement a data mining algorithm;
    • Conduct data mining experiments and properly report and discuss the results;
    • Understand the common issues in data mining including security and social issues;
    • Present, review and critique data mining articles.
  • Programming languages: Python (preferred), R, C/C++, Java
  • Grading: 20% homework assignments; 20% final project; 35% midterm exam (in class); 25% quizzes (online and in class)
  • Homework: We will have regular homework assignments that will be a mixture of handwritten problems and computing assignments. We will use Canvas and github@IU for homework submission and maintaining records for the course.
  • Some rules:
    • No make-up exam/quiz;
    • No late submission;
    • The final grade will be calculated according to the evaluation scheme given above.
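As a rough illustration of how the grading weights above combine, here is a minimal sketch in Python. It assumes each component is scored as a percentage from 0 to 100; the function name, the dictionary keys, and the example scores are hypothetical and not part of the course materials, and no letter-grade cutoffs are implied.

```python
# Weights taken from the grading scheme listed in this syllabus.
WEIGHTS = {
    "homework": 0.20,
    "final_project": 0.20,
    "midterm": 0.35,
    "quizzes": 0.25,
}


def weighted_final_score(scores):
    """Combine per-component percentages (0-100) into one weighted percentage.

    Assumption: every component is already expressed on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = {"homework": 90, "final_project": 85, "midterm": 80, "quizzes": 95}
print(round(weighted_final_score(example), 2))
```

Because the midterm carries the largest weight (35%), a change in the midterm score moves the total more than the same change in any other single component.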
  • Academic integrity:
    • Only turn in your OWN work for individual assignments and, for the group project, your team's work. Incidents of academic misconduct will be reported to the Office of the Dean of Students. The typical consequence will be an automatic F in the course. "Plagiarism is using others' [codes,] ideas and words without clearly acknowledging the source of that information." See the Code of Student Rights, Responsibilities & Conduct for more.
    • "A student who is found to have committed an act of academic misconduct while enrolled in a class and is assigned a grade of F by the instructor as a result of the misconduct will have the grade of F entered in place of the automatic W which would otherwise have applied." (IU's policy on withdrawal)