CS 410 Top: Advanced Data Mining
| Credit Hours: | 4 |
| Course Coordinator: | N/A |
| Course Description: | None...see course goals below |
| Prerequisites: | |
| Goals: | The founder of Lotus, Mitchell Kapor, once said that "getting information off the Internet is like drinking from a fire hydrant". His warning should be taken seriously. Unless we can process the mountain of information that surrounds us, we must either ignore it or be buried by it. This subject introduces automatic data mining methods that find the "pearls in the dust"; i.e. the stuff that really matters. Students in this class will gain an understanding of a range of data mining methods; learn how to contrast different learning methods; and understand the assessment methodologies for data miners. |
| Textbooks: | Data Mining, by Witten and Frank. |
| References: | |
| Major Topics: | ** Data mining algorithms, including 1R, NaiveBayes, C4.5, Apriori, Mprime, and linear regression ** Data pre-processing and data reporting (lots of Unix shell scripting with bash and awk) |
| Laboratory Exercises: | Students are given assignments where they are asked to use various data mining algorithms and compare those learners. |
| CAC Category Credits: | Core | Advanced |
| Data Structures | | |
| Algorithms | | |
| Software Design | | |
| Computer Architecture | | |
| Programming Languages | | |
| Oral and Written Communications: | Ability to write technical reports. |
| Social and Ethical Issues: | |
| Theoretical Content: | ** About 30% of class time is spent on a review of data mining theory. ** About 30% of class time is spent on introduction to shell scripting. ** About 30% of class time is spent on assessment methods for data miners. ** Which leaves time for fun! |
| Problem Analysis: | All students learn methods that are used to assess different implementations of data miners. Specific attention is paid to assessing accuracy, runtimes, and stability of the learnt theories. |
| Solution Design: | |
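
To make the topics above concrete, here is a minimal sketch of 1R, the simplest learner in the Major Topics list, together with the kind of accuracy check mentioned under Problem Analysis. It is an illustration only, not course material: the function names `one_r` and `accuracy`, the toy `ROWS` data, and the use of Python (rather than the bash and awk used in class) are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# Hypothetical toy data, invented for this sketch (not from the course).
ROWS = [
    {"outlook": "sunny",    "windy": "no",  "play": "no"},
    {"outlook": "sunny",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]


def one_r(rows, target):
    """Return (attribute, rule, errors) for the best one-attribute rule.

    For each attribute, every attribute value predicts its majority class;
    the attribute whose rule makes the fewest training errors wins.
    """
    best = None
    for attr in (a for a in rows[0] if a != target):
        # Count class labels seen with each value of this attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # Map each attribute value to its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for row in rows if rule[row[attr]] != row[target])
        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


def accuracy(attr, rule, rows, target, default):
    """Fraction of rows the rule classifies correctly.

    Attribute values not seen in training fall back to `default`.
    """
    hits = sum(1 for row in rows if rule.get(row[attr], default) == row[target])
    return hits / len(rows)


attr, rule, errors = one_r(ROWS, "play")
acc = accuracy(attr, rule, ROWS, "play", default="yes")
print(attr, errors, round(acc, 3))
```

Note that this measures accuracy on the training rows only. Comparing learners fairly, as the laboratory exercises ask, would normally call for held-out data or cross-validation, plus repeated runs to gauge runtime and stability.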