Full course description
Data mining is a major frontier field of computer science that studies the extraction of useful and interesting patterns from large collections of data. This course is a step-by-step introduction to data-mining systems. It covers the process of acquiring raw data as well as several pre-processing techniques. Several data-mining techniques are discussed, ranging from basic models to state-of-the-art techniques. For each technique, the characteristics that help one decide which technique to use are highlighted. Several evaluation criteria are discussed that help one decide whether a data-mining system is producing useful patterns. The lectures and labs emphasize the practical use of the presented techniques and the problems of developing real data-mining applications. A number of real data sets will be analysed and discussed. After completing this course, students will have obtained a preliminary methodological and theoretical basis for studying and applying data-mining techniques to large collections of data.
I.H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques (Second Edition), Morgan Kaufmann, June 2005, ISBN 0-12-088407-0.
T. Mitchell (1997). Machine Learning, McGraw-Hill, ISBN 0-07-042807-7.