Data Mining
Full course description
Data mining is the process of searching for patterns in data. It has become increasingly important in many areas of science and business, from biomedicine to marketing, as the ability to generate and store enormous amounts of data has grown.
Data mining draws on machine learning and artificial intelligence algorithms and on statistics, and relies on the effective use of visualisation techniques and database systems. This course will introduce you to the different aspects of data mining, including:
- Data pre-processing and exploration
- Data clustering methods and visualisation
- Data modelling using regression and classification
- Association rule learning
This course will highlight best practices and common mistakes in the data mining process and provide you with hands-on experience using the popular software program ‘R’ (www.r-project.org). You will gain insight into the theoretical and algorithmic foundations of data mining and its application to real-world examples.
Course objectives
- To understand data mining techniques, concepts, and algorithms
- To get hands-on experience with data mining in R
- To learn about common usage scenarios and pitfalls of data mining
Prerequisites
- VSC1303 Introduction to Statistical Methods and Data Analysis
- Recommended: VSC2305 Intermediate Statistical Methods and Data Analysis
Recommended reading
Compulsory: None
Recommended (online sources):
- Mohammed J. Zaki and Wagner Meira, Jr., Data Mining and Analysis: Fundamental Concepts and Algorithms, Cambridge University Press, May 2014, ISBN: 9780521766333, http://www.dataminingbook.info/pmwiki.php/Main/BookDownload
- Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons, Inc., 2005, ISBN: 9780471666578, http://rabiee.iauda.ac.ir/design/iaudaostad/http://rabiee.iauda.ac.ir/my_doc/rabieeiaudaacir/DM/Larose%20-%20Discovering%20knowledge%20in%20data%20%20-%202005%20.pdf
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, second edition, Springer Series in Statistics, 2009, ISBN: 9780387848587, http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
- Data Mining Algorithms in R, WikiBooks, https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R