Machine Learning for Bigdata – Master COMASIC M2 (18 hours, 21/02/2013 to 05/03/2013)

Welcome to the web page of the "Machine Learning for Bigdata" course.

Course Objectives

The course aims to familiarize students with advanced machine learning and data mining methods for designing and developing solutions for current data sets characterized by complexity and large volume. Real-world cases from social media and networks, as well as from automated web advertising and optimization, will be presented.
See course logistics and details at the new e-learning portal: http://moodle.lix.polytechnique.fr/moodle/course/view.php?id=2
(guest login password: mldb2013)




Course Schedule and Slides


    Course contents

    • Supervised learning: Linear Regression, Model Selection, Support vector machines (SVMs), Kernels.
    • Unsupervised learning: Gaussian Mixture models, EM algorithm, Spectral Clustering.
    • Data preprocessing: Linear and nonlinear dimensionality reduction, spectral methods, Feature selection, Cross-validation.
    • Big data: An introduction (Hadoop, MapReduce), NoSQL data (JSON, Hive), K-Means in MapReduce.
    • Learning in Graphs: ranking algorithms, evaluation measures, degeneracy and community mining methods.
    • Applications & case studies: Optimization Techniques for Web advertising and Search Engine optimization, Community mining and evaluation in social networks.

    References

    • Bayesian Reasoning and Machine Learning, David Barber, University College London, Cambridge University Press, ISBN: 9780521518147, Publication date: February 2012
    • Pattern Recognition and Machine Learning, Christopher M. Bishop, Springer, 1st ed., 2006, XX, 740 p., ISBN: 978-0-387-31073-2

Data Mining, Knowledge Discovery, Business Intelligence