This lecture covers overfitting versus underfitting and model selection with cross-validation, including leave-one-out cross-validation (LOOCV) and k-fold cross-validation, and explains why overfitting should be penalized in machine learning models. It then turns to regularized linear regression and kernel ridge regression, and to the importance of choosing the right regularization strength. The lecture goes on to the need for data representations, the challenges posed by heterogeneous, large, and noisy data, and techniques such as Bag of Words for text data and visual dictionaries for image data. It concludes with data pre-processing, handling imbalanced data, sample re-weighting, and the transition from handcrafted representations to learned ones.
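To make the link between cross-validation and regularization strength concrete, here is a minimal sketch of choosing a ridge penalty by k-fold cross-validation. It is not taken from the lecture: the function names (`ridge_fit`, `kfold_mse`, `select_lambda`), the synthetic data, and the candidate grid of penalties are all assumptions made for illustration, and the ridge fit uses the standard closed-form solution without an intercept.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge weights: solve (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def kfold_mse(X, y, lam, k=5, seed=0):
    """Mean validation MSE of ridge regression over k shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        val = folds[i]
        # Train on every fold except the held-out one.
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((X[val] @ w - y[val]) ** 2))
    return float(np.mean(errors))


def select_lambda(X, y, lambdas, k=5):
    """Return the candidate penalty with the lowest cross-validated MSE."""
    scores = [kfold_mse(X, y, lam, k) for lam in lambdas]
    return lambdas[int(np.argmin(scores))], scores


# Hypothetical synthetic data: 60 samples, 20 features, 3 of them informative.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + rng.normal(scale=0.5, size=60)

lambdas = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
best_lam, scores = select_lambda(X, y, lambdas)
print("selected lambda:", best_lam)
```

Calling `kfold_mse` with `k=len(y)` holds out one sample per fold, which corresponds to LOOCV. A very small penalty tends toward overfitting and a very large one toward underfitting, so the cross-validated error is typically lowest somewhere in between.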