This lecture covers variable selection in machine learning, focusing on filtering methods. A filter applies a criterion, such as correlation with the label, to quantify the relevance of each variable. The Pearson correlation coefficient, for example, measures the strength of the linear relationship between a variable and the label. The lecture also discusses the coefficient of determination and mutual information as tools for assessing variable relevance and statistical dependence. It highlights a key limitation of filtering: because each variable is scored on its own, a filter can miss variables that are informative only through their interactions with other variables. Finally, it explains why data should be split into training, validation, and test sets for model evaluation.
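As an illustration of the ideas summarised above, here is a minimal sketch of a correlation filter in plain Python. Everything in it is an assumption made for the example rather than material from the lecture: the synthetic data, the variable names (`x_signal`, `x_noise`, `x_a`, `x_b`), the helper functions, and the 60/20/20 split. The label depends linearly on `x_signal` and on the product `x_a * x_b`, so the two binary variables matter only jointly. Scores are computed on the training portion only, which is one common convention and not necessarily the one used in the lecture.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant input: treated as uninformative here
    return cov / math.sqrt(var_x * var_y)


def split_indices(n, rng, frac_train=0.6, frac_val=0.2):
    """Shuffle row indices and cut them into train / validation / test."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(frac_train * n)
    n_val = round(frac_val * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def filter_rank(columns, label, rows):
    """Rank variables by |Pearson r| with the label on the given rows."""
    y = [label[i] for i in rows]
    scores = {}
    for name, values in columns.items():
        x = [values[i] for i in rows]
        scores[name] = abs(pearson(x, y))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic data (hypothetical, for illustration only).
rng = random.Random(0)
n = 2000
columns = {
    "x_signal": [rng.gauss(0, 1) for _ in range(n)],
    "x_noise": [rng.gauss(0, 1) for _ in range(n)],
    "x_a": [rng.choice([-1, 1]) for _ in range(n)],
    "x_b": [rng.choice([-1, 1]) for _ in range(n)],
}
label = [
    2 * s + a * b + 0.1 * rng.gauss(0, 1)
    for s, a, b in zip(columns["x_signal"], columns["x_a"], columns["x_b"])
]

train, val, test = split_indices(n, rng)
ranking = filter_rank(columns, label, train)
for name, score in ranking:
    # For a one-variable linear fit with intercept, R^2 equals r squared.
    print(f"{name}: |r| = {score:.3f}, r^2 = {score ** 2:.3f}")

# The interaction becomes visible only once the product is scored directly.
product = {"x_a*x_b": [a * b for a, b in zip(columns["x_a"], columns["x_b"])]}
product_score = filter_rank(product, label, train)[0][1]
print(f"x_a*x_b: |r| = {product_score:.3f}")
```

In this setup the filter should rank `x_signal` first, while `x_a` and `x_b` score close to zero individually even though their product helps determine the label. That is the interaction limitation in miniature: a univariate criterion cannot see relevance that only appears when variables are considered together. The validation and test indices are left unused here; they would be reserved for choosing how many variables to keep and for the final evaluation, respectively.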