This lecture covers common data problems: incorrect, duplicate, inconsistent, and missing values, as well as outliers, along with best practices for handling missing data. It then introduces important probability distributions (normal, Poisson, exponential, binomial, and Bernoulli), explaining their properties with examples. Finally, it explores Pearson's correlation and mutual information and how they are used to analyze dependencies between variables.
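
To illustrate how Pearson's correlation and mutual information differ when measuring dependence, here is a minimal sketch using only the Python standard library. The function names `pearson` and `mutual_information` and the toy data are assumptions made for illustration; they are not taken from the lecture. The mutual information estimate assumes discrete values and uses plain empirical frequencies.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) of two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of each (x, y) pair
    px, py = Counter(xs), Counter(ys)  # marginal counts
    # sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) )
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

# Toy data (hypothetical): y depends on x deterministically, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

r = pearson(xs, ys)              # linear dependence only
mi = mutual_information(xs, ys)  # any statistical dependence
print(f"Pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

In this toy case the Pearson coefficient is zero because the relationship is symmetric around zero and has no linear component, while the mutual information is clearly positive because knowing x fully determines y.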