This lecture examines common pitfalls in empirical NLP research, focusing on model evaluation and the importance of statistical testing. The instructor discusses the over-reliance on BLEU as the sole evaluation metric, the impact of hyperparameter tuning on reported comparisons, and the role of statistical power in detecting true differences between systems. Through the analysis of several published papers, the lecture emphasizes the need for standardized metrics, reproducibility, and a community-wide effort to improve research quality in NLP.
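As a concrete illustration of the statistical-testing theme, the sketch below shows a paired bootstrap resampling test (Koehn, 2004), a common way to check whether a BLEU difference between two systems is reliable rather than noise. It is a minimal example, not material from the lecture itself: it assumes the `sacrebleu` package is available, and the system outputs and references in the usage note are hypothetical placeholders.

```python
"""Minimal sketch of paired bootstrap resampling for comparing two MT systems
by corpus-level BLEU. Assumes sacrebleu is installed; all data names are
hypothetical placeholders, not from the lecture."""
import random
import sacrebleu


def paired_bootstrap(sys_a, sys_b, refs, n_resamples=1000, seed=0):
    """Return the fraction of resamples in which system A's BLEU beats system B's.

    sys_a, sys_b: lists of hypothesis sentences (strings), aligned with refs.
    refs: list of reference sentences (strings), one per test segment.
    """
    rng = random.Random(seed)
    n = len(refs)
    wins_a = 0
    for _ in range(n_resamples):
        # Resample test-set segments with replacement (paired across systems).
        idx = [rng.randrange(n) for _ in range(n)]
        a = [sys_a[i] for i in idx]
        b = [sys_b[i] for i in idx]
        r = [refs[i] for i in idx]
        # Recompute corpus-level BLEU on the resampled test set for each system.
        bleu_a = sacrebleu.corpus_bleu(a, [r]).score
        bleu_b = sacrebleu.corpus_bleu(b, [r]).score
        if bleu_a > bleu_b:
            wins_a += 1
    return wins_a / n_resamples


# Hypothetical usage: if system A wins in at least 95% of resamples, the
# improvement is significant at roughly the p < 0.05 level (one-sided).
# win_rate = paired_bootstrap(outputs_a, outputs_b, references)
```

Because the test resamples whole test-set segments and keeps the two systems paired on each resample, it accounts for the variability of corpus-level BLEU across test sets, which is exactly the kind of check the lecture argues should accompany claimed improvements.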