The emergence of Big Data has enabled new research perspectives in the discrete choice community. While the techniques for estimating Machine Learning models on massive amounts of data are well established, they have not yet been fully explored for the estimation of statistical Discrete Choice Models based on the random utility framework. In this article, we provide new ways of dealing with large datasets in the context of Discrete Choice Models. We achieve this by proposing new efficient stochastic optimization algorithms and extensively testing them alongside existing approaches. We develop these algorithms based on three main contributions: the use of a stochastic Hessian, the modification of the batch size, and a change of optimization algorithm depending on the batch size. A comprehensive experimental comparison of fifteen optimization algorithms is conducted across ten benchmark Discrete Choice Model cases. The results indicate that the HAMABS algorithm, a hybrid adaptive batch size stochastic method, is the best performing algorithm across the optimization benchmarks. On the largest model, this algorithm reduces the optimization time by a factor of 23 compared to existing algorithms used in practice. The integration of the new algorithms in Discrete Choice Model estimation software will significantly reduce the time required for model estimation and therefore enable researchers and practitioners to explore new approaches for the specification of choice models.
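To make the three ingredients named in the abstract more concrete (a stochastic Hessian, a batch size that changes during estimation, and a switch of optimization algorithm depending on the batch size), the following is a minimal sketch on a toy two-parameter binary logit. It is not the HAMABS algorithm from the article: the function names (`batch_stats`, `newton_step`, `gradient_step`, `fit_hybrid`, `simulate`), the rule for growing the batch, the switching threshold, and the choice of a backtracking gradient step for large batches are all assumptions made purely for illustration.

```python
import math
import random


def batch_stats(beta, batch):
    """Mean negative log-likelihood, gradient and Hessian of a binary logit."""
    b0, b1 = beta
    nll = g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in batch:
        v = b0 + b1 * x                        # utility difference
        p = 0.5 * (1.0 + math.tanh(0.5 * v))   # numerically stable sigmoid
        nll += max(v, 0.0) + math.log1p(math.exp(-abs(v))) - y * v
        g0 += p - y
        g1 += (p - y) * x
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    n = len(batch)
    return nll / n, (g0 / n, g1 / n), (h00 / n, h01 / n, h11 / n)


def newton_step(beta, grad, hess, ridge=1e-8):
    """Newton step using the Hessian computed on the current batch only."""
    a, b, c = hess[0] + ridge, hess[1], hess[2] + ridge
    det = a * c - b * b
    d0 = (c * grad[0] - b * grad[1]) / det
    d1 = (a * grad[1] - b * grad[0]) / det
    return (beta[0] - d0, beta[1] - d1)


def gradient_step(beta, nll, grad, batch, step=4.0):
    """Gradient step with Armijo backtracking (assumed large-batch method)."""
    g2 = grad[0] ** 2 + grad[1] ** 2
    while step > 1e-10:
        cand = (beta[0] - step * grad[0], beta[1] - step * grad[1])
        if batch_stats(cand, batch)[0] <= nll - 1e-4 * step * g2:
            return cand
        step *= 0.5
    return beta


def fit_hybrid(data, batch0=100, growth=2, switch_frac=0.5,
               grow_tol=1e-2, grad_tol=1e-6, max_iter=500, seed=0):
    """Hypothetical hybrid scheme: all thresholds here are arbitrary choices."""
    rng = random.Random(seed)
    n = len(data)
    beta, size, sizes = (0.0, 0.0), min(batch0, n), []
    for _ in range(max_iter):
        batch = data if size == n else rng.sample(data, size)
        nll, grad, hess = batch_stats(beta, batch)
        sizes.append(size)
        if size == n and math.hypot(*grad) < grad_tol:
            break                              # converged on the full data
        if size < switch_frac * n:             # small batch: stochastic Newton
            new = newton_step(beta, grad, hess)
        else:                                  # large batch: first-order step
            new = gradient_step(beta, nll, grad, batch)
        new_nll = batch_stats(new, batch)[0]
        beta = new
        # Grow the batch once progress on the current batch stalls.
        if size < n and (nll - new_nll) < grow_tol * abs(nll):
            size = min(n, growth * size)
    return beta, sizes


def simulate(n=2000, beta=(0.5, -1.0), seed=1):
    """Synthetic choices from a binary logit with known parameters."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        data.append((x, 1.0 if rng.random() < p else 0.0))
    return data


data = simulate()
beta_hat, sizes = fit_hybrid(data)
print("estimate:", beta_hat)
print("batch sizes used:", sorted(set(sizes)))
```

In this sketch the early iterations touch only a small fraction of the observations, so each Newton step is cheap even though it uses second-order information. The full dataset is used only near the end, once the estimate is already close to the optimum. A production implementation would need a more careful batch growth criterion and a more capable large-batch method than the one assumed here.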
Nikolaos Geroliminis, Claudia Bongiovanni, Mor Kaspi