Publication: Capturing correlation in large-scale route choice models

Abstract

When random utility models are applied to a route choice problem, two issues make the modeling complex: choice set generation and correlation among alternatives. In this paper we discuss different models that capture path overlap. First, we analyze several formulations of the Path Size Logit model proposed in the literature and show that the original formulation should be used. Second, we propose a modeling approach in which path overlap is captured with a subnetwork. A subnetwork is a simplification of the road network that contains only easily identifiable and behaviorally relevant roads. In practice, the subnetwork can easily be defined based on the route network hierarchy. We propose a model in which the subnetwork defines the correlation structure of the choice model. The motivation is to capture the most important correlation explicitly without considerably increasing model complexity. We present estimation results for a factor analytic specification of a Mixture of Multinomial Logit model, where the correlation among paths is captured both by a Path Size attribute and by error components. The estimation is based on a GPS dataset collected in the Swedish city of Borlänge. The results show a significant increase in model fit for the Error Component model compared to a Path Size Logit model. Moreover, the correlation parameters are significant.
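As a rough illustration of how a Path Size attribute corrects for overlap, the sketch below assumes the commonly cited original formulation, in which each link's share of the path length is divided by the number of paths in the choice set that use that link, and the logarithm of the result enters the utility. The function names `path_sizes` and `path_size_logit`, the coefficient `beta_ps`, and the toy network are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def path_sizes(paths, link_lengths):
    """Path Size of each path (hypothetical helper, not from the paper).

    paths: list of loop-free paths, each a list of link ids.
    link_lengths: dict mapping link id -> link length.
    """
    # Count how many paths in the choice set use each link.
    usage = Counter(link for path in paths for link in set(path))
    sizes = []
    for path in paths:
        total = sum(link_lengths[a] for a in path)
        # Each link contributes its length share, discounted by how many paths share it.
        sizes.append(sum(link_lengths[a] / total / usage[a] for a in path))
    return sizes

def path_size_logit(utilities, sizes, beta_ps=1.0):
    """Logit probabilities with beta_ps * ln(PS) added to each utility (assumed form)."""
    v = [u + beta_ps * math.log(ps) for u, ps in zip(utilities, sizes)]
    m = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(x - m) for x in v]
    total = sum(expv)
    return [e / total for e in expv]

# Toy choice set: two paths share link "a"; the third path is fully distinct.
toy_paths = [["a", "b"], ["a", "c"], ["d"]]
toy_lengths = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 2.0}
toy_sizes = path_sizes(toy_paths, toy_lengths)
toy_probs = path_size_logit([0.0, 0.0, 0.0], toy_sizes)
```

In this toy case the two overlapping paths receive a Path Size below one, so with equal utilities the distinct path gets a larger probability than either overlapping path.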

Discrete choice

In economics, discrete choice models, or qualitative choice models, describe, explain, and predict choices between two or more discrete alternatives, such as entering or not entering the labor market, or choosing between modes of transport. Such choices contrast with standard consumption models in which the quantity of each good consumed is assumed to be a continuous variable. In the continuous case, calculus methods (e.g. first-order conditions) can be used to determine the optimum amount chosen, and demand can be modeled empirically using regression analysis.

Intraclass correlation

In statistics, the intraclass correlation, or the intraclass correlation coefficient (ICC), is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures, it operates on data structured as groups rather than data structured as paired observations.

Multinomial logistic regression

In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).

Selected Topics on Discrete Choice

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow-up to the online course “Introduction t

