
# Trajectory analysis using point distribution models

Abstract

This thesis focuses on the analysis of the trajectories of a mobile agent. It presents different techniques for obtaining a quantitative measure of the difference between two trajectories or two trajectory datasets. A novel approach is presented, based on the Point Distribution Model (PDM), a model originally developed in computer vision to compare deformable shapes. This thesis presents a mathematical reformulation of the PDM to fit spatiotemporal data such as trajectory information. The behavior of a mobile agent can rarely be represented by a single trajectory, since its stochastic component would not be taken into account; the PDM therefore focuses on the comparison of trajectory datasets. If the difference between datasets is greater than the variation within each dataset, it is observable in the first few dimensions of the PDM. This difference can also be quantified using the inter-cluster distance defined in this thesis. The resulting measure is far more efficient than the visual comparison of trajectories often found in the existing scientific literature. The thesis also compares the PDM with standard techniques such as statistical tests, Hidden Markov Models (HMMs) and Correlated Random Walk (CRW) models. Because a PDM is a linear transformation of space, it is much simpler to interpret; moreover, spatial representations of its deformation modes can easily be constructed to make the model more intuitive. The thesis also presents the limits of the PDM and offers alternative solutions where it is not adequate. From the results obtained, it can be concluded that no universal solution exists for the analysis of trajectories; nevertheless, solutions were found and described for all of the problems presented in this thesis. As the PDM requires that all trajectories consist of the same number of points, resampling techniques were also studied.
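In essence, once all trajectories are resampled to a common number of points, a PDM reduces to a principal-component analysis of the stacked coordinates: between-dataset differences that exceed the within-dataset variation show up in the first mode coordinates. The sketch below is purely illustrative (the function name, synthetic data and parameters are assumptions, not the thesis's code):

```python
import numpy as np

def pdm_modes(trajectories, n_modes=2):
    """Fit a Point Distribution Model by PCA on flattened trajectories.

    trajectories: array of shape (n_traj, n_points, 2); all trajectories
    must already be resampled to the same number of points.
    Returns the mean shape, the first deformation modes, and the
    mode coordinates (projections) of each trajectory.
    """
    X = trajectories.reshape(len(trajectories), -1)   # (n_traj, 2*n_points)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data yields the principal deformation modes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    modes = Vt[:n_modes]                              # (n_modes, 2*n_points)
    coords = Xc @ modes.T                             # per-trajectory mode coordinates
    return mean, modes, coords

# Two synthetic datasets: straight runs vs. laterally shifted runs
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 50)
base = np.stack([t, np.zeros_like(t)], axis=1)
set_a = base + rng.normal(0, 0.01, (30, 50, 2))
set_b = base + np.array([0.0, 0.3]) + rng.normal(0, 0.01, (30, 50, 2))

mean, modes, coords = pdm_modes(np.concatenate([set_a, set_b]))
# The between-dataset difference dominates the first mode coordinate:
gap = abs(coords[:30, 0].mean() - coords[30:, 0].mean())
spread = coords[:30, 0].std() + coords[30:, 0].std()
```

Here `gap` (the inter-cluster separation along the first mode) is far larger than `spread` (the within-cluster variation), which is exactly the situation in which the difference is visible in the first PDM dimension.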
The main resampling solution was developed for trajectories generated on a track, such as the trajectory of a car on a road or of a pedestrian in a hallway. The different resampling techniques presented in this thesis provide solutions for all the experimental setups studied and can easily be modified to fit other scenarios. It is, however, very important to understand how they work and to tune their parameters according to the characteristics of the experimental setup. The main principle of this thesis is that analysis techniques and data representations must be selected with respect to the fundamental goal. Even a simple tool such as the t-test can occasionally be sufficient to measure trajectory differences. However, if no dissimilarity is observed, it does not necessarily mean that the trajectories are equal; it merely indicates that the analyzed feature is similar, and other, more complex methods could still reveal differences. Ultimately, two trajectories are equal if and only if they consist of exactly the same sequence of points; otherwise, a difference can always be found. It is therefore important to know which trajectory features have to be compared. Finally, the diverse techniques used in this thesis offer a complete methodology for analyzing trajectories.
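One common way to satisfy the equal-number-of-points requirement is to resample each trajectory at equal steps of cumulative arc length. The following minimal sketch uses linear interpolation; the function name and example data are illustrative assumptions:

```python
import numpy as np

def resample_trajectory(points, n_samples):
    """Resample a 2-D trajectory to n_samples points, equally spaced
    along its cumulative arc length (linear interpolation)."""
    points = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])        # cumulative arc length
    targets = np.linspace(0.0, s[-1], n_samples)
    x = np.interp(targets, s, points[:, 0])
    y = np.interp(targets, s, points[:, 1])
    return np.stack([x, y], axis=1)

# A trajectory recorded with uneven sampling, resampled to 5 points
raw = [(0, 0), (0.1, 0), (2, 0), (4, 0)]
res = resample_trajectory(raw, 5)
# → points at x = 0, 1, 2, 3, 4 along the straight segment
```

This is the kind of preprocessing whose parameters (here, `n_samples`) must be tuned to the experimental setup, as the abstract stresses.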



We are living in the era of "Big Data", an era characterized by the enormous volume of available data. This abundance is mainly due to continuing advances in the computational capabilities for capturing, storing, transmitting and processing data. However, it is not always the volume of data that matters, but rather the "relevant" information that resides in it.
Exactly 70 years ago, Claude Shannon, the father of information theory, was able to quantify the amount of information in a communication scenario based on a probabilistic model of the data. It turns out that Shannon's theory can be adapted to various probability-based information processing fields, ranging from coding theory to machine learning. The computation of some information theoretic quantities, such as the mutual information, can help in setting fundamental limits and devising more efficient algorithms for many inference problems.
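As a minimal worked example of such a quantity, the mutual information of a binary symmetric channel with uniform inputs has the closed form I(X;Y) = 1 - h2(p), where h2 is the binary entropy and p the flip probability. The sketch below is illustrative only (function names are assumptions):

```python
import numpy as np

def h2(p):
    """Binary entropy in bits."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_capacity(flip_prob):
    """Capacity of the binary symmetric channel: the mutual information
    I(X;Y) maximized over inputs, achieved by a uniform input."""
    return 1.0 - h2(flip_prob)

print(bsc_capacity(0.11))   # ≈ 0.5 bit per channel use
print(bsc_capacity(0.5))    # 0.0: the channel conveys no information
```

This single number is the fundamental limit Shannon's theory provides: no code can communicate reliably above it, which is exactly the sense in which mutual information "sets fundamental limits" for inference problems.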
This thesis deals with two different, yet intimately related, inference problems in the fields of coding theory and machine learning. We use Bayesian probabilistic formulations for both problems, and we analyse them in the asymptotic high-dimensional regime. The goal of our analysis is, on the one hand, to assess the algorithmic performance and, on the other hand, to predict the Bayes-optimal performance, using an information theoretic approach. To this end, we employ powerful analytical tools from statistical physics.
The first problem concerns a recent family of forward-error-correction codes called sparse superposition codes. We consider the extension of these codes to a large class of noisy channels by exploiting their similarity with the compressed sensing paradigm. Moreover, we show that sparse superposition codes are amenable to performing joint distribution matching and channel coding.
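In a sparse superposition code, the message selects one nonzero entry per section of a sparse vector x, and the codeword is the linear mix Ax with a Gaussian design matrix A, which is what makes the compressed-sensing connection natural. The following toy sketch of that standard construction uses illustrative dimensions and names:

```python
import numpy as np

rng = np.random.default_rng(1)

def sc_encode(message, L, B, A):
    """Sparse superposition encoding: the message picks one nonzero
    entry per section of a length L*B sparse vector x; the codeword
    is the linear mix A @ x, where A is an n x (L*B) Gaussian matrix."""
    x = np.zeros(L * B)
    for l, symbol in enumerate(message):    # symbol in {0, ..., B-1}
        x[l * B + symbol] = 1.0
    return A @ x, x

L, B, n = 4, 8, 16                          # 4 sections of size 8, 16 channel uses
A = rng.normal(0, 1 / np.sqrt(n), (n, L * B))
codeword, x = sc_encode([3, 0, 7, 5], L, B, A)
# x has exactly one nonzero per section; rate = L*log2(B)/n bits per use
```

Decoding amounts to recovering the section-sparse x from a noisy version of Ax, i.e. a structured compressed sensing problem.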
In the second problem, we study symmetric rank-one matrix factorization, a prominent model in machine learning and statistics with many applications ranging from community detection to sparse principal component analysis. We provide an explicit expression for the normalized mutual information and the minimum mean-square error of this model in the asymptotic limit. This allows us to prove the optimality of a certain iterative algorithm for a large set of parameters.
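The symmetric rank-one model is often written as a "spiked Wigner" matrix: a rank-one signal sqrt(snr/n)·xxᵀ buried in symmetric Gaussian noise. The snippet below is a toy illustration with assumed parameters; it generates such a matrix and estimates the spike with the top eigenvector, a simple spectral baseline (not the thesis's message-passing algorithm):

```python
import numpy as np

rng = np.random.default_rng(2)

def spiked_wigner(x, snr):
    """Symmetric rank-one spike plus Gaussian noise:
    Y = sqrt(snr/n) * x x^T + Z, with Z symmetric, entries of variance 1/n."""
    n = len(x)
    Z = rng.normal(size=(n, n))
    Z = (Z + Z.T) / np.sqrt(2 * n)
    return np.sqrt(snr / n) * np.outer(x, x) + Z

n, snr = 500, 4.0
x = rng.choice([-1.0, 1.0], size=n)         # Rademacher spike
Y = spiked_wigner(x, snr)
vals, vecs = np.linalg.eigh(Y)
v = vecs[:, -1]                              # top eigenvector as an estimator
overlap = abs(v @ x) / np.sqrt(n)
# above the spectral threshold (snr > 1), the overlap stays bounded away from 0
```

Quantities like the squared overlap are what the normalized mutual information and minimum mean-square error characterize exactly in the asymptotic limit.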
A common feature of the two problems is that both are represented by dense graphical models. Hence, similar message-passing algorithms and analysis tools can be adopted. Furthermore, spatial coupling, a technique introduced in the context of low-density parity-check (LDPC) codes, can be applied to both problems. Spatial coupling is used in this thesis as a "construction technique" to boost the algorithmic performance and as a "proof technique" to compute certain information theoretic quantities.
Moreover, both of our problems have close connections with spin glass models studied in the statistical mechanics of disordered systems, which allows us to use sophisticated techniques developed in statistical physics. In this thesis, we use the potential function predicted by the replica method to prove the threshold-saturation phenomenon associated with spatially coupled models. One of the main contributions of this thesis is proving that the predictions given by the "heuristic" replica method are exact. Hence, our results may also be of great interest to the statistical physics community, as they help set a rigorous mathematical foundation for the replica predictions.