This dissertation introduces traffic forecasting methods for different network configurations and levels of data availability.

Chapter 2 focuses on the single-freeway case. Although its topology is simple, the non-linearity of traffic features still makes prediction a challenging task. We propose a dynamic linear model (DLM) to approximate the non-linear traffic features. Unlike a static linear regression model, the DLM assumes that its parameters change over time; we design the DLM with time-dependent model parameters to describe the spatiotemporal characteristics of time-series traffic data. Based on the DLM, with model parameters trained analytically from historical data, we derive the optimal linear predictor in the minimum mean square error (MMSE) sense. We compare our prediction accuracy against other baselines by estimating expected travel times from the traffic predictions for freeways in California (I210-E and I5-S) under highly congested traffic conditions, and we show significant improvements in accuracy, especially for short-term prediction.

Chapter 3 generalizes the DLM to extensive freeway networks with more complex topologies. If the DLM were applied directly to a large-scale network, most resources would be consumed estimating unnecessary spatiotemporal correlations. Defining features on graphs alleviates this issue by cutting unnecessary connections in advance, based on predefined topology information. Exploiting graph signal processing, we represent traffic dynamics over freeway networks using multiple graph heat diffusion kernels and integrate the kernels into the DLM with Bayes' rule. We optimize the model parameters using Bayesian inference to minimize the prediction errors. The proposed model achieves prediction accuracy comparable to state-of-the-art deep neural networks with lower computational effort.
It notably achieves excellent performance for long-term prediction by inheriting the periodicity modeling of the DLM.

Chapter 4 proposes a deep neural network model to predict traffic features on large-scale freeway networks. Deep learning methods have recently come to dominate traffic forecasting for freeway networks because they excel at learning highly complex correlations between variables in both time and space, where linear models may be limited. Adopting a graph convolutional network (GCN) has become the standard way to extract spatial correlations, and most works achieve high prediction accuracy by embedding one in their architecture. However, a conventional GCN has a small receptive field, i.e., it barely refers to the traffic features of remote sensors, which results in inaccurate long-term prediction. We suggest a forecasting model, the two-level resolution deep neural network (TwoResNet), that overcomes this limitation. It consists of two resolution blocks: the low-resolution block predicts traffic on a macroscopic scale, such as regional traffic changes, while the high-resolution block predicts traffic on a microscopic scale, using a GCN to extract spatial correlations and referring to the regional changes produced by the low-resolution block. This design allows the GCN to incorporate traffic features from remote sensors. As a result, TwoResNet achieves competitive prediction accuracy compared to state-of-the-art methods, with especially strong performance for long-term predictions.
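The graph heat diffusion idea behind the Chapter 3 model can be illustrated with a minimal sketch. This is not the dissertation's code: the 4-sensor line graph, the diffusion scale tau, and the speed values are all made up for illustration. The kernel exp(-tau * L) spreads a signal (e.g., a congestion dip in speed) along graph edges, which is the building block the abstract describes combining with the DLM.

```python
# Sketch (illustrative only): a graph heat diffusion kernel exp(-tau * L)
# applied to a traffic signal on a small, hypothetical freeway graph.
import numpy as np
from scipy.linalg import expm

# Hypothetical 4-sensor line graph: consecutive freeway sensors are connected.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))
L = D - A                      # combinatorial graph Laplacian

tau = 0.5                      # diffusion scale (an assumed value)
K = expm(-tau * L)             # heat diffusion kernel on the graph

x = np.array([60.0, 20.0, 55.0, 58.0])   # speeds; sensor 1 is congested
x_diffused = K @ x             # congestion spreads to neighbouring sensors

# Because L's rows sum to zero, the kernel preserves the signal total,
# and diffusion smooths the signal (variance can only decrease).
print(x_diffused)
```

In the Chapter 3 setting, several such kernels with different scales would serve as spatial basis operators whose weights are learned via Bayesian inference; here only the single-kernel smoothing effect is shown.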
black-boxes". The Law of Parsimony states that "simpler solutions are more likely to be correct than complex ones". Since neural networks perform quite well in practice, a natural question to ask, then, is: in what way are they simple?
We propose that compression is the answer. Since good generalization requires invariance to irrelevant variations in the input, it is necessary for a network to discard this irrelevant information. As a result, semantically similar samples are mapped to similar representations in neural network deep feature space, where they form simple, low-dimensional structures.
Conversely, a network that overfits relies on memorizing individual samples. Such a network cannot discard information as easily.
In this thesis we characterize the difference between such networks using the non-negative rank of their activation matrices. Relying on the non-negativity of rectified-linear unit activations, the non-negative rank of a matrix is the smallest inner dimension that admits an exact non-negative matrix factorization.
We derive an upper bound on the amount of memorization in terms of the non-negative rank, and show that it is a natural complexity measure for rectified-linear units.
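As a concrete illustration of the factorization involved (a sketch with toy data, not the thesis's experiments): a non-negative activation matrix A is decomposed as A ≈ W H with W, H ≥ 0, and the chosen inner dimension r plays the role that the non-negative rank bounds from below. The matrix sizes, the rank-2 construction, and the Lee–Seung multiplicative-update solver are all assumptions made for this example.

```python
# Sketch with toy data: non-negative factorization of a (ReLU-like)
# activation matrix via Lee-Seung multiplicative updates.
import numpy as np

rng = np.random.default_rng(0)
# Toy "activation matrix": 6 samples x 8 units, built from two non-negative
# factors, so its non-negative rank is at most 2.
W_true = rng.random((6, 2))
H_true = rng.random((2, 8))
A = W_true @ H_true            # non-negative by construction

r = 2                          # inner dimension of the factorization
W = rng.random((6, r)) + 1e-3
H = rng.random((r, 8)) + 1e-3
for _ in range(500):           # multiplicative updates preserve non-negativity
    H *= (W.T @ A) / (W.T @ W @ H + 1e-12)
    W *= (A @ H.T) / (W @ H @ H.T + 1e-12)

err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print(f"relative reconstruction error: {err:.2e}")
```

With r equal to the (non-negative) rank of A, the reconstruction error can be driven near zero; with r below it, an exact factorization is impossible, which is what makes the non-negative rank a meaningful complexity measure.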
With a focus on deep convolutional neural networks trained to perform object recognition, we show that the two non-negative factors derived from deep network layers decompose the information held therein in an interpretable way. The first of these factors provides heatmaps which highlight similarly encoded regions within an input image or image set. We find that these networks learn to detect semantic parts and form a hierarchy, such that parts are further broken down into sub-parts.
We quantitatively evaluate the semantic quality of these heatmaps by using them to perform semantic co-segmentation and co-localization. Even though the convolutional network we use is trained solely with image-level labels, we achieve results comparable to or better than those of domain-specific state-of-the-art methods on these tasks.
The second non-negative factor provides a bag-of-concepts representation for an image or image set. We use this representation to derive global image descriptors for images in a large collection. With these descriptors in hand, we perform two variations of content-based image retrieval, i.e., reverse image search. Using information from one of the non-negative factors, we obtain descriptors suitable for finding semantically related images, i.e., images belonging to the same semantic category as the query image. Combining information from both non-negative factors, however, yields descriptors suitable for finding other images of the specific instance depicted in the query image, where we again achieve state-of-the-art performance.
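The retrieval step can be sketched as follows. This is a toy illustration, not the thesis pipeline: the "concept activation" descriptors and their values are invented, and cosine similarity is used as a generic ranking function for descriptor-based retrieval.

```python
# Sketch (toy numbers): bag-of-concepts descriptors per image, ranked by
# cosine similarity to a query descriptor for content-based retrieval.
import numpy as np

# Hypothetical concept-activation descriptors for 4 database images
# (rows = images, columns = "concepts" from a factorization).
db = np.array([[0.9, 0.1, 0.0],
               [0.8, 0.2, 0.1],
               [0.0, 0.1, 0.9],
               [0.1, 0.9, 0.2]])
query = np.array([0.85, 0.15, 0.05])

def cosine(u, v):
    """Cosine similarity between two descriptor vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

scores = np.array([cosine(query, d) for d in db])
ranking = np.argsort(-scores)   # indices of database images, most similar first
print(ranking)                  # -> [0 1 3 2]
```

Image 0, whose concept profile most closely matches the query, ranks first, while image 2, dominated by an unrelated concept, ranks last; with real factor-derived descriptors this is what maps semantically similar images to nearby points.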