Publication: Graph Signal Processing

Abstract

Over the past few decades we have been experiencing an explosion of information generated by large networks of sensors and other data sources. Much of this data is intrinsically structured, such as traffic evolution in a transportation network, temperature values in different geographical locations, information diffusion in social networks, functional activities in the brain, or 3D meshes in computer graphics. The representation, analysis, and compression of such data is a challenging task and requires the development of new tools that can identify and properly exploit the data structure. In this thesis, we formulate the processing and analysis of structured data using the emerging framework of graph signal processing. Graphs are generic data representation forms, suitable for modeling the geometric structure of signals that live on topologically complicated domains. The vertices of the graph represent the discrete data domain, and the edge weights capture the pairwise relationships between the vertices. A graph signal is then defined as a function that assigns a real value to each vertex. Graph signal processing is a useful framework for efficiently handling such data, as it takes into consideration both the signal and the graph structure. In this work, we develop new methods and study several important problems related to the representation and structure-aware processing of graph signals in both centralized and distributed settings. We focus in particular on the theory of sparse graph signal representation and its applications, and we bring some insights toward a better understanding of the interplay between graphs and signals on graphs. First, we study a novel yet natural application of the graph signal processing framework for the representation of 3D point cloud sequences. We exploit graph-based transform signal representations for addressing the challenging problem of compression of data that is characterized by dynamic 3D positions and color attributes.
Next, we depart from graph-based transform signal representations to design new overcomplete representations, or dictionaries, which are adapted to specific classes of graph signals. In particular, we address the problem of sparse representation of graph signals residing on weighted graphs by learning graph structured dictionaries that incorporate the intrinsic geometric structure of the irregular data domain and are adapted to the characteristics of the signals. Then, we move to the efficient processing of graph signals in distributed scenarios, such as sensor or camera networks, which brings important constraints in terms of communication and computation in realistic settings. In particular, we study the effect of quantization in the distributed processing of graph signals that are represented by graph spectral dictionaries and we show that the impact of the quantization depends on the graph geometry and on the structure of the spectral dictionaries. Finally, we focus on a widely used graph process, the problem of distributed average consensus in a sensor network where sensors exchange quantized information with their neighbors. We propose a novel quantization scheme that depends on the graph topology and exploits the increasing correlation between the values exchanged by the sensors throughout the iterations of the consensus algorithm.
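
The distributed average consensus setting mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes a plain uniform quantizer and a fixed step size on a small ring graph; it is not the topology-dependent quantization scheme proposed in the thesis, and the names `quantize`, `quantized_consensus`, `step`, and `epsilon` are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer (an assumption): round to the nearest multiple of `step`."""
    return step * round(value / step)


def quantized_consensus(neighbors, values, step=0.1, epsilon=0.25, iterations=200):
    """Run average consensus in which nodes only exchange quantized values.

    `neighbors` maps each node index to the list of its neighbors in an
    undirected graph. `epsilon` times the largest degree should not exceed 1.
    """
    x = list(values)
    for _ in range(iterations):
        # Each node transmits a quantized copy of its current value.
        q = [quantize(v, step) for v in x]
        # Each node moves toward the quantized values of its neighbors.
        # Differences cancel pairwise over undirected edges, so the
        # network-wide average is preserved at every iteration.
        x = [
            x[i] + epsilon * sum(q[j] - q[i] for j in neighbors[i])
            for i in range(len(x))
        ]
    return x


# Ring of five sensors, each connected to its two adjacent sensors.
neighbors = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
values = [1.0, 4.0, 2.5, 0.0, 3.5]
result = quantized_consensus(neighbors, values)
```

In this sketch the node values contract toward the initial average, while the quantization step limits how closely they can agree.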

Related concepts (45)

Data structure

In computer science, a data structure is a way of organizing data so that it can be processed more easily. A data structure is a concrete implementation of an abstract type.
Obj

Quantization (physics)

In physics, quantization is a procedure for constructing a quantum theory of a field from a classical theory of that field. One sometimes speaks of second quantization for…

Information

Information is a concept of the discipline of information and communication sciences (SIC). In the etymological sense, "informatio…

Related publications (108)

A sensor is a device that detects or measures a physical property and records, indicates, or otherwise responds to it. In other words, a sensor allows us to interact with the surrounding environment by measuring a given phenomenon qualitatively or quantitatively. Biological evolution provided every living entity with a set of sensors to help it survive daily challenges. In addition to the biological sensors, humans developed and designed “artificial” sensors with the aim of improving our capacity to sense the real world. Today, thanks to technological developments, sensors are ubiquitous and thus we measure an exponentially growing amount of data. Here is the challenge: how do we process and use this data? Nowadays, it is common to design real-world sensing architectures that use the measured data to estimate certain parameters of the measured physical field. Problems of this type are known in mathematics as inverse problems, and finding their solution is challenging. In fact, we estimate a set of parameters of a physical field with possibly infinite degrees of freedom from only a few measurements that are most likely corrupted by noise. Therefore, we would like to design algorithms to solve the given inverse problem, while ensuring the existence of the solution, its uniqueness and its robustness to the measurement noise. In this thesis, we tackle different inverse problems, all inspired by real-world applications. First, we propose a new regularization technique for linear inverse problems based on the sensor placement optimization of the sensor network collecting the data. We propose FrameSense, a greedy algorithm inspired by frame theory that finds a near-optimal sensor placement with respect to the reconstruction error of the inverse problem solution in polynomial time. We substantiate our theoretical findings with numerical simulations showing that our method improves the state of the art.
In particular, we show significant improvements on two real-world applications: the thermal monitoring of many-core processors and the adaptive sampling scheduling of environmental sensor networks. Second, we introduce the dual of the sensor placement problem, namely the source placement problem. In this case, instead of regularizing the inverse problem, we enable a precise control of the physical field by means of a forward problem. For this problem, we propose a near-optimal algorithm for the noiseless case, that is, when we know exactly the current state of the physical field. Third, we consider a family of physical phenomena that can be modeled by means of graphs, where the nodes represent a set of entities and the edges model the transmission delay of information between the entities. Examples of such phenomena are the spreading of a virus within the population of a given region or the spreading of a rumor on a social network. In this scenario, we identify two new key problems: source placement and vaccination. For the former, we would like to find a set of sources such that the spreading of the information over the network is as fast as possible. For the latter, we look for an optimal set of nodes to be “vaccinated” such that the spreading of the virus is the slowest. For both problems, we propose greedy algorithms directly optimizing the average time of infection of the network. Such algorithms outperform the current state of the art and we evaluate their performance with a set of experiments on synthetic datasets. Then, we discuss three distinct inverse problems for physical fields characterized by diffusive phenomena, such as the temperature of solid bodies or the dispersion of pollution in the atmosphere. We first study the uniform sampling and reconstruction of diffusion fields and we show that we can exploit the kernel of the field to control and bound the aliasing error.
Second, we study the source estimation of a diffusive field given a set of spatio-temporal measurements of the field and under the assumption that the sources can be modeled as a set of Dirac deltas. For this estimation problem, we propose an algorithm that exploits the eigenfunction representation of the diffusion field and we show that this algorithm recovers the sources precisely. Third, we propose an algorithm for the estimation of time-varying emissions of smokestacks from the data collected in the surrounding environment by a sensor network, under the assumption that the emission rates can be modeled as signals lying on low-dimensional subspaces or with a finite rate of innovation. Last, we analyze a classic non-linear inverse problem, namely sparse phase retrieval. In such a problem, we would like to estimate a signal from just the magnitude of its Fourier transform. Phase retrieval is of interest for many scientific applications, such as X-ray crystallography and astronomy. We assume that the signal of interest is spatially sparse, as is the case in many applications, and we model it as a linear combination of Dirac deltas. We derive sufficient conditions for the uniqueness of the solution based on the support of the autocorrelation function of the measured sparse signal. Finally, we propose a reconstruction algorithm for sparse phase retrieval that takes advantage of the sparsity of the signal of interest.
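
To see why recovering a signal from the magnitude of its Fourier transform is ambiguous, the following minimal sketch may help. It uses a hypothetical sparse signal and a naive discrete Fourier transform (the helper `dft_magnitudes` is an illustrative name, not part of the thesis), and checks that circularly shifting or reversing a real signal leaves the magnitudes unchanged.

```python
import cmath


def dft_magnitudes(signal):
    """Return |X[k]| for the discrete Fourier transform of `signal` (naive O(n^2))."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


# Hypothetical sparse signal: two spikes on a grid of eight samples.
signal = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

# A circular shift by three samples and a reversal of the same signal.
shifted = signal[-3:] + signal[:-3]
reversed_signal = signal[::-1]

reference = dft_magnitudes(signal)
shift_gap = max(abs(a - b) for a, b in zip(reference, dft_magnitudes(shifted)))
reverse_gap = max(abs(a - b) for a, b in zip(reference, dft_magnitudes(reversed_signal)))
```

Both gaps are zero up to floating-point error, so the three distinct signals cannot be told apart from magnitudes alone. This is one reason uniqueness conditions can only be stated up to such trivial ambiguities.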

We live in a world characterized by massive information transfer and real-time communication. The demand for efficient yet low-complexity algorithms is widespread across different fields, including machine learning, signal processing and communications. Most of the problems that we encounter across these disciplines involve a large number of modules interacting with each other. It is therefore natural to represent these interactions and the flow of information between the modules in terms of a graph. This leads to the study of a graph-based information processing framework. This framework can be used to gain insight into the development of algorithms for a diverse set of applications. We investigate the behaviour of large-scale networks (ranging from wireless sensor networks to social networks) as a function of underlying parameters. In particular, we study the scaling laws and applications of graph-based information processing in sensor networks/arrays, sparsity pattern recovery and interactive content search. In the first part of this thesis, we explore location estimation from incomplete information, a problem that arises often in wireless sensor networks and ultrasound tomography devices. In such applications, the data gathered by the sensors is only useful if we can pinpoint their positions with reasonable accuracy. This problem is particularly challenging when we need to infer the positions based on basic information/interaction such as proximity or incomplete (and often noisy) pairwise distances. As the sensors deployed in a sensor network are often of low quality and unreliable, we need to devise a mechanism to single out those that do not work properly. In the second part, we frame the network tomography problem as a well-studied inverse problem in statistics, called group testing. Group testing involves detecting a small set of defective items in a large population by grouping a subset of items into different pools.
The result of each pool is a binary output depending on whether the pool contains a defective item or not. Motivated by the network tomography application, we consider the general framework of group testing with graph constraints. As opposed to conventional group testing where any subset of items can be grouped, here a test is admissible if it induces a connected subgraph. Given this constraint, we are interested in bounding the number of pools required to identify the defective items. Once the positions of sensors are known and the defective sensors are identified, we investigate another important feature of networks, namely, navigability or how fast nodes can deliver a message from one end to another by means of local operations. In the final part, we consider navigating through a database of objects utilizing comparisons. Contrary to traditional databases, users do not submit queries that are subsequently matched to objects. Instead, at each step, the database presents two objects to the user, who then selects among the pair the object closest to the target that she has in mind. This process continues until, based on the user’s answers, the database can identify the target she has in mind. The search through comparisons amounts to determining which pairs should be presented to the user in order to find the target object as quickly as possible. Interestingly, this problem has a natural connection with the navigability property studied in the second part, which enables us to develop efficient algorithms.
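
The group testing setup with graph constraints described above can be sketched as follows. The example assumes a small path graph and uses a simple elimination decoder (any item appearing in a negative pool is declared non-defective), which is a standard textbook strategy and not necessarily the one analyzed in the thesis. All function names are hypothetical.

```python
from collections import deque


def is_admissible(adjacency, pool):
    """A pool is admissible if its items induce a connected subgraph."""
    members = set(pool)
    if not members:
        return False
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor in members and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == members


def pool_outcome(pool, defective):
    """Binary test result: positive if the pool contains any defective item."""
    return any(item in defective for item in pool)


def eliminate(items, pools, outcomes):
    """Keep only the items that never appear in a negative pool."""
    candidates = set(items)
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates -= set(pool)
    return candidates


# Path graph 0 - 1 - 2 - 3 - 4 - 5 with a single defective item.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
defective = {3}
pools = [[0, 1, 2], [2, 3], [3, 4], [4, 5]]
outcomes = [pool_outcome(pool, defective) for pool in pools]
candidates = eliminate(adjacency.keys(), pools, outcomes)
```

Here every pool is a connected segment of the path, so all four tests are admissible, whereas a pool such as `[0, 2]` would be rejected by `is_admissible`.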

This thesis is devoted to information-theoretic aspects of community detection. The importance of community detection is due to the massive amount of scientific data today that describes relationships between items from a network, e.g., a social network. Items from this network can be inherently partitioned into a known number of communities, but the partition can only be inferred from the data.
To estimate the underlying partition, data scientists can apply any number of advanced statistical techniques, but the data could be very noisy, or the amount of data could be inadequate. A fundamental question here is about the possibility of weak recovery: does the data contain a sufficient amount of information that enables us to produce a non-trivial estimate of the partition?
For the purpose of mathematical analysis, the above problem can be formulated as Bayesian inference on generative models. These models, including the stochastic block model (SBM) and censored block model (CBM), consider a random graph generated based on a hidden partition that divides the nodes in the graph into labelled groups. In the SBM, nodes are connected with a probability depending on the labels of the endpoints. In the CBM, by contrast, hidden variables are measured through a noisy channel, and the measurement outcomes form a weighted graph. In both models, inference is the task of recovering the hidden partition from the observed graph. The criteria for weak recovery can be studied via an information-theoretic quantity called mutual information. Once the asymptotic mutual information is computed, phase transitions for the weak recovery can be located.
This thesis pertains to rigorous derivations of single-letter variational expressions for the asymptotic mutual information for models in community detection. These variational expressions, known as the replica predictions, come from heuristic methods of statistical physics. We present our development of new rigorous methods for confirming the replica predictions. These methods are based on extending the recently introduced adaptive interpolation method.
We prove the replica prediction for the SBM in the dense-graph regime with two groups of asymmetric size. The existing proofs in the literature are indirect, as they involve mapping the model to an external problem whose mutual information is determined by a combination of methods. Here, on the contrary, we provide a self-contained and direct proof.
Next, we extend this method to sparse models. Before this thesis, adaptive interpolation was known for providing a conceptually simple proof for replica predictions for dense graphs. For a sparse graph, however, the replica prediction involves a more complicated variational expression, and rigorous confirmations are often lacking or obtained by rather complicated methods. Therefore, we focus on a simple version of the CBM on sparse graphs, where hidden variables are measured through a binary erasure channel, for which we fully prove the replica prediction using adaptive interpolation.
The key to extending adaptive interpolation to a broader class of sparse models is a concentration result for the so-called "multi-overlaps". This concentration forms the basis of the replica "symmetric" prediction. We prove this concentration result for a related sparse model in the context of physics. This provides inspiration for further development of adaptive interpolation.
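
As a concrete illustration of the stochastic block model mentioned in this abstract, here is a minimal sampling sketch. It assumes the simplest variant with only two connection probabilities, `p_in` for endpoints with the same label and `p_out` otherwise. These parameter names and the function `sample_sbm` are hypothetical, and the thesis itself concerns the information-theoretic analysis rather than sampling.

```python
import random


def sample_sbm(labels, p_in, p_out, seed=0):
    """Sample an undirected graph from a simple stochastic block model.

    Each pair of nodes is connected independently, with probability `p_in`
    when the two hidden labels agree and `p_out` when they differ.
    Returns the set of edges as pairs (i, j) with i < j.
    """
    rng = random.Random(seed)
    edges = set()
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            probability = p_in if labels[i] == labels[j] else p_out
            if rng.random() < probability:
                edges.add((i, j))
    return edges


# Hidden partition into two groups of unequal size.
labels = [0, 0, 0, 0, 1, 1]

# A noisy instance, where recovering the labels from the edges is non-trivial.
noisy_edges = sample_sbm(labels, p_in=0.7, p_out=0.2, seed=1)

# An extreme instance: the groups appear as two disconnected cliques.
clean_edges = sample_sbm(labels, p_in=1.0, p_out=0.0)
```

Inference runs in the opposite direction: given only the edges, estimate `labels`. The closer `p_in` and `p_out` are, the less information the graph carries about the partition.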