
Unit: Laboratory of Information Dynamics 2

Laboratory

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.


Related publications (79)


Related people (38)

Related research domains (86)

Wireless sensor network

Wireless sensor networks (WSNs) refer to networks of spatially dispersed and dedicated sensors that monitor and record the physical conditions of the environment and forward the collected data to a central location.

Wireless network

A wireless network is a computer network that uses wireless data connections between network nodes. Wireless networking allows homes, telecommunications networks and business installations to avoid the costly process of introducing cables into a building.

Algorithm

In mathematics and computer science, an algorithm is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation.

Units doing similar research (101)

Since the birth of Information Theory, researchers have defined and exploited various information measures, as well as endowed them with operational meanings. Some were born as a "solution to a problem", like Shannon's Entropy and Mutual Information. Others were the fruit of generalisation and the mathematical genius of bright minds like Rényi, Csiszár and Sibson. These powerful objects allow us to manipulate probabilities intuitively and always seem to be connected, in some way, to concrete settings in communication, coding or estimation theory. A common theme is: take a problem in one of these areas, try to control (upper- or lower-bound) the expected value of some function of interest (often, a probability of error) and, with enough work, an information measure appears as a fundamental limit of the problem.

The most striking example of this is in Shannon's seminal paper of 1948: his purpose was to characterise the smallest possible expected length of a uniquely decodable encoding that compresses the realisations of a random variable. As he brilliantly proved, the smallest expected length one can hope for is the Entropy of the random variable. In establishing this connection, another quantity needed to be implicitly controlled: the Kraft sum of the code. Seemingly unrelated before, these three objects joined forces in harmony to provide a beautiful and fundamental result. But why are they related? The answer seems to be: duality.

Duality is an abstract notion commonly used in linear algebra and functional analysis. It has been expanded and generalised over the years, and several incarnations have been discovered throughout mathematics. One particular instance involves vector spaces: given two vector spaces and a "duality pairing", one can jump from one space to the other (its dual) through Legendre-Fenchel-like transforms.
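A canonical example of such a Legendre-Fenchel pair (a classical result, stated here only for context) is the Donsker-Varadhan/Gibbs variational formula, in which the relative entropy and the cumulant-generating functional are duals of one another under the expectation pairing:

```latex
% P, Q probability measures on X; f a bounded measurable function on X.
D(P\,\|\,Q) \;=\; \sup_{f}\;\Bigl\{\, \mathbb{E}_P[f] \;-\; \log \mathbb{E}_Q\bigl[e^{f}\bigr] \,\Bigr\},
\qquad
\log \mathbb{E}_Q\bigl[e^{f}\bigr] \;=\; \sup_{P}\;\Bigl\{\, \mathbb{E}_P[f] \;-\; D(P\,\|\,Q) \,\Bigr\}.
```

Here the functional on measures, the functional on functions and the pairing play exactly the roles a), b) and c) of the scheme described in the text.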
In the most common settings in Information Theory, the two spaces and the pairing are, respectively: 1) the space of (probability) measures defined on X; 2) the space of bounded functions defined on X; 3) the Lebesgue integral of the function (the expected value of the function, if the measure is a probability measure). Once these are set, Legendre-Fenchel-like transforms allow us to connect a) a functional acting on the space described in 1) and b) a functional acting on the space described in 2), with the anchor point being c) the (expected) value described in 3). These three pieces (a), b) and c)) represent the actors of many of the results provided in Information Theory. Once they are found, one usually bounds the functional described in b) and obtains a bound connecting the expected value and the functional of measures (e.g., an information measure).

Going back to Shannon's result: fix a random variable (and thus a probability measure) and select the function to be the length of a code. Then the functional a) is the Shannon Entropy of the source; the functional b) is the Kraft sum of the code; the pairing c) is the expected length of the code.

We explore this connection and this pattern throughout the thesis. We will see how it can be found in notable results like coding theorems for one-to-one codes, Campbell's Coding Theorem, Arikan's Guessing Theorem, Fano-like and transportation-cost inequalities, and so on. Moreover, unearthing the pattern allows us to generalise it to other information measures and apply the technique in a variety of fields, including Learning Theory, Estimation Theory and Hypothesis Testing.
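The Shannon example can be checked numerically. The sketch below (a minimal illustration, not taken from the thesis; the distribution is an arbitrary dyadic example) builds a Huffman code for a small source and verifies that the Kraft sum is at most one and that the expected length is squeezed between the entropy and the entropy plus one:

```python
import heapq
from math import log2

def huffman_code_lengths(probs):
    """Build a Huffman code and return the codeword length of each symbol."""
    # Each heap entry: (probability, tiebreaker, list of (symbol, depth)).
    heap = [(p, i, [(s, 0)]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, g1 = heapq.heappop(heap)
        p2, _, g2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf one level deeper.
        merged = [(s, d + 1) for s, d in g1 + g2]
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return dict(heap[0][2])

probs = {'a': 0.5, 'b': 0.25, 'c': 0.125, 'd': 0.125}
lengths = huffman_code_lengths(probs)

H = -sum(p * log2(p) for p in probs.values())   # Shannon Entropy of the source
L = sum(probs[s] * lengths[s] for s in probs)   # expected code length (the pairing)
K = sum(2.0 ** -lengths[s] for s in probs)      # Kraft sum of the code

assert K <= 1.0 and H <= L < H + 1
print(H, L, K)  # → 1.75 1.75 1.0 (dyadic probabilities: the bound is tight)
```

For dyadic probabilities the optimal length of a symbol with probability p is exactly -log2(p), so the expected length coincides with the Entropy and the Kraft sum equals one.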

Understanding epidemic propagation in large networks is an important but challenging task, especially since we usually lack information, and the information that we have is often counter-intuitive. An illustrative example is the dependence of the final size of the epidemic on the location of the initially infected agents (sources): common sense dictates that the most dangerous location for the sources is the largest city, but the second chapter of the thesis shows that this holds true only if the epidemic is just above the infection threshold.

Identifying the initially infected agents can help us better understand the epidemic. The focus of this thesis is on identifying the very first infected agent, also called the source or patient zero. Under the standard assumptions, a few agents reveal their symptom onset times, and our goal is to identify the source based on this information, together with full knowledge of the underlying network. Unfortunately, even if we can choose the set of agents that are queried about their symptom onset times, the number of queries required for reliable source identification is too large for practical applications. In this thesis, we carefully assess whether this issue can be mitigated by introducing adaptivity into the standard assumptions. Our main goal is to study the reduction in query complexity when the queries can be chosen adaptively based on previous answers, but we also investigate whether adaptively querying the edges can relax the full-knowledge assumption on the network.

Providing rigorous proofs about source identification with time queries is difficult. A notable exception is when the infection is passed with a known, deterministic delay from each agent to all of its neighbors, in which case the numbers of required non-adaptive and adaptive queries are equivalent to well-known notions in combinatorics: the metric dimension (MD) and the sequential metric dimension (SMD), respectively.
We extend previous results in the field by computing the MD of a large class of random trees, where adaptivity can significantly reduce the query complexity, and the SMD of Erdős–Rényi random networks, where the reduction is found to be small, at most a constant factor. We address the case of non-deterministic diffusion processes for the first time in the mathematical literature: on the path graph, we observe a striking, doubly logarithmic decrease in adaptive query complexity compared to the non-adaptive case.

Our analysis of the robustness of the MD to adding a single edge to specially constructed and d-dimensional grid networks suggests that even small changes in the network could easily derail source identification algorithms. This is concerning, since it is difficult to obtain a perfect dataset about the entire contact network in practice. Inspired by recent implementations of contact tracing, we propose new source identification assumptions, under which not only the symptom onset times but also the edges of the network are queried by the algorithm, resulting in less, but potentially higher-quality, information. We propose two local search algorithms, tailored to the new assumptions, that outperform state-of-the-art identification algorithms, and we analytically approximate their success probabilities on realistic random graph models. The adaptive assumptions enable us to evaluate our algorithms on a COVID-19 epidemic simulator: the first time that source identification algorithms are tested on such a complex dataset.
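In the deterministic-delay setting mentioned above, time queries pin down the source by simple distance arithmetic: a candidate source is consistent only if all queried onset times differ from the graph distances by the same (unknown) epidemic start time. The sketch below is a toy illustration of that idea, with a made-up eight-node contact graph and sensor set that are not from the thesis:

```python
from collections import deque

# Hypothetical contact network as an adjacency list.
graph = {
    0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4, 6],
    4: [3, 5], 5: [4], 6: [3, 7], 7: [6],
}

def bfs_distances(g, start):
    """Graph distances from `start` to every node, by breadth-first search."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def identify_source(g, onset_times):
    """Return all nodes whose distances to the queried sensors match the
    observed onset times up to a single common start-time shift."""
    candidates = []
    for s in g:
        d = bfs_distances(g, s)
        shifts = {onset_times[q] - d[q] for q in onset_times}
        if len(shifts) == 1:  # consistent with one epidemic start time
            candidates.append(s)
    return candidates

# Simulate a unit-delay spread from node 3, started at time 10;
# sensors 0, 5 and 7 report their symptom onset times.
true_source = 3
dist = bfs_distances(graph, true_source)
onset = {q: 10 + dist[q] for q in (0, 5, 7)}

print(identify_source(graph, onset))  # → [3]
```

Here three sensors happen to resolve the whole graph, which is exactly the resolving-set property behind the metric dimension; with fewer or badly placed sensors, several candidates would remain.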

Satvik Mehul Mashkaria, Gergely Odor, Patrick Thiran

The metric dimension (MD) of a graph is a combinatorial notion capturing the minimum number of landmark nodes needed to distinguish every pair of nodes in the graph based on graph distance. We study how much the MD can increase if we add a single edge to the graph. The extra edge can either be selected adversarially, in which case we are interested in the largest possible value that the MD can take, or uniformly at random, in which case we are interested in the distribution of the MD. The adversarial setting has already been studied by [Eroh et al., 2015] for general graphs, who found an example where the MD doubles on adding a single edge. By constructing a different example, we show that this increase can be as large as exponential. However, we believe that such a large increase can occur only in specially constructed graphs, and that in most interesting graph families, the MD at most doubles on adding a single edge. We prove this for $d$-dimensional grid graphs, by showing that $2d$ appropriately chosen corners and the endpoints of the extra edge can distinguish every pair of nodes, no matter where the edge is added. For the special case of $d=2$, we show that it suffices to choose the four corners as landmarks. Finally, when the extra edge is sampled uniformly at random, we conjecture that the MD of 2-dimensional grids converges in probability to $3+\mathrm{Ber}(8/27)$, and we give an almost complete proof.
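To make the definition concrete, the MD of a small graph can be computed by brute force: search for the smallest landmark set whose vectors of distances distinguish all nodes. The following sketch is an illustrative script on two standard toy graphs, not code from the paper:

```python
from collections import deque
from itertools import combinations

def bfs_distances(g, start):
    """Graph distances from `start` to every node, by breadth-first search."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def metric_dimension(g):
    """Smallest k such that some k-subset of nodes (landmarks) assigns every
    node a distinct vector of distances to the landmarks."""
    nodes = sorted(g)
    dist = {u: bfs_distances(g, u) for u in nodes}
    for k in range(1, len(nodes)):
        for landmarks in combinations(nodes, k):
            signatures = {tuple(dist[u][l] for l in landmarks) for u in nodes}
            if len(signatures) == len(nodes):  # all nodes distinguished
                return k
    return len(nodes)

# The path graph P5 has MD 1: one endpoint resolves every node.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
print(metric_dimension(path))   # → 1

# The 4-cycle needs two landmarks: opposite nodes are equidistant from any one.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(metric_dimension(cycle))  # → 2
```

The exhaustive search is exponential in the number of nodes, which is precisely why structural results such as the corner-based resolving sets for $d$-dimensional grids are valuable.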

2022