Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
We study the problem of identifying the source of a stochastic diffusion process spreading on a graph based on the arrival times of the diffusion at a few queried nodes. In a graph , an unknown source node is drawn uniformly at random, and unknown edge weights for , representing the propagation delays along the edges, are drawn independently from a Gaussian distribution of mean and variance . An algorithm then attempts to identify by querying nodes and being told the length of the shortest path between and in graph weighted by . We consider two settings: \emph{non-adaptive}, in which all query nodes must be decided in advance, and \emph{adaptive}, in which each query can depend on the results of the previous ones. Both settings are motivated by an application of the problem to epidemic processes (where the source is called patient zero), which we discuss in detail. We characterize the query complexity when is an -node path. In the non-adaptive setting, queries are needed for , and for . In the adaptive setting, somewhat surprisingly, only are needed when , and when . This is the first mathematical study of source identification with time queries in a non-deterministic diffusion process.
Volkan Cevher, Grigorios Chrysos, Efstratios Panteleimon Skoulakis