Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Networks are data structures that are fundamental for capturing and analyzing complex interactions between objects. While they have been used for decades to solve problems in virtually all scientific fields, their usage for data analysis in real-world practical applications deserves to be further investigated. In this thesis, we explore multiple aspects of network science and show how the design of new graph-based approaches offers an unprecedented depth for analyzing complex datasets. Through the study of practical applications, we demonstrate how to extract key findings in several domains such as digital humanities, recommender systems, social behavior, neuroscience or knowledge discovery. First, we propose to define in a concise manner the data science workflow. We present the tools, techniques, and questions that the practitioner needs to have in mind when addressing a new large-scale problem as they are of tremendous importance if one wants to apply network science concepts to real applications. Based on this foundation chapter, we begin by demonstrating the worth of networks for music recommendation with Genezik, our smart playlist application that adapts to user taste. Using signal processing, machine learning, and graphs, we show how to improve the performance of recommender systems as well as proposing a radically different user experience that has yet to be found in competing systems. We then move on to the introduction of the causal multilayer graph of activity, a novel graph method dedicated to the analysis of dynamical processes over networks. More than a data structure, we present a data analysis approach that tracks spreading or propagation of events through time in a scalable manner by efficiently combining a network with values associated with its vertices. Used in four different applications, the analysis of spatio-temporal patterns of activity extracted from the causal multilayer graph helps us better understand how rumors spread in social networks or how brain regions interact in resting states for instance. Finally, we study the browsing behavior of millions of people on Wikipedia and show how to extract contextual patterns of activity that reflect what is collectively remembered from past events. Based on their analysis, we confirm social studies on human behavior and conclude by revealing some of the rules that define human curiosity.
Enrico Amico, Benedetta Franceschiello
, ,
, ,