Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Applying classical time-series analysis techniques to online content is challenging, as web data tends to have data quality issues and is often incomplete, noisy, or poorly aligned. In this paper, we tackle the problem of predicting the evolution of a time series of user activity on the web in a manner that is both accurate and interpretable, using related time series to produce a more accurate prediction. We test our methods in the context of predicting signatures for online petitions using data from thousands of petitions posted on The Petition Site - one of the largest platforms of its kind. We observe that the success of these petitions is driven by a number of factors, including promotion through social media channels and on the front page of the petitions platform. We propose an interpretable model that incorporates seasonality, aging effects, self-excitation, and external effects. The interpretability of the model is important for understanding the elements that drives the activity of an online content. We show through an extensive empirical evaluation that our model is significantly better at predicting the outcome of a petition than state-of-the-art techniques.
Daniel Gatica-Perez, Haeeun Kim