Online rating systems are a standard tool used to share, discuss and sometimes sell products. Studying how users interact with one another inside these communities is fundamental to a better understanding of our society. Most of the literature analyzes only one community at a time. This thesis uses data from two beer-review websites, BeerAdvocate and RateBeer, and matches them to create a subset of products and users present in both communities. The newly matched datasets are then used to study different social effects happening inside these communities. The results show that users tend to follow the crowd's judgment, and that the first ratings influence this judgment. Natural Language Processing shows that the communities tend to build their own vocabulary. Users who rate beers on both websites tend to copy-paste the text of their reviews, yet they give different ratings in the two communities. Finally, a Random Forest model is used to predict reviews across the two websites. This collection of results demonstrates the power of using a subset of products and users present in two communities, and of analyses at the individual level. These results are used to give advice on understanding and building better online rating systems.
Robert West, Tiziano Piccardi, Giovanni Colavizza