Survey responses are a crucial input to research that seeks to understand public opinion. However, collecting survey data can be arduous, time-consuming, and expensive, with no assurance of an adequate response rate. In this paper, we propose a novel approach for predicting survey responses from quotations using machine learning. Our investigation focuses on favorability towards the United States, a topic of interest to many organizations and governments. We leverage a large corpus of quotations from individuals of different nationalities and from different time periods to extract their level of favorability. We combine natural language processing techniques and machine learning algorithms to construct a predictive model for survey responses. We investigate two scenarios: first, when no surveys have been conducted in a country, and second, when surveys have been conducted in a country but only in some years. Our experimental results demonstrate that the proposed approach can predict survey responses with high accuracy. Furthermore, we provide a detailed analysis of the features that contributed most to the model's performance. This study has the potential to impact survey research in the field of data science by substantially decreasing the cost and time required to conduct surveys while still providing accurate predictions of public opinion.
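
The two scenarios can be read as two different train/test splits over country-year records. The following is a minimal sketch of that reading, not the paper's implementation: the `Record` layout, the function names, and the idea of one feature vector per country and year are all assumptions made for illustration, and the abstract does not specify which features or models were used.

```python
from typing import List, NamedTuple, Optional, Tuple


class Record(NamedTuple):
    """One country-year observation (hypothetical layout, not from the paper)."""
    country: str
    year: int
    features: Tuple[float, ...]    # assumed quotation-derived features
    favorability: Optional[float]  # survey value, or None if no survey exists


def split_unseen_country(
    records: List[Record], target: str
) -> Tuple[List[Record], List[Record]]:
    """Scenario 1: the target country has no surveys at all.

    Train only on surveyed records from other countries and
    predict every year of the target country.
    """
    train = [r for r in records
             if r.country != target and r.favorability is not None]
    test = [r for r in records if r.country == target]
    return train, test


def split_missing_years(
    records: List[Record], target: str
) -> Tuple[List[Record], List[Record]]:
    """Scenario 2: the target country was surveyed only in some years.

    Train on every surveyed record, including the target country's
    surveyed years, and predict the target's unsurveyed years.
    """
    train = [r for r in records if r.favorability is not None]
    test = [r for r in records
            if r.country == target and r.favorability is None]
    return train, test
```

Under these assumptions, the first split measures how well a model transfers to a country it has never seen, while the second measures how well it fills gaps in a country's existing survey series. Any regressor could then be fitted on `train` and evaluated on `test`.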