The Whats, Whys, And Hows of Using Google Trends Data for Constructing Social Science Indicators

Proceedings of The 7th International Conference on Research in Business, Management and Economics

Year: 2023



Ivana Lolić, Marina Matošec, Petar Sorić

ABSTRACT: 

All branches of the social sciences often require high-frequency estimates of various latent variables. Google Trends (GT) data offer a unique and rich dataset for that purpose, but the literature has not yet provided a detailed methodological review of the caveats and benefits of GT. This paper is one of the first contributions in that direction. We build a step-by-step methodological guideline on how to properly construct GT-based social science indicators: from pre-treatment of the data (adjustment, filtering) to composite estimation using a battery of machine learning techniques (principal component analysis, random forest model, and dynamic factor model).
As a case study of the usefulness of our methodological framework, we construct a composite indicator of the US GDP growth rate and compare its forecasting accuracy with that of a benchmark autoregressive model. Our estimates indicate that the random forest model provides the best in-sample and out-of-sample fit to actual US GDP growth, even in tail events such as the COVID-19 crisis. We believe that this approach can be easily generalized to a wide array of applications, not only in business and economics but also in sociology, psychology, political science, and other fields. As an additional contribution, we freely provide detailed, user-friendly R code for replicating our analysis in future research.

Keywords: Google Trends, big data, composite indicators, random forest, dynamic factor model
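
The pipeline summarized in the abstract (standardize several search-volume series, compress them into one composite indicator, then compare forecasts against an autoregressive benchmark) can be illustrated with a minimal sketch. This is not the authors' R code: it is written in Python with NumPy, uses simulated data rather than Google Trends or GDP figures, covers only the principal component variant, and every function and variable name below is hypothetical.

```python
import numpy as np


def standardize(X):
    """Center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def first_principal_component(X):
    """Return scores, loadings and explained-variance share of the first PC."""
    Z = standardize(X)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    if loadings.sum() < 0:  # fix the arbitrary sign of the component
        loadings = -loadings
    return Z @ loadings, loadings, s[0] ** 2 / np.sum(s ** 2)


def ar1_forecasts(y, n_train):
    """One-step-ahead AR(1) forecasts for y[n_train:], fitted by OLS on the training part."""
    A = np.column_stack([np.ones(n_train - 1), y[: n_train - 1]])
    beta = np.linalg.lstsq(A, y[1:n_train], rcond=None)[0]
    return beta[0] + beta[1] * y[n_train - 1 : -1]


def indicator_forecasts(y, x, n_train):
    """Predict y[n_train:] from the same-period indicator x, fitted by OLS on the training part."""
    A = np.column_stack([np.ones(n_train), x[:n_train]])
    beta = np.linalg.lstsq(A, y[:n_train], rcond=None)[0]
    return beta[0] + beta[1] * x[n_train:]


def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Simulated data (an assumption for illustration only): four noisy
# "search volume" series driven by one latent factor, and a target series.
rng = np.random.default_rng(0)
T, n_train = 80, 60
factor = rng.normal(size=T)
X = factor[:, None] * np.array([0.9, 0.8, 0.7, 0.6]) + 0.5 * rng.normal(size=(T, 4))
y = 0.5 + 0.8 * factor + 0.3 * rng.normal(size=T)

# For brevity the component is estimated on the full sample; a genuine
# out-of-sample exercise would re-estimate it on the training window only.
scores, loadings, explained = first_principal_component(X)

rmse_ar1 = rmse(y[n_train:], ar1_forecasts(y, n_train))
rmse_indicator = rmse(y[n_train:], indicator_forecasts(y, scores, n_train))
print(f"AR(1) RMSE: {rmse_ar1:.3f}, indicator RMSE: {rmse_indicator:.3f}")
```

Because the data are simulated, the two error figures say nothing about the paper's empirical results; the sketch only shows the shape of the comparison. The random forest and dynamic factor variants named in the abstract would replace `first_principal_component` with a different estimator while leaving the evaluation step unchanged.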