Journal Article Details
Forecasting
Assessing Spurious Correlations in Big Search Data
article
Jesse T. Richman [1], Ryan J. Roberts [2]
[1] Department of Political Science and Geography, Old Dominion University; [2] Department of Public Service, Gardner-Webb University
Keywords: spurious correlation; Bonferroni; big data; big search data; Google Correlate; Google Trends; search data
DOI: 10.3390/forecast5010015
Source: MDPI
【 Abstract 】

Big search data offers the opportunity to identify new and potentially real-time measures and predictors of important political, geographic, social, cultural, economic, and epidemiological phenomena. Such measures might serve an important role as leading indicators in forecasts and nowcasts. However, big search data also presents vast new risks that scientists or the public will identify meaningless and totally spurious ‘relationships’ between variables. This study is the first to quantify that risk in the context of search data. We find that spurious correlations arise at exceptionally high frequencies among the distributions examined, namely random variables based upon gamma(1, 1) and Gaussian random walk distributions. Quantifying these spurious correlations and their likely magnitude for various distributions has value for several reasons. First, analysts can make progress toward accurate inference. Second, they can avoid unwarranted credulity. Third, they can demand appropriate disclosure from study authors.
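
The risk described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: the function names, the number of candidate series, the series length, and the seed are all assumptions chosen for illustration. It compares one target series against many unrelated series drawn from the same process, either independent gamma(1, 1) draws or a Gaussian random walk, and reports the largest absolute Pearson correlation found by chance. It also includes the standard Bonferroni adjustment, which divides the significance level by the number of comparisons.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gamma_series(rng, length):
    """Independent draws from a gamma distribution with shape 1 and scale 1."""
    return [rng.gammavariate(1.0, 1.0) for _ in range(length)]


def random_walk(rng, length):
    """Gaussian random walk: running sum of standard normal steps."""
    walk, level = [], 0.0
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk


def max_abs_correlation(make_series, n_candidates, length, seed):
    """Largest |r| between one target and n_candidates unrelated series.

    All series are generated independently, so any correlation is spurious.
    """
    rng = random.Random(seed)
    target = make_series(rng, length)
    return max(
        abs(pearson(target, make_series(rng, length)))
        for _ in range(n_candidates)
    )


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    # Illustrative settings only (assumptions, not values from the paper).
    n_candidates, length = 2000, 50
    for name, maker in [("gamma(1, 1)", gamma_series),
                        ("Gaussian random walk", random_walk)]:
        best = max_abs_correlation(maker, n_candidates, length, seed=0)
        print(f"{name}: max |r| over {n_candidates} unrelated series = {best:.2f}")
    print("Bonferroni per-test alpha:", bonferroni_alpha(0.05, n_candidates))
```

Under these assumptions, the random walk case typically produces far larger chance correlations than the independent gamma case, because each walk carries its own stochastic trend. Searching a large library of series for the best match makes the problem worse as the library grows.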

【 License 】

CC BY   
