Journal Article Details
International Journal of Health Geographics
Empowering health geography research with location-based social media data: innovative food word expansion and energy density prediction via word embedding and machine learning
Research
Kevin Chen-Chuan Chang [1], Gyoorie Kim [2], Jue Wang [2]
[1] Department of Computer Science, University of Illinois at Urbana-Champaign, 201 North Goodwin Avenue, Urbana, IL, USA; Department of Geography and Planning, University of Toronto, 100 St. George Street, M5S 3G3, Toronto, ON, Canada; Department of Geography, Geomatics and Environment, University of Toronto Mississauga, 3359 Mississauga Road, L5L 1C6, Mississauga, ON, Canada
Keywords: Food environment; Food words; Food energy density; Machine learning; Health geography; Geographic information science
DOI: 10.1186/s12942-023-00344-5
Received 2023-03-25; accepted 2023-09-01; published 2023
Source: Springer
【 Abstract 】

Background: The exponential growth of location-based social media (LBSM) data has opened new prospects for investigating the urban food environment in health geography research. However, previous studies have relied primarily on word dictionaries with a limited number of food words and have used common-sense categorizations to determine the healthiness of those words. Improving the analysis of the urban food environment with LBSM data requires a more comprehensive list of food-related words. Within this context, this study explores how to expand the set of food-related words along with their associated energy densities.

Methods: This study addresses this research gap by introducing a novel methodology for expanding the food-related word dictionary and predicting energy densities. Seed words are generated from official and crowdsourced food composition databases, and new food words are discovered by clustering food words within the word embedding space using the Gaussian mixture model. Machine learning models are employed to predict the energy density classifications of these food words based on their feature vectors. To explore the prediction problem thoroughly, ten widely used machine learning models are evaluated.

Results: The approach successfully expands the food-related word dictionary and accurately predicts food energy density (reaching 91.62%). A comparison of the newly expanded dictionary with the initial seed words, together with an analysis of Yelp reviews in the city of Toronto, shows significant improvements in identifying food words and a deeper understanding of the food environment.

Conclusions: This study proposes a novel method to expand food-related vocabulary and predict food energy density based on machine learning and word embedding. The method contributes to building a more comprehensive list of food words that can be used in geography and public health studies that mine geotagged social media data.
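
The two steps named in the abstract (clustering in the embedding space with a Gaussian mixture model, then classifying energy density from word vectors) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the 2-D toy vectors, the seed labels, the function names, the `min_seed_share` threshold, and the use of logistic regression as a stand-in for the ten models the authors evaluated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture

# Hypothetical 2-D "embeddings"; real word vectors have hundreds of dimensions.
TOY_VECTORS = {
    "salad": [0.0, 0.1], "broccoli": [0.2, 0.0], "spinach": [0.1, 0.3],
    "kale": [0.3, 0.2],
    "burger": [0.1, 3.0], "fries": [0.3, 3.2], "donut": [0.0, 3.1],
    "bacon": [0.2, 2.9],
    "parking": [9.0, 9.1], "waiter": [9.2, 9.0], "music": [9.1, 9.3],
    "patio": [8.9, 8.8], "queue": [9.3, 9.2],
}

# Hypothetical seed words with made-up energy-density classes.
SEED_LABELS = {
    "salad": "low", "broccoli": "low", "spinach": "low",
    "burger": "high", "fries": "high", "donut": "high",
}


def expand_food_words(word_vectors, seed_words, n_components=2, min_seed_share=0.5):
    """Return non-seed words that share a GMM cluster dominated by seed words."""
    vocab = list(word_vectors)
    X = np.array([word_vectors[w] for w in vocab], dtype=float)
    labels = GaussianMixture(n_components=n_components, random_state=0).fit_predict(X)
    is_seed = np.array([w in seed_words for w in vocab])
    new_words = []
    for k in np.unique(labels):
        members = labels == k
        # Keep a cluster only if enough of its members are known seed words.
        if is_seed[members].mean() >= min_seed_share:
            new_words += [w for w, m, s in zip(vocab, members, is_seed) if m and not s]
    return sorted(new_words)


def predict_energy_density(word_vectors, seed_labels, new_words):
    """Train on labeled seed vectors and predict a class for each new word."""
    if not new_words:
        return {}
    seeds = list(seed_labels)
    clf = LogisticRegression()
    clf.fit(np.array([word_vectors[w] for w in seeds]), [seed_labels[w] for w in seeds])
    preds = clf.predict(np.array([word_vectors[w] for w in new_words]))
    return {w: str(p) for w, p in zip(new_words, preds)}


new_words = expand_food_words(TOY_VECTORS, set(SEED_LABELS))
predictions = predict_energy_density(TOY_VECTORS, SEED_LABELS, new_words)
print(new_words)
print(predictions)
```

In this toy setup, a cluster counts as a food cluster when at least half of its members are seed words. That rule is only one plausible way to turn cluster assignments into dictionary candidates, and a real pipeline would need to tune both the number of mixture components and the threshold.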

【 License 】

CC BY   
© BioMed Central Ltd., part of Springer Nature 2023
