Dissertation details
Recognizing cardiovascular disease patterns with machine learning using NHANES accelerometer determined physical activity data
Keywords: Machine learning; accelerometers; physical activity recommendations; cardiovascular disease risk; Reynolds risk score; classification algorithms; feature selection; random forest; decision tree; support vector machine; lasso regression; neural network; NHANES
Boiarskaia, Elena
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/92805/BOIARSKAIA-DISSERTATION-2016.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】
The relationship between physical activity (PA) and cardiovascular disease (CVD) is well established; however, questions remain about the appropriate dose of PA to reduce CVD risk (Blair, LaMonte, & Nichaman, 2004; Pate et al., 1995). The optimal dose and the effects of intensity, duration, and frequency of PA are not fully understood (Haskell et al., 2007). This study connects objectively measured PA with a cross-sectional measure of CVD risk for an in-depth analysis of PA patterns that contribute to higher risk of CVD. Specifically, this study applied machine learning algorithms to NHANES accelerometer data from the 2003-2006 cohorts with the Reynolds cardiovascular risk score as the outcome. Using accelerometer data as a proxy for the Reynolds risk score to study cardiovascular disease risk allows the use of cross-sectional data when the longitudinal outcome is not known. A major benefit of using accelerometers to objectively measure PA is that the data is easy and inexpensive to obtain. Furthermore, most locomotive activities are measured with a high degree of accuracy. Accelerometers can gather highly detailed information about an individual's PA pattern over extended periods of time. This produces a large amount of data that requires specialized techniques to analyze. The analysis for this study was conducted using a variety of machine learning techniques to identify individual patterns in the data and evaluate what contributes most to high CVD risk. Comparison of machine learning algorithms shows that all classifiers perform well when given appropriate features. Using predefined intensity thresholds to compute average time spent in a PA category yielded good classification results in identifying study participants at high and low risk for CVD (Troiano et al., 2008). Adding PA pattern-related features to the model did not appear to improve classification.
Features derived using k-means and the Hidden Markov Model (HMM) performed on par with predefined intensity thresholds, indicating that data-driven methods may be used for feature extraction without relying on prior knowledge of the data. In general, the lasso regression, support vector machine (SVM), and random forest (RF) classifiers all performed well on large sets of data-driven features, achieving greater than 82% classification accuracy when time spent in PA intensity categories was combined with k-means and HMM-derived inputs. Neural networks performed well on smaller uncorrelated feature sets, and decision trees produced consistent results with the most transparency and interpretability. With respect to physical activity recommendations, the findings indicate that gender and time spent in lifestyle minutes (760-2019 intensity counts) play a key role in classifying CVD risk. Thus, a greater emphasis on gender-specific recommendations focusing on lifestyle minutes in addition to moderate and vigorous activity may be necessary. Furthermore, time spent in the activity categories, rather than how PA is spread throughout the day and week, appears to be most important for classification of CVD risk.
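The threshold-based feature described in the abstract (average time spent in each PA intensity category) can be illustrated with a minimal Python sketch. This is not the dissertation's code. The 760-2019 lifestyle band comes from the abstract; the 100, 2020, and 5999 counts-per-minute cut points, the category names, and the function names are assumptions chosen for illustration, and the input is assumed to be one accelerometer count value per minute.

```python
import numpy as np

# Cut points in accelerometer counts per minute. The 760-2019 "lifestyle" band
# is quoted in the abstract; the 100, 2020 and 5999 cut points are assumptions
# (commonly used adult thresholds), not values stated in the abstract.
BIN_EDGES = np.array([100, 760, 2020, 5999])
CATEGORIES = ["sedentary", "light", "lifestyle", "moderate", "vigorous"]


def minutes_per_category(counts_per_minute):
    """Return minutes spent in each intensity category for one day of data."""
    counts = np.asarray(counts_per_minute)
    # np.digitize maps each minute to the index of the band it falls in:
    # 0 for counts < 100, 1 for 100-759, 2 for 760-2019, and so on.
    labels = np.digitize(counts, BIN_EDGES)
    totals = np.bincount(labels, minlength=len(CATEGORIES))
    return dict(zip(CATEGORIES, totals.tolist()))


def average_daily_minutes(days):
    """Average the per-category minutes over a list of daily count arrays."""
    per_day = [minutes_per_category(day) for day in days]
    return {c: float(np.mean([d[c] for d in per_day])) for c in CATEGORIES}
```

Calling `average_daily_minutes` on a participant's list of daily count arrays yields one value per category, which could then serve as the input row for any of the classifiers named in the abstract.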
Attachments
Files | Size | Format | View
Recognizing cardiovascular disease patterns with machine learning using NHANES accelerometer determined physical activity data | 3005KB | PDF | download