BMC Medical Research Methodology
Derivation and assessment of risk prediction models using case-cohort data
Lisa Pennells [3], Thor Aspelund [1], Ian R White [2], Simon G Thompson [3], Jean Sanderson [3]
[1] Icelandic Heart Association, Kopavogur 201, Iceland
[2] MRC Biostatistics Unit, Cambridge CB2 0SR, UK
[3] Department of Public Health and Primary Care, Strangeways Research Laboratory, University of Cambridge, Worts Causeway, Cambridge CB1 8RN, UK
Keywords: Cardiovascular disease; Reclassification; Discrimination; Risk prediction; Case-cohort
Others: 1091771; DOI: 10.1186/1471-2288-13-113
Received 2013-01-16, accepted 2013-09-09, published 2013
【 Abstract 】
Background
Case-cohort studies are increasingly used to quantify the association of novel factors with disease risk. Conventional measures of predictive ability need modification for this design. We show how Harrell’s C-index, Royston’s D, and the category-based and continuous versions of the net reclassification index (NRI) can be adapted.
Methods
We simulated full cohort and case-cohort data, with sampling fractions ranging from 1% to 90%, using covariates from a cohort study of coronary heart disease, and two incidence rates. We then compared the accuracy and precision of the proposed risk prediction metrics.
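As a rough illustration of the sampling step described above, the following sketch draws a case-cohort sample from a full cohort. It is not the authors' Stata code: the function name `draw_case_cohort`, the record layout, and the use of independent Bernoulli draws (rather than a fixed-size subcohort) are assumptions made for this example.

```python
import random


def draw_case_cohort(event, sampling_fraction, seed=0):
    """Draw a case-cohort sample from a full cohort (illustrative sketch).

    event: list of 0/1 event indicators, one per cohort member.
    sampling_fraction: probability that a member enters the random subcohort
        (assumption: independent Bernoulli sampling).
    Returns one dict per retained subject.
    """
    rng = random.Random(seed)
    sample = []
    for subject_id, is_case in enumerate(event):
        in_subcohort = rng.random() < sampling_fraction
        # Case-cohort design: keep the random subcohort plus every case,
        # whether or not the case was drawn into the subcohort.
        if in_subcohort or is_case:
            sample.append(
                {"id": subject_id, "event": is_case, "subcohort": in_subcohort}
            )
    return sample
```

Repeating this draw for each sampling fraction of interest, and computing the metrics on each sample, would mimic the structure of the simulation, under the stated assumptions.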
Results
The C-index and D must be weighted in order to obtain unbiased results. The NRI does not need modification, provided that the relevant non-subcohort cases are excluded from the calculation. The empirical standard errors across simulations were consistent with analytical standard errors for the C-index and D but not for the NRI. Good relative efficiency of the prediction metrics was observed in our examples, provided the sampling fraction was above 40% for the C-index, 60% for D, or 30% for the NRI. Stata code is made available.
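To make the idea of a weighted C-index concrete, here is a minimal sketch of a Harrell-type concordance in which each usable pair contributes the product of the two subjects' weights. The abstract states only that the C-index must be weighted; the specific scheme below (weight 1 for cases, the inverse of the sampling fraction for subcohort non-cases) and the function names are assumptions for illustration, not necessarily the authors' exact method.

```python
def case_cohort_weights(event, sampling_fraction):
    """Assumed inverse-probability weights: cases 1, non-cases 1/fraction."""
    return [1.0 if e else 1.0 / sampling_fraction for e in event]


def weighted_c_index(time, event, score, weight):
    """Harrell-type C-index with pair weights weight[i] * weight[j].

    A pair (i, j) is usable when subject i has an event strictly before
    subject j's event or censoring time. It is concordant when the subject
    with the earlier event has the higher risk score; tied scores count 1/2.
    """
    concordant = 0.0
    usable = 0.0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # the earlier member of a usable pair must be an event
        for j in range(n):
            if i == j or time[j] <= time[i]:
                continue  # j must be followed for longer than i
            pair_weight = weight[i] * weight[j]
            usable += pair_weight
            if score[i] > score[j]:
                concordant += pair_weight
            elif score[i] == score[j]:
                concordant += 0.5 * pair_weight
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

With all weights equal to 1 this reduces to the ordinary unweighted C-index, so the weights only change the result when non-cases are under-represented relative to the full cohort.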
Conclusions
Case-cohort designs can be used to provide unbiased estimates of the C-index, D measure and NRI.
【 License 】
2013 Sanderson et al.; licensee BioMed Central Ltd.