Journal article details
BMC Medical Research Methodology
Discrimination-based sample size calculations for multivariable prognostic models for time-to-event data
Mahesh KB Parmar [1], Patrick Royston [1], Rachel C. Jinks [1]
[1] MRC Clinical Trials Unit at UCL, Aviation House, 125 Kingsway, London, WC2B 6NH, UK
Keywords: Multivariable models; Survival data; Sample size; Prognostic modelling
DOI: 10.1186/s12874-015-0078-y
Received 2014-11-06, accepted 2015-10-02, published 2015
【 Abstract 】

Background

Prognostic studies of time-to-event data, in which researchers aim to develop or validate multivariable prognostic models to predict survival, are common in the medical literature; however, most are performed retrospectively and few consider sample size prior to analysis. Events-per-variable rules are sometimes cited, but these are based on bias and coverage of confidence intervals for model terms, which are not of primary interest when developing a model to predict outcome. In this paper we aim to develop sample size recommendations for multivariable models of time-to-event data, based on their prognostic ability.

Methods

We derive formulae for determining the sample size required for multivariable prognostic models in time-to-event data, based on a measure of discrimination, D, developed by Royston and Sauerbrei. These formulae fall into two categories: either based on the significance of the value of D in a new study compared to a previous estimate, or based on the precision of the estimate of D in a new study in terms of confidence interval width. Using simulation we show that they give the desired power and type I error and are not affected by random censoring. Additionally, we conduct a literature review to collate published values of D in different disease areas.
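
The precision-based category can be illustrated with a minimal sketch. It is not the paper's formula: it assumes that the variance of D is roughly inversely proportional to the number of events, so the standard error and event count from a previous study can be rescaled to a target confidence interval half-width. The function name and all parameters are hypothetical.

```python
import math
from statistics import NormalDist


def events_for_ci_half_width(se_prev, events_prev, half_width, conf_level=0.95):
    """Approximate number of events needed for a target CI half-width of D.

    ASSUMPTION (not taken from the abstract): var(D) * events is roughly
    constant across studies, so the standard error of D scales with
    1 / sqrt(events).
    """
    if se_prev <= 0 or events_prev <= 0 or half_width <= 0:
        raise ValueError("se_prev, events_prev and half_width must be positive")
    if not 0 < conf_level < 1:
        raise ValueError("conf_level must lie strictly between 0 and 1")

    # Two-sided normal quantile, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + conf_level / 2)

    # Variance of D per event, estimated from the previous study.
    lam = events_prev * se_prev ** 2

    # Solve z * sqrt(lam / events) = half_width for events, rounding up.
    return math.ceil(lam * (z / half_width) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: SE(D) = 0.1 from a previous study with 400 events.
    print(events_for_ci_half_width(se_prev=0.1, events_prev=400, half_width=0.1))
```

Under this assumption, halving the target half-width roughly quadruples the required number of events, which is consistent with the remark in the Results that the resulting sample sizes can be large.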

Results

We illustrate our methods using parameters from a published prognostic study in liver cancer. The resulting sample sizes can be large, and we suggest controlling study size by expressing the desired accuracy in the new study as a relative value as well as an absolute value. To improve usability we use the values of D obtained from the literature review to develop an equation to approximately convert the commonly reported Harrell’s c-index to D. A flow chart is provided to aid decision making when using these methods.

Conclusion

We have developed a suite of sample size calculations based on the prognostic ability of a survival model, rather than the magnitude or significance of model coefficients. We have taken care to develop the practical utility of the calculations and give recommendations for their use in contemporary clinical research.

【 License 】

2015 Jinks et al.

【 Figures 】

Fig. 1 to Fig. 6.

【 References 】
  • [1]Moons KG, Royston P, Vergouwe Y, Grobbee DE, Altman DG. Prognosis and prognostic research: what, why, and how? BMJ. 2009; 338:1317-20.
  • [2]Mallett S, Royston P, Dutton S, Waters R, Altman DG. Reporting methods in studies developing prognostic models in cancer: a review. BMC Med. 2010; 8:20.
  • [3]Altman DG. Prognostic models: a methodological framework and review of models for breast cancer. Cancer Invest. 2009; 27(3):235-43.
  • [4]Altman DG, Lyman GH. Methodological challenges in the evaluation of prognostic factors in breast cancer. Breast Cancer Res Treat. 1998; 52(1-3):289-303.
  • [5]McGuire WL. Breast cancer prognostic factors: evaluation guidelines. J Natl Cancer Inst. 1991; 83(3):154-5.
  • [6]Royston P, Moons KG, Altman DG, Vergouwe Y. Prognosis and prognostic research: developing a prognostic model. BMJ. 2009; 338:1373-7.
  • [7]Riley RD, Hayden JA, Steyerberg EW, Moons KGM, Abrams K, Kyzas PA et al., for the PROGRESS group. Prognosis research strategy (PROGRESS) 2: Prognostic factor research. PLoS Med. 2013; 10(2):e1001380.
  • [8]Schoenfeld DA. Sample-size formula for the proportional-hazards regression model. Biometrics. 1983; 39(2):499-503.
  • [9]Schmoor C, Sauerbrei W, Schumacher M. Sample size considerations for the evaluation of prognostic factors in survival analysis. Stat Med. 2000; 19(4):441-452.
  • [10]Bernardo MVP, Lipsitz SR, Harrington DP, Catalano PJ. Sample size calculations for failure time random variables in non-randomized studies. J R Stat Soc (Series D): The Statistician. 2000; 49:31-40.
  • [11]Hsieh F, Lavori PW. Sample-size calculations for the Cox proportional hazards regression model with nonbinary covariates. Control Clin Trials. 2000; 21(6):552-60.
  • [12]Concato J, Peduzzi P, Holford TR, Feinstein AR. Importance of events per independent variable in proportional hazards analysis. I. Background, goals, and general strategy. J Clin Epidemiol. 1995; 48(12):1495-1501.
  • [13]Peduzzi P, Concato J, Feinstein AR, Holford TR. Importance of events per independent variable in proportional hazards regression analysis. II. Accuracy and precision of regression estimates. J Clin Epidemiol. 1995; 48(12):1503-10.
  • [14]Vittinghoff E, McCulloch CE. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. 2007; 165(6):710-8.
  • [15]Copas JB. Regression, prediction and shrinkage. J R Stat Soc Ser B Methodol. 1983; 45(3):311-54.
  • [16]Smith LR, Harrell FE, Muhlbaier LH. Problems and potentials in modeling survival. In: Grady ML, Schwartz HA, editors. Medical Effectiveness Research Data Methods (Summary Report) AHCPR publication, no. 92-0056. US Dept of Health and Human Services, Agency for Health Care Policy and Research: 1992. p. 151–159.
  • [17]Ambler G, Seaman S, Omar RZ. An evaluation of penalised survival methods for developing prognostic models with rare events. Stat Med. 2012; 31:1150-61.
  • [18]Vergouwe Y, Steyerberg EW, Eijkemans MJ, Habbema JDF. Substantial effective sample sizes were required for external validation studies of predictive logistic regression models. J Clin Epidemiol. 2005; 58:475-83.
  • [19]Royston P, Sauerbrei W. A new measure of prognostic separation in survival data. Stat Med. 2004; 23(5):723-48.
  • [20]Choodari-Oskooei B, Royston P, Parmar MK. A simulation study of predictive ability measures in a survival model. Stat Med. 2012; 31(23):2627-43.
  • [21]Harrell FE, Lee KL, Califf RM, Pryor DB, Rosati RA. Regression modelling strategies for improved prognostic prediction. Stat Med. 1984; 3(2):143-52.
  • [22]Gönen M, Heller G. Concordance probability and discriminatory power in proportional hazards regression. Biometrika. 2005; 92(4):965-70.
  • [23]Jinks RC. Sample size for multivariable prognostic models: PhD thesis, University College London; 2012.
  • [24]Kent JT, O’Quigley J. Measures of dependence for censored survival data. Biometrika. 1988; 75(3):525-34.
  • [25]Royston P. Explained variation for survival models. Stata J. 2006; 6:1-14.
  • [26]Armitage P, Berry G, Matthews JN. Statistical Methods in Medical Research. Blackwell Science, Oxford; 2001.
  • [27]Volinsky CT, Raftery AE. Bayesian Information Criterion for Censored Survival Models. Biometrics. 2000; 56:256-62.
  • [28]Collette S, Bonnetain F, Paoletti X, Doffoel M, Bouché O, Raoul JL et al. Prognosis of advanced hepatocellular carcinoma: comparison of three staging systems in two French clinical trials. Ann Oncol. 2008; 19(6):1117-26.
  • [29]Vergouwe Y, Moons KGM, Steyerberg EW. External validity of risk models: use of benchmark values to disentangle a case-mix effect from incorrect coefficients. Am J Epidemiol. 2010; 172(2):971-80.
  • [30]Steyerberg EW. Clinical Prediction Models: A Practical Approach to Development, Validation, and Updating (Statistics for Biology and Health), 1st ed.: Springer; 2008.
  • [31]Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox proportional hazards models. Stat Med. 2005; 24(11):1713-23.