Journal article details
BMC Medical Research Methodology
A new approach to analyse longitudinal epidemiological data with an excess of zeros
Jos WR Twisk [3], Martijn W Heymans [3], Michiel R de Boer [2], Tibor RS Hajos [1], Alette S Spriensma [3]
[1] Department of Medical Psychology, VU University Medical Centre, Van der Boechorststraat 7, Amsterdam, 1081 BT, The Netherlands
[2] Department of Health Sciences, University of Groningen, Antonius Deusinglaan 1, Groningen, 9713 AV, The Netherlands
[3] Department of Methodology and Applied Biostatistics, Faculty of Earth and Life Sciences, Institute of Health Sciences, VU University, de Boelelaan 1085, Amsterdam, 1081 HV, The Netherlands
Keywords: Statistical methods; Longitudinal; Mixed modelling; Count; Excess of zeros; Two-part joint model
DOI  :  10.1186/1471-2288-13-27
Received 2012-03-19, accepted 2013-02-15, published 2013
【 Abstract 】

Background

Within longitudinal epidemiological research, ‘count’ outcome variables with an excess of zeros occur frequently. Although these outcomes are often analysed with a linear mixed model or a Poisson mixed model, a two-part mixed model would be better suited to outcome variables with an excess of zeros. The objective of this paper was therefore to introduce the relatively ‘new’ method of two-part joint regression modelling for longitudinal outcome variables with an excess of zeros, and to compare its performance with that of current approaches.

Methods

Within an observational longitudinal dataset, we compared three techniques: two ‘standard’ approaches (a linear mixed model and a Poisson mixed model) and a two-part joint mixed model (a binomial/Poisson mixed distribution model), including random intercepts and random slopes. Model fit indicators and differences between predicted and observed values were used for comparison. The analyses were performed in Stata using the GLLAMM procedure.
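
To make the idea of a two-part model concrete, the following is a minimal Python sketch; it is not the GLLAMM analysis used in the paper. It assumes a hurdle-style formulation (a binomial part for zero versus non-zero, and a zero-truncated Poisson part for the positive counts), leaves out random intercepts and random slopes entirely, and uses made-up counts. All function names are hypothetical.

```python
import math

def poisson_logpmf(y, lam):
    """Log probability of count y under a Poisson(lam) distribution."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def poisson_loglik(counts, lam):
    """Log-likelihood of a plain Poisson model with a single rate."""
    return sum(poisson_logpmf(y, lam) for y in counts)

def two_part_loglik(counts, p_nonzero, lam):
    """Log-likelihood of a hurdle-style two-part model (assumed form):
    zeros come from the binomial part, positive counts from a
    zero-truncated Poisson(lam)."""
    log_p_positive = math.log(-math.expm1(-lam))  # log P(Y > 0) under Poisson
    total = 0.0
    for y in counts:
        if y == 0:
            total += math.log1p(-p_nonzero)
        else:
            total += math.log(p_nonzero) + poisson_logpmf(y, lam) - log_p_positive
    return total

def truncated_poisson_rate(positive_mean, tol=1e-10):
    """Solve lam / (1 - exp(-lam)) = positive_mean by bisection.
    Requires positive_mean > 1 (the mean of a zero-truncated Poisson)."""
    lo, hi = 1e-9, positive_mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < positive_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Made-up illustrative data: 70% zeros, the rest spread over 1..4.
counts = [0] * 70 + [1] * 5 + [2] * 10 + [3] * 10 + [4] * 5
positives = [y for y in counts if y > 0]

# Maximum likelihood estimates for each model.
lam_poisson = sum(counts) / len(counts)
p_hat = len(positives) / len(counts)
lam_hat = truncated_poisson_rate(sum(positives) / len(positives))

ll_poisson = poisson_loglik(counts, lam_poisson)
ll_two_part = two_part_loglik(counts, p_hat, lam_hat)
print("Poisson log-likelihood: ", ll_poisson)
print("Two-part log-likelihood:", ll_two_part)
```

Because the plain Poisson model is a special case of this hurdle form (with the probability of a non-zero count fixed at 1 - exp(-lam)), the fitted two-part log-likelihood can never be lower than the fitted Poisson one; the gap grows as the share of zeros departs from what a single Poisson rate implies.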

Results

Among the random intercept models, the two-part joint mixed model (binomial/Poisson) performed best. Adding random slopes for time changed the sign of the regression coefficient in both the Poisson mixed model and the two-part joint mixed model (binomial/Poisson) and resulted in a much better fit.

Conclusion

This paper showed that a two-part joint mixed model is more appropriate than a linear mixed model or a Poisson mixed model for analysing longitudinal data with an excess of zeros. However, in a model with random slopes for time, a Poisson mixed model also performed remarkably well.

【 License 】

   
2013 Spriensma et al; licensee BioMed Central Ltd.

【 Figures 】

Figure 1.

Figure 2.

Figure 3.
