BMC Medical Research Methodology
Why item response theory should be used for longitudinal questionnaire data analysis in medical research
Jos W. R. Twisk [2], Jean-Paul Fox [1], Rosalie Gorter [2]
[1] Department of Research Methodology, Measurement, and Data Analysis, Faculty of Behavioral, Management & Social Sciences, University of Twente, Enschede, Netherlands; [2] EMGO+ Institute for Health and Care Research, Amsterdam, Netherlands
Keywords: Multilevel model; Plausible values; Structural model; Measurement error; Questionnaires; Item response theory; Hierarchical model; Longitudinal data
Others: 1222442; DOI: 10.1186/s12874-015-0050-x
Received 2015-01-22, accepted 2015-07-13, published 2015
【 Abstract 】
Background
Multi-item questionnaires are important instruments for monitoring health in epidemiological longitudinal studies. Sum-scores are most often used as the summary measure for these questionnaires. The objective of this study was to show the negative impact of basing longitudinal data analysis on sum-scores instead of on Item Response Theory (IRT)-based plausible values.
Methods
In a simulation study that varied the number of items, the sample size, and the distribution of the outcomes, the parameter estimates from both modeling techniques were compared with the true values. Next, the models were applied to an example dataset from the Amsterdam Growth and Health Longitudinal Study (AGHLS).
Results
The results show that using sum-scores leads to overestimation of the within-person (repeated measurement) variance and underestimation of the between-person variance.
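The direction of this effect can be illustrated with a toy simulation. The sketch below is not the authors' simulation design: it assumes a Rasch model with dichotomous items, and the function names (`icc_oneway`, `simulate`), the item difficulties, and the variance values are all hypothetical choices made for illustration. Because latent scores and sum-scores live on different scales, it compares the between-person share of the variance (the intraclass correlation, ICC) rather than raw variances. Item-level measurement error is independent across occasions, so it adds to the within-person part and lowers the ICC of the sum-scores.

```python
import math
import random
import statistics


def icc_oneway(scores):
    """One-way ANOVA estimate of the between-person variance share (ICC).

    scores: list of per-person lists, one value per measurement occasion
    (balanced design: every person has the same number of occasions).
    """
    n_occ = len(scores[0])
    person_means = [statistics.fmean(row) for row in scores]
    ms_between = n_occ * statistics.variance(person_means)
    ms_within = statistics.fmean([statistics.variance(row) for row in scores])
    var_between = (ms_between - ms_within) / n_occ
    return var_between / (var_between + ms_within)


def simulate(n_persons=500, n_occasions=4, n_items=10,
             sd_between=1.0, sd_within=0.7, seed=1):
    """Return (ICC of latent scores, ICC of sum-scores) for one simulated dataset."""
    rng = random.Random(seed)
    # Hypothetical item difficulties, evenly spaced from -1.5 to 1.5.
    difficulties = [-1.5 + 3.0 * k / (n_items - 1) for k in range(n_items)]
    latent, sums = [], []
    for _ in range(n_persons):
        person_level = rng.gauss(0.0, sd_between)  # stable between-person part
        theta_row, sum_row = [], []
        for _ in range(n_occasions):
            theta = person_level + rng.gauss(0.0, sd_within)  # occasion-specific part
            # Rasch model: P(item endorsed) = logistic(theta - difficulty).
            score = sum(
                rng.random() < 1.0 / (1.0 + math.exp(-(theta - b)))
                for b in difficulties
            )
            theta_row.append(theta)
            sum_row.append(float(score))
        latent.append(theta_row)
        sums.append(sum_row)
    return icc_oneway(latent), icc_oneway(sums)


icc_latent, icc_sum = simulate()
print(f"ICC of latent scores: {icc_latent:.2f}")
print(f"ICC of sum-scores:    {icc_sum:.2f}")
```

With these assumed settings the sum-scores attribute a smaller share of the variance to stable between-person differences than the latent scores do. Increasing `n_items` should narrow the gap, since a longer questionnaire carries less measurement error per sum-score.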
Conclusions
We recommend using IRT-based plausible value techniques for analyzing repeatedly measured multi-item questionnaire data.
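Plausible values are commonly analyzed like multiply imputed data: the longitudinal model is fitted once per set of plausible values and the results are combined with Rubin's rules. The sketch below assumes that workflow. The function name `pool_plausible_values` and the numbers are made up for illustration and are not taken from the study.

```python
import math
import statistics


def pool_plausible_values(estimates, variances):
    """Combine per-draw results with Rubin's rules.

    estimates: one parameter estimate per set of plausible values.
    variances: the squared standard error of each of those estimates.
    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)      # average sampling variance
    between = statistics.variance(estimates)  # variance across draws
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Made-up numbers: a regression slope estimated on five plausible-value sets.
estimates = [0.50, 0.54, 0.47, 0.52, 0.49]
variances = [0.010, 0.011, 0.009, 0.010, 0.010]
pooled, se = pool_plausible_values(estimates, variances)
print(f"pooled estimate = {pooled:.3f}, standard error = {se:.3f}")
```

The between-draw term is what carries the measurement uncertainty of the latent scores into the final standard error. Analyzing a single point estimate per person, such as a sum-score, leaves that term out.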
【 License 】
2015 Gorter et al.