Journal article details
BMC Medical Research Methodology
Aiming for a representative sample: Simulating random versus purposive strategies for hospital selection
Hendrik Koffijberg [2], Kit C. B. Roes [3], Mart P. Janssen [1], Loan R. van Hoeven [1]
[1] Sanquin Blood Supply, Transfusion Technology Assessment Department, Sanquin Research, Amsterdam, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
[2] Department of Health Technology & Services Research, MIRA Institute for biomedical technology and technical medicine, University of Twente, Drienerlolaan 5, Enschede, 7522 NB, The Netherlands
[3] Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Universiteitsweg 100, Utrecht, 3584 CG, The Netherlands
Keywords: Model-based inference; Simulation; Maximum variation; Random vs. purposive sampling; Representativeness; Hospital selection; Sampling strategy
DOI: 10.1186/s12874-015-0089-8
Received 2015-06-01; accepted 2015-10-19; published 2015
【 Abstract 】

Background

A ubiquitous issue in research is selecting a representative sample from the study population. While random sampling strategies are the gold standard, random sampling of participants is in practice not always feasible, nor is it necessarily the optimal choice. In our case, 12 hospitals must be selected out of 89 Dutch hospitals in total, and this selection of 12 hospitals should make it possible to estimate blood use in the remaining hospitals as well. In this paper, we evaluate both random and purposive strategies for the case of estimating blood use in Dutch hospitals.

Methods

Available population-wide data on hospital blood use and number of hospital beds are used to simulate five sampling strategies: (1) select only the largest hospitals, (2) select the largest and the smallest hospitals (‘maximum variation’), (3) select hospitals randomly, (4) select hospitals from as many different geographic regions as possible, and (5) select hospitals from only two regions. Simulations of each strategy result in different selections of hospitals, each of which is used to estimate blood use in the remaining hospitals. The estimates are compared to the actual population values; the resulting prediction errors are used to indicate the quality of the sampling strategy.
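
A minimal sketch of how such a simulation could be set up is given below. It is not the authors' implementation: the population is synthetic, the estimator (ordinary least squares of blood use on number of beds) and the error measure (mean absolute percentage error over the non-sampled hospitals) are assumptions, and all function names are hypothetical.

```python
import random

def make_population(n=89, n_regions=6, seed=0):
    """Synthetic stand-in for the real data (assumption): beds, region, blood use."""
    rng = random.Random(seed)
    pop = []
    for i in range(n):
        beds = rng.randint(100, 1300)
        # Assumed relation: blood use roughly proportional to beds, with +/-30% noise.
        pop.append({"beds": beds, "region": i % n_regions,
                    "blood_use": 10.0 * beds * rng.uniform(0.7, 1.3)})
    return pop

def select_largest(pop, k, rng):
    order = sorted(range(len(pop)), key=lambda i: pop[i]["beds"], reverse=True)
    return order[:k]

def select_max_variation(pop, k, rng):
    # Half of the sample from the smallest hospitals, the rest from the largest.
    order = sorted(range(len(pop)), key=lambda i: pop[i]["beds"])
    half = k // 2
    return order[:half] + order[len(order) - (k - half):]

def select_random(pop, k, rng):
    return rng.sample(range(len(pop)), k)

def select_regional_variation(pop, k, rng):
    # Round-robin over shuffled regions so as many regions as possible are covered.
    idx = list(range(len(pop)))
    rng.shuffle(idx)
    by_region = {}
    for i in idx:
        by_region.setdefault(pop[i]["region"], []).append(i)
    queues = list(by_region.values())
    rng.shuffle(queues)
    chosen = []
    while len(chosen) < k:  # assumes k <= len(pop)
        for q in queues:
            if q and len(chosen) < k:
                chosen.append(q.pop())
    return chosen

def select_two_regions(pop, k, rng):
    a, b = rng.sample(sorted({h["region"] for h in pop}), 2)
    pool = [i for i, h in enumerate(pop) if h["region"] in (a, b)]
    return rng.sample(pool, k)  # assumes the two regions hold at least k hospitals

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all sampled hospitals have the same size: predict the mean
        return 0.0, my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def prediction_error(pop, sample_idx):
    """Mean absolute percentage error over the hospitals not in the sample."""
    chosen = set(sample_idx)
    slope, intercept = fit_line([pop[i]["beds"] for i in sample_idx],
                                [pop[i]["blood_use"] for i in sample_idx])
    errs = [abs(slope * h["beds"] + intercept - h["blood_use"]) / h["blood_use"]
            for i, h in enumerate(pop) if i not in chosen]
    return sum(errs) / len(errs)

STRATEGIES = {
    "largest": select_largest,
    "maximum variation": select_max_variation,
    "random": select_random,
    "regional variation": select_regional_variation,
    "two regions": select_two_regions,
}

def simulate(pop, k=12, n_sims=200, seed=1):
    """Average prediction error per strategy over n_sims simulated selections."""
    rng = random.Random(seed)
    results = {}
    for name, select in STRATEGIES.items():
        errs = [prediction_error(pop, select(pop, k, rng)) for _ in range(n_sims)]
        results[name] = sum(errs) / len(errs)
    return results

if __name__ == "__main__":
    for name, err in simulate(make_population()).items():
        print(f"{name}: {err:.1%}")
```

Because the data here are synthetic, the resulting ranking of strategies says nothing about the case study; the sketch only illustrates the select, estimate, and compare loop. In this simplified version the two size-based strategies are deterministic, so their repeated simulations return the same selection.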

Results

Maximum variation sampling led to the lowest prediction error in the case study, followed by random, regional variation and two-region sampling; sampling only the largest hospitals performed worst. Maximum variation sampling led to a hospital-level prediction error of 15%, whereas random sampling led to a prediction error of 19% (95% CI 17%-26%). While lowering the sample size reduced the differences between maximum variation and the random strategies, increasing the sample size to n = 18 did not change the ranking of the strategies and led to only slightly better predictions.

Conclusions

The optimal strategy for estimating blood use was maximum variation sampling. When proxy data are available, it is possible to evaluate random and purposive sampling strategies using simulations before the start of the study. The results enable researchers to make a more educated choice of an appropriate sampling strategy.

【 License 】

2015 van Hoeven et al.
