Dissertation Details
Synthetic Data for Small Area Estimation.
Sakshaug, Joseph Walter; Valliant, Richard L.
University of Michigan
Keywords: Statistical Disclosure Control; Synthetic Data; Small Area Estimation; Hierarchical Model; Sequential Regression Multiple Imputation; Statistics and Numeric Data; Social Sciences; Survey Methodology
Others: https://deepblue.lib.umich.edu/bitstream/handle/2027.42/89610/joesaks_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Small area estimates are a critical source of information that a variety of stakeholders use to study human conditions and behavior at the local level. Statistical agencies regularly collect survey microdata from small geographic areas but are prevented from identifying these areas in public-use microdata sets because of disclosure concerns. Alternative data dissemination methods include releasing summary tables for small areas and providing access to restricted identifiers through Research Data Centers. This dissertation proposes a new method of disseminating public-use microdata that contain more geographic detail than is currently released. The basic idea is to replace the observed survey values with imputed, or synthetic, values. Data confidentiality is enhanced because no actual values are released. Three statistical methods for generating synthetic data for small geographic areas are proposed. The first method uses a fully parametric hierarchical Bayesian model to generate synthetic microdata from the posterior predictive distribution. The second method is a nonparametric procedure for generating synthetic data for continuous variables with non-normal distributions. The third method accounts for complex sample design features and permits the generation of synthetic data for both sampled and nonsampled small areas. The three methods are demonstrated using a mix of public-use and restricted microdata from the American Community Survey and the National Health Interview Survey, and each is evaluated using empirical, simulation, and cross-validation studies. The analytic validity of the methods is assessed by comparing the small area estimates obtained from the synthetic data with those obtained from the observed data.
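
The general idea behind the first method (drawing synthetic values from a posterior predictive distribution, then comparing small area estimates from synthetic and observed data) can be illustrated with a minimal Python sketch. This is not the dissertation's model: it assumes a toy normal-normal hierarchical model with known variances and a flat prior on the overall mean, and the function name `synthesize_small_areas`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np

def synthesize_small_areas(y, area, sigma2, tau2, n_synthetic=5, seed=0):
    """Draw synthetic copies of y from the posterior predictive distribution.

    Assumed toy model (not taken from the dissertation):
      y_ia ~ N(theta_a, sigma2), theta_a ~ N(mu, tau2), flat prior on mu,
      with sigma2 and tau2 treated as known.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    areas, idx = np.unique(area, return_inverse=True)
    n = np.bincount(idx)                      # sample size per area
    ybar = np.bincount(idx, weights=y) / n    # observed area means

    v = sigma2 / n                            # sampling variance of each area mean
    w = 1.0 / (v + tau2)                      # precision weights for mu
    mu_mean = np.sum(w * ybar) / np.sum(w)    # posterior mean of mu
    mu_var = 1.0 / np.sum(w)                  # posterior variance of mu
    shrink = v / (v + tau2)                   # shrinkage toward mu

    datasets = []
    for _ in range(n_synthetic):
        mu = rng.normal(mu_mean, np.sqrt(mu_var))
        theta = rng.normal(shrink * mu + (1 - shrink) * ybar,
                           np.sqrt((1 - shrink) * v))
        # Every released value is a model draw; no observed value is copied.
        datasets.append(rng.normal(theta[idx], np.sqrt(sigma2)))
    return areas, np.array(datasets)

# Simulated "observed" data: 4 small areas with 200 respondents each.
rng = np.random.default_rng(42)
area_ids = np.repeat(np.arange(4), 200)
true_means = np.array([-2.0, 0.0, 1.5, 4.0])
y_obs = rng.normal(true_means[area_ids], 1.0)

areas, synthetic = synthesize_small_areas(y_obs, area_ids, sigma2=1.0, tau2=4.0)

# Analytic validity check: area means from synthetic vs. observed data.
obs_means = np.array([y_obs[area_ids == a].mean() for a in areas])
syn_means = np.array([synthetic[:, area_ids == a].mean() for a in areas])
for a, o, s in zip(areas, obs_means, syn_means):
    print(f"area {a}: observed mean {o:.3f}, synthetic mean {s:.3f}")
```

Generating several synthetic data sets rather than one follows the multiple-imputation logic named in the keywords; here the area estimates are simply averaged across the copies, without the combining rules needed for variance estimation.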

Attachments
Files | Size | Format | View
Synthetic Data for Small Area Estimation. | 10991KB | PDF | download