Journal article details
Revista Brasileira de Epidemiologia
Evaluation of different blocking strategies in probabilistic record linkage
Coeli, Cláudia Medina [1]; Camargo Jr., Kenneth Rochel de [1]
[1] Universidade Federal do Rio de Janeiro
Keywords: Database; Probabilistic record linkage; Blocking; Epidemiology
DOI: 10.1590/S1415-790X2002000200006
Subject classification: Allergy and Clinical Immunology
Source: SciELO
【 Abstract 】

Blocking, the creation of logical blocks of records within the files to be linked, is one of the steps required when probabilistically linking large databases. This paper compares different blocking strategies and evaluates the effectiveness of a standardizing algorithm that we developed, which assigns the same spelling to similar-sounding first syllables of names. We linked a mortality database containing 59,065 death reports to a hospital death report database with 531 records, all of which had corresponding entries in the larger database. The blocking strategies were compared with regard to processing cost and the proportion of true matches lost. The multiple-step blocking strategy was more effective: it identified all the true matches while producing a total number of pairs smaller than that obtained with two different single-step strategies. Among the single-step strategies, the best result was achieved with a key that combines the soundex code of the first name with sex. The algorithm that standardizes the spelling of similar-sounding first syllables of names had no notable effect on either processing cost or the loss of true matches.
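To make the single-step key concrete, the following is a minimal sketch of blocking on the soundex code of the first name combined with sex, together with the two quantities the abstract compares: the number of candidate pairs (a proxy for processing cost) and the proportion of true matches lost. It is an illustration only, not the authors' implementation. The record layout (`id`, `first_name`, `sex`), the function names, and the toy records are all assumptions made up for this example, and the soundex variant is the standard American one, which may differ in detail from the one used in the study.

```python
import unicodedata

# Standard American Soundex digit groups (assumed variant).
_CODES = {ch: str(digit)
          for digit, group in enumerate(
              ["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1)
          for ch in group}


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if no letters)."""
    # Strip accents (e.g. 'José' -> 'Jose') and keep ASCII letters only.
    decomposed = unicodedata.normalize("NFKD", name.lower())
    letters = [c for c in decomposed if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # vowels reset prev, so a repeated code is kept
    return "".join(out)[:4].ljust(4, "0")


def blocking_key(rec):
    """Hypothetical single-step key: soundex(first name) + sex."""
    return soundex(rec["first_name"]) + rec["sex"].upper()


def candidate_pairs(small, large):
    """Pair records only when they fall in the same block."""
    index = {}
    for rec in large:
        index.setdefault(blocking_key(rec), []).append(rec["id"])
    return {(a["id"], b_id)
            for a in small
            for b_id in index.get(blocking_key(a), ())}


def lost_match_rate(pairs, true_matches):
    """Proportion of known true matches that blocking failed to pair."""
    lost = [m for m in true_matches if m not in pairs]
    return len(lost) / len(true_matches)


# Toy records invented for illustration (not data from the study).
hospital = [
    {"id": "h1", "first_name": "Maria", "sex": "F"},
    {"id": "h2", "first_name": "José", "sex": "M"},
    {"id": "h3", "first_name": "Walter", "sex": "M"},
]
mortality = [
    {"id": "m1", "first_name": "Mariah", "sex": "F"},
    {"id": "m2", "first_name": "Jose", "sex": "M"},
    {"id": "m3", "first_name": "Valter", "sex": "M"},
    {"id": "m4", "first_name": "Marta", "sex": "F"},
    {"id": "m5", "first_name": "Mario", "sex": "M"},
]
true_matches = {("h1", "m1"), ("h2", "m2"), ("h3", "m3")}

pairs = candidate_pairs(hospital, mortality)
print(len(pairs), "candidate pairs instead of", len(hospital) * len(mortality))
print("lost true matches:", lost_match_rate(pairs, true_matches))
```

In this toy run the key tolerates the variants Maria/Mariah and José/Jose, but Walter/Valter land in different blocks because Soundex preserves the first letter, so that true match is lost. A multiple-step strategy would run further passes with other keys over the records left unmatched, trading extra pairs for fewer lost matches.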

【 License 】

CC BY   
