Conference Paper Details
AMIA 2012 Annual Symposium
Deterministic Binary Vectors for Efficient Automated Indexing of MEDLINE/PubMed Abstracts
Manuel Wahle, MS (1); Dominic Widdows, PhD (2); Jorge R. Herskovic, MD, PhD (1, 3)
PID: 129455
Source: CEUR
【 Abstract 】
The need to maintain accessibility of the biomedical literature has led to the development of methods to assist human indexers by recommending index terms for newly encountered articles. Given the rapid expansion of this literature, it is essential that these methods be scalable. Document vector representations are commonly used for automated indexing, and Random Indexing (RI) provides the means to generate them efficiently. However, RI is difficult to implement in real-world indexing systems, as (1) efficient nearest-neighbor search requires retaining all document vectors in RAM, and (2) it is necessary to maintain a store of randomly generated term vectors to index future documents. Motivated by these concerns, this paper documents the development and evaluation of a deterministic binary variant of RI. The increased capacity demonstrated by binary vectors has implications for information retrieval, and the elimination of the need to retain term vectors facilitates distributed implementations, enhancing