期刊论文详细信息
Austrian Journal of Statistics
Fast Approximate Complete-data k-nearest-neighbor Estimation
article
Alejandro Murua1  Nicolas Wicker1 
[1] University of Lille
关键词: big data;    dimension estimation;    distance distribution;    hash table;    high dimensionality;    symmetric k-nearest-neighbors.;   
DOI  :  10.17713/ajs.v49i2.907
学科分类:医学(综合)
来源: Austrian Statistical Society
PDF
【 摘 要 】

We introduce a fast method to estimate the complete-data set of k-nearest-neighbors.This is equivalent to finding an estimate of the k-nearest-neighbor graph of the data. The method relies on random normal projections. The k-nearest-neighbors are estimated by sorting points in a number of random lines. For very large datasets, the method is quasi-linear in the data size. As an application, we show that the intrinsic dimension of a manifold can be reliably estimated from the estimated set of k-nearest-neighbors in time about two orders of magnitude faster than when using the exact set of k-nearest-neighbors.

【 授权许可】

CC BY   

【 预 览 】
附件列表
Files Size Format View
RO202105240000027ZK.pdf 261KB PDF download
  文献评价指标  
  下载次数:11次 浏览次数:1次