Journal article details
JOURNAL OF MULTIVARIATE ANALYSIS Volume: 116
Intrinsic dimension identification via graph-theoretic methods
Article
Brito, M. R. [1]; Quiroz, A. J. [2,3]; Yukich, J. E. [4]
[1] Univ Simon Bolivar, Dpto Matemat Puras & Aplicadas, Caracas 1080, Venezuela
[2] Univ Simon Bolivar, Dpto Computo Cientif Estadist, Caracas 1080, Venezuela
[3] Univ Los Andes, Dpto Matemat, Bogota, Colombia
[4] Lehigh Univ, Dept Math, Bethlehem, PA USA
Keywords: Intrinsic dimension; Graph theoretical methods; Stabilization methods; Dimensionality reduction
DOI: 10.1016/j.jmva.2012.12.007
Source: Elsevier
【 Abstract 】

Three graph theoretical statistics are considered for the problem of estimating the intrinsic dimension of a data set. The first is the reach statistic, $\bar{r}_{j,k}$, proposed in Brito et al. (2002) [4] for the problem of identification of Euclidean dimension. The second, $M_n$, is the sample average of squared degrees in the minimum spanning tree of the data, while the third statistic, $U_n(k)$, is based on counting the number of common neighbors among the $k$ nearest, for each pair of sample points $\{X_i, X_j\}$, $i < j \le n$. For the first and third of these statistics, central limit theorems are proved under general assumptions, for data living in an $m$-dimensional $C^1$ submanifold of $\mathbb{R}^d$, and in this setting, we establish the consistency of intrinsic dimension identification procedures based on $\bar{r}_{j,k}$ and $U_n(k)$. For $M_n$, asymptotic results are provided whenever data live in an affine subspace of Euclidean space. The graph theoretical methods proposed are compared, via simulations, with a host of recently proposed nearest neighbor alternatives. © 2013 Elsevier Inc. All rights reserved.
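
As a rough illustration of two of the quantities named in the abstract, the sketch below computes the average squared vertex degree in the Euclidean minimum spanning tree and the average number of shared neighbors among the k nearest over all pairs of points. It is a minimal sketch and not the authors' implementation: the function names (`mean_squared_mst_degree`, `mean_common_neighbors`), the use of Prim's algorithm, and the normalization of the common-neighbor count by the number of pairs are all assumptions made here, since the abstract does not specify them. It also assumes distinct pairwise distances, so that the nearest-neighbor sets are unambiguous.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Full Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def mst_degrees(dist):
    """Vertex degrees in the minimum spanning tree, built with Prim's algorithm."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each vertex to the tree
    parent = [-1] * n
    degree = [0] * n
    best[0] = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            degree[u] += 1
            degree[parent[u]] += 1
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return degree


def mean_squared_mst_degree(points):
    """Sample average of squared MST degrees (the quantity described for M_n)."""
    deg = mst_degrees(pairwise_distances(points))
    return sum(d * d for d in deg) / len(deg)


def mean_common_neighbors(points, k):
    """Average, over pairs i < j, of the number of shared k-nearest neighbors.

    The division by the number of pairs is an assumed normalization.
    """
    dist = pairwise_distances(points)
    n = len(points)
    nbrs = [
        set(sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k])
        for i in range(n)
    ]
    total = sum(len(nbrs[i] & nbrs[j]) for i, j in combinations(range(n), 2))
    return total / (n * (n - 1) / 2)
```

For four collinear points such as `[(0.0,), (1.0,), (3.0,), (7.0,)]`, the minimum spanning tree is a path with degrees 1, 2, 2, 1, so `mean_squared_mst_degree` returns 2.5. Both functions are quadratic or worse in the sample size and are meant only to make the definitions concrete.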

【 License 】

Free   
