Journal Article Details
IEEE Access
An Out of Memory tSVD for Big-Data Factorization
Hristo Djidjev [1], Erik Skau [1], Gopinath Chennupati [1], Boian Alexandrov [2], Hector Carrillo-Cabada [2]
[1] Los Alamos National Laboratory, Information Sciences (CCS-3) Group, Los Alamos, NM, USA
[2] Los Alamos National Laboratory, Theoretical Division (T-1) Group, Los Alamos, NM, USA
Keywords: tSVD; out of memory; tensor train; singular vectors; tensor networks
DOI: 10.1109/ACCESS.2020.3000508
Source: DOAJ
【 Abstract 】

Singular value decomposition (SVD) is a matrix factorization method widely used for dimension reduction, data analytics, information retrieval, and unsupervised learning. For most big-data applications, only the singular values of the SVD are needed. However, methods such as tensor networks require an accurate computation of a substantial number of singular vectors, which can be accomplished through truncated SVD (tSVD). Additionally, many real-world datasets are too big to fit into the available memory, which mandates the development of out-of-memory algorithms that assume most of the data resides on an external disk during the entire computation. Such algorithms reduce communication with the disk and hide part of it by overlapping communication with computation on blocks of work. Here, building upon previous work on SVD for dense matrices, we present a method for computing a predetermined number, K, of singular vectors, and the corresponding K singular values, of a matrix that cannot fit in memory. Our out-of-memory tSVD can be used in tensor network algorithms. We describe ways to reduce communication during the computation of the left and right reflectors needed to compute the singular vectors, and introduce a method for estimating the block sizes needed to hide the communication on parallel file systems.
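To make the block-wise idea concrete, here is a minimal sketch of a blocked truncated SVD for a tall matrix processed one row block at a time. This is not the paper's algorithm (which targets parallel file systems and communication hiding); it is a standard TSQR-style approach, and the function name `blocked_tsvd` and its parameters are illustrative. The matrix argument could be an on-disk array such as a `numpy.memmap`, so that only one block is resident in memory at a time.

```python
import numpy as np

def blocked_tsvd(A, k, block_rows=256):
    """Top-k truncated SVD of a tall matrix A, read one row block at a time.

    TSQR-style sketch: take the QR of each row block, stack the small R
    factors, and reduce them to a single n x n factor whose SVD shares
    the singular values and right singular vectors of A.
    """
    m, n = A.shape
    rs = []
    for start in range(0, m, block_rows):
        block = np.asarray(A[start:start + block_rows])  # one block in memory
        _, r = np.linalg.qr(block)
        rs.append(r)
    stacked = np.vstack(rs)                 # small if n is modest
    _, r_final = np.linalg.qr(stacked)
    # A^T A = r_final^T r_final, so r_final has A's singular values
    # and right singular vectors.
    _, s, vt = np.linalg.svd(r_final)
    s, vt = s[:k], vt[:k]
    # Recover the top-k left singular vectors (U = A V S^{-1})
    # with one more pass over the row blocks.
    u = np.vstack([np.asarray(A[i:i + block_rows]) @ vt.T / s
                   for i in range(0, m, block_rows)])
    return u, s, vt
```

In an out-of-memory setting, the two passes over the row blocks are where disk reads occur, and overlapping those reads with the per-block QR and matrix multiply is precisely the kind of communication hiding the abstract describes.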

【 License 】

Unknown
