Journal Article Details
BMC Bioinformatics
Sparse data embedding and prediction by tropical matrix factorization
Hilal Kazan [1], Polona Oblak [2], Amra Omanović [2], Tomaž Curk [2]
[1] Department of Computer Engineering, Antalya Bilim University, Çıplaklı, Akdeniz Blv. No:290/A, 07190, Antalya, Turkey; [2] Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1000, Ljubljana, Slovenia
Keywords: Data embedding; Matrix factorization; Tropical factorization; Sparse data; Matrix completion; Tropical semiring
DOI: 10.1186/s12859-021-04023-9
Source: Springer
【 Abstract 】

Background: Matrix factorization methods are linear models, with limited capability to model complex relations. In our work, we use tropical semiring to introduce non-linearity into matrix factorization models. We propose a method called Sparse Tropical Matrix Factorization (STMF) for the estimation of missing (unknown) values in sparse data.

Results: We evaluate the efficiency of the STMF method on both synthetic data and biological data in the form of gene expression measurements downloaded from The Cancer Genome Atlas (TCGA) database. Tests on unique synthetic data showed that STMF approximation achieves a higher correlation than non-negative matrix factorization (NMF), which is unable to recover patterns effectively. On real data, STMF outperforms NMF on six out of nine gene expression datasets. While NMF assumes normal distribution and tends toward the mean value, STMF can better fit to extreme values and distributions.

Conclusion: STMF is the first work that uses tropical semiring on sparse data. We show that in certain cases semirings are useful because they consider the structure, which is different and simpler to understand than it is with standard linear algebra.
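
To make the idea of a tropical factorization concrete, the sketch below multiplies two small factor matrices in a tropical semiring and scores the result only on the observed entries of a sparse matrix. It assumes the max-plus convention (addition replaced by max, multiplication replaced by +), which the abstract does not specify, and it uses `None` to mark missing values. The function names `tropical_matmul` and `masked_abs_error` are hypothetical, and this is not the authors' fitting procedure, only an illustration of the product and of error evaluation on sparse data.

```python
# Illustrative sketch only: names and data are hypothetical, not taken from STMF.

def tropical_matmul(U, V):
    """Max-plus product: result[i][j] = max over k of (U[i][k] + V[k][j])."""
    rows, inner, cols = len(U), len(V), len(V[0])
    return [
        [max(U[i][k] + V[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def masked_abs_error(R, approx):
    """Sum of absolute errors over observed entries; None marks a missing value."""
    return sum(
        abs(r - a)
        for row_r, row_a in zip(R, approx)
        for r, a in zip(row_r, row_a)
        if r is not None
    )


U = [[0.0, 2.0], [1.0, 0.0], [3.0, 1.0]]   # 3 x 2 factor matrix
V = [[1.0, 0.0, 4.0], [2.0, 5.0, 0.0]]     # 2 x 3 factor matrix
R = [[4.0, None, 4.0],                     # sparse data, None = unknown
     [2.0, 5.0, None],
     [None, 6.0, 7.0]]

approx = tropical_matmul(U, V)        # dense rank-2 tropical approximation
error = masked_abs_error(R, approx)   # evaluated on observed entries only
```

In this toy case the product reproduces every observed entry of `R`, and the entries of `approx` at the `None` positions serve as the predictions for the missing values. Because each entry is a maximum of sums and not a sum of products, the approximation is non-linear in the factors.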

【 License 】

CC BY   
