Journal article details
BMC Bioinformatics
robustica: customizable robust independent component analysis
Software
Samuel Miravet-Verde [1], Sarah A. Head [1], Miquel Anglada-Girotto [1], Luis Serrano [2]
[1] Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain
Keywords: Bioinformatics; Independent component analysis; Clustering; Unsupervised learning; Low-grade glioma; Python
DOI: 10.1186/s12859-022-05043-9
Received 2022-02-15, accepted 2022-11-08, published 2022
Source: Springer
【 Abstract 】

Background: Independent Component Analysis (ICA) allows the dissection of omic datasets into modules that help to interpret global molecular signatures. The inherent randomness of this algorithm can be overcome by clustering many iterations of ICA together to obtain robust components. Existing algorithms for robust ICA are dependent on the choice of clustering method and on computing a potentially biased and large Pearson distance matrix.

Results: We present robustica, a Python-based package to compute robust independent components with a fully customizable clustering algorithm and distance metric. Here, we exploited its customizability to revisit and optimize robust ICA systematically. Of the 6 popular clustering algorithms considered, DBSCAN performed the best at clustering independent components across ICA iterations. To enable using Euclidean distances, we created a subroutine that infers and corrects the components’ signs across ICA iterations. Our subroutine increased the resolution, robustness, and computational efficiency of the algorithm. Finally, we show the applicability of robustica by dissecting over 500 tumor samples from low-grade glioma (LGG) patients, where we define two new gene expression modules with key modulators of tumor progression upon IDH1 and TP53 mutagenesis.

Conclusion: robustica brings precise, efficient, and customizable robust ICA into the Python toolbox. Through its customizability, we explored how different clustering algorithms and distance metrics can further optimize robust ICA. Then, we showcased how robustica can be used to discover gene modules associated with combinations of features of biological interest. Taken together, given the broad applicability of ICA for omic data analysis, we envision robustica will facilitate the seamless computation and integration of robust independent components in large pipelines.
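
The abstract notes that correcting component signs across ICA iterations is what makes Euclidean distances usable for clustering. The following is a minimal NumPy sketch of that idea, not robustica's implementation. The function name `correct_signs`, the simulated data, and the rule used here (flip each component so that its largest-magnitude weight is positive) are assumptions made for illustration; the abstract does not state which rule the package applies.

```python
import numpy as np


def correct_signs(components):
    """Flip each component (column) so its largest-magnitude weight is positive.

    Assumption: this "largest absolute weight" rule is only one plausible
    heuristic; the source does not specify the rule robustica uses.
    """
    components = np.asarray(components, dtype=float)
    # Row index of the largest absolute weight in every column.
    top = np.argmax(np.abs(components), axis=0)
    signs = np.sign(components[top, np.arange(components.shape[1])])
    signs[signs == 0] = 1.0  # leave all-zero columns untouched
    return components * signs


rng = np.random.default_rng(0)

# Hypothetical "true" component: 200 features, a few with heavy weights.
source = rng.normal(scale=0.1, size=200)
source[:5] += 3.0

# Two simulated ICA runs recover the same component with opposite signs.
run_a = source + rng.normal(scale=0.01, size=200)
run_b = -source + rng.normal(scale=0.01, size=200)
stacked = np.column_stack([run_a, run_b])

dist_before = np.linalg.norm(stacked[:, 0] - stacked[:, 1])
corrected = correct_signs(stacked)
dist_after = np.linalg.norm(corrected[:, 0] - corrected[:, 1])

print(f"Euclidean distance before sign correction: {dist_before:.3f}")
print(f"Euclidean distance after sign correction:  {dist_after:.3f}")
```

Before correction, the two runs look far apart in Euclidean terms even though they describe the same component. After correction they nearly coincide, so a density-based method such as DBSCAN could group them without relying on a sign-invariant Pearson distance.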

【 License 】

CC BY   
© The Author(s) 2022
