BMC Bioinformatics
SparSNP: Fast and memory-efficient analysis of all SNPs for phenotype prediction
Software
Michael Inouye1  Justin Zobel2  Gad Abraham2  Adam Kowalczyk2
[1] Immunology Division, The Walter and Eliza Hall Institute of Medical Research, 3052, Parkville, Victoria, Australia; Departments of Pathology and of Microbiology & Immunology, The University of Melbourne, 3010, Parkville, Victoria, Australia; NICTA Victoria Research Lab, Department of Computing and Information Systems, The University of Melbourne, 3010, Parkville, Victoria, Australia
Keywords: Celiac Disease; Loss Function; Lasso; Coordinate Descent; Double Exponential
DOI: 10.1186/1471-2105-13-88
Received 2012-01-09, accepted 2012-05-10, published 2012
Source: Springer
【 Abstract 】
Background: A central goal of genomics is to predict phenotypic variation from genetic variation. Fitting predictive models to genome-wide and whole-genome single nucleotide polymorphism (SNP) profiles allows us to estimate the predictive power of the SNPs and potentially develop diagnostic models for disease. However, many current datasets cannot be analysed with standard tools due to their large size.

Results: We introduce SparSNP, a tool for fitting lasso linear models to massive SNP datasets quickly and with very low memory requirements. In an analysis of a large celiac disease case/control dataset, we show that SparSNP runs substantially faster than four other state-of-the-art tools for fitting large-scale penalised models. SparSNP was one of only two tools that could successfully fit models to the entire celiac disease dataset, and it did so with superior performance. In cross-validation, the models generated by SparSNP had predictive performance better than or equal to that of the models from the other tools.

Conclusions: Genomic datasets are rapidly increasing in size, rendering existing approaches to model fitting impractical due to their prohibitive time or memory requirements. This study shows that SparSNP is an essential addition to the genomic analysis toolkit.

SparSNP is available at http://www.genomics.csse.unimelb.edu.au/SparSNP
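The abstract and keywords mention lasso models and coordinate descent but do not describe the algorithm. As an illustration only, the following is a minimal sketch of cyclic coordinate descent for a squared-loss lasso on a toy genotype matrix. It is not SparSNP's implementation: the function names `soft_threshold` and `lasso_cd`, the squared-loss objective, the absence of an intercept, and the toy data are all assumptions made for this example.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values in [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * sum((y - X b)^2) + lam * sum(|b|).

    Hypothetical helper: X is a list of n rows with p floats each,
    y is a list of n floats. No intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    # Mean squared value of each column, reused in every update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant-zero column carries no signal
            # Covariance of column j with the residual, with column j's
            # current contribution added back in.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n
            rho += col_sq[j] * beta[j]
            new_b = soft_threshold(rho, lam) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                # Update the residual in place instead of recomputing X beta.
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new_b
    return beta


# Toy data: 4 samples, 2 SNPs coded as 0/1/2 allele counts; y = 2 * SNP0.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [1.0, 2.0]]
y = [0.0, 2.0, 4.0, 2.0]

beta_dense = lasso_cd(X, y, lam=0.0)   # no penalty: ordinary least squares
beta_sparse = lasso_cd(X, y, lam=2.5)  # moderate penalty: SNP1 is dropped
beta_null = lasso_cd(X, y, lam=5.0)    # heavy penalty: every weight is zero
```

Each update touches a single column and keeps only the residual vector in memory, which is one reason coordinate descent is a common choice for problems with very many predictors. Increasing `lam` drives more coefficients to exactly zero, which gives the sparse models that the lasso is used for.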
【 License 】
CC BY
© Abraham et al.; licensee BioMed Central Ltd. 2012. This article is published under license to BioMed Central Ltd.