Journal article

【Abstract】

In the analysis of current genomic data, the application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated by two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this structure and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to deal efficiently with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in their application to complex genomic data. Although some challenges remain for future studies, important steps forward were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become even more useful with more complex data sets.

【License】

CC BY
© König et al. 2015

BMC Genetics
Machine learning and data mining in complex genomic data—a review on the lessons learned in Genetic Analysis Workshop 19
Proceedings
Emily R. Holzinger¹, Elizabeth Held², Nathan Tintle³, Jonathan Auerbach⁴, Rui Sun⁵, Marc-André Legault⁶, Damian Gola⁷, Inke R. König⁷, Hsin-Chou Yang⁸
[1] Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, 21224, Baltimore, MD, USA
[2] Department of Mathematics, Iowa State University, 50011, Ames, IA, USA
[3] Department of Mathematics, Statistics and Computer Science, Dordt College, 51250, Sioux Center, IA, USA
[4] Department of Statistics, Columbia University, 10027, New York, NY, USA
[5] Division of Biostatistics, School of Public Health and Primary Care, the Chinese University of Hong Kong, Shatin, SAR, Hong Kong
[6] Faculty of Medicine, Université de Montréal, 2900 Chemin de la Tour, H3T 1N8, Montreal, QC, Canada
[7] Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Universitätsklinikum Schleswig-Holstein, Campus Lübeck, Lübeck, Germany
[8] Institute of Statistical Science, Academia Sinica, Nankang 115, Taipei, Taiwan
Keywords: Support Vector Machine; Random Forest; Rare Variant; Machine Learning Method; Multifactor Dimensionality Reduction
DOI : 10.1186/s12863-015-0315-8
Source: Springer