BMC Genomics
Genome-enabled prediction using probabilistic neural network classifiers
Research Article
José Crossa [1], Paulino Pérez-Rodríguez [2], Juan Manuel González-Camacho [2], Daniel Gianola [3], Leonardo Ornella [4]
[1] Biometrics and Statistics Unit (BSU), International Maize and Wheat Improvement Center (CIMMYT), Apdo Postal 6-641, 06600 24105, México DF, México
[2] Colegio de Postgraduados, Campus Montecillo, 056230, Texcoco, México, México
[3] Department of Animal Sciences, University of Wisconsin, 53706, Madison, USA
[4] NIDERA SEMILLAS S.A., Ruta 8 Km. 376, 2600, Venado Tuerto, Argentina
Keywords: Average precision; Bayesian classifier; Genomic selection; Machine-learning algorithm; Multi-layer perceptron; Non-parametric model
DOI: 10.1186/s12864-016-2553-1
Received 2015-10-03, accepted 2016-02-29, published 2016
Source: Springer
【 Abstract 】
Background: Multi-layer perceptron (MLP) and radial basis function neural networks (RBFNN) have been shown to be effective in genome-enabled prediction. Here, we evaluated and compared the classification performance of an MLP classifier versus that of a probabilistic neural network (PNN), to predict the probability of membership of one individual in a phenotypic class of interest, using genomic and phenotypic data as input variables. We used 16 maize and 17 wheat genomic and phenotypic datasets with different trait-environment combinations (sample sizes ranged from 290 to 300 individuals) with 1.4 k and 55 k SNP chips. Classifiers were tested using continuous traits that were categorized into three classes (upper, middle and lower) based on the empirical distribution of each trait, constructed on the basis of two percentiles (15–85 % and 30–70 %). We focused on the 15 and 30 % percentiles for the upper and lower classes for selecting the best individuals, as commonly done in genomic selection. Wheat datasets were also used with two classes. The criteria for assessing the predictive accuracy of the two classifiers were the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUCpr). Parameters of both classifiers were estimated by optimizing the AUC for a specific class of interest.

Results: The AUC and AUCpr criteria provided enough evidence to conclude that PNN was more accurate than MLP for assigning maize and wheat lines to the correct upper, middle or lower class for the complex traits analyzed. Results for the wheat datasets with continuous traits split into two and three classes showed that the performance of PNN with three classes was higher than with two classes when classifying individuals into the upper and lower (15 or 30 %) categories.

Conclusions: The PNN classifier outperformed the MLP classifier in all 33 (maize and wheat) datasets when using AUC and AUCpr for selecting individuals of a specific class. Use of PNN with Gaussian radial basis functions seems promising in genomic selection for identifying the best individuals. Categorizing continuous traits into three classes generally provided better classification than when using two classes, because classification accuracy improved when classes were balanced.
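
The workflow summarized in the abstract (percentile-based categorization of a continuous trait, a PNN with Gaussian radial basis functions, and AUC for a class of interest) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the smoothing parameter `sigma`, the use of empirical class frequencies as priors, and the synthetic marker data are all assumptions made for illustration. AUCpr is not covered here.

```python
import numpy as np


def categorize_trait(y, lower_pct=15.0, upper_pct=85.0):
    """Split a continuous trait into lower (0), middle (1) and upper (2)
    classes using empirical percentiles of y (e.g. 15-85 or 30-70)."""
    y = np.asarray(y, dtype=float)
    lo, hi = np.percentile(y, [lower_pct, upper_pct])
    labels = np.ones(y.shape, dtype=int)  # middle class by default
    labels[y <= lo] = 0
    labels[y >= hi] = 2
    return labels


def pnn_predict_proba(X_train, labels, X_test, sigma=1.0, n_classes=3):
    """Class-membership probabilities from a Gaussian-kernel PNN.

    Pattern layer: one Gaussian unit per training individual.
    Summation layer: kernel activations summed within each class
    (assumption: this implies priors equal to class frequencies).
    Output: summed activations normalized to add up to 1 per individual.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    labels = np.asarray(labels)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test ** 2).sum(axis=1)[:, None]
          + (X_train ** 2).sum(axis=1)[None, :]
          - 2.0 * X_test @ X_train.T)
    d2 = np.maximum(d2, 0.0)
    # Subtract the row minimum before exponentiating to avoid underflow;
    # the normalized probabilities are unchanged by this shift.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    scores = np.stack(
        [k[:, labels == c].sum(axis=1) for c in range(n_classes)], axis=1
    )
    return scores / scores.sum(axis=1, keepdims=True)


def auc_one_vs_rest(scores, is_positive):
    """AUC for one class of interest: the fraction of (positive, negative)
    pairs ranked correctly, counting ties as one half."""
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    pos, neg = scores[is_positive], scores[~is_positive]
    if pos.size == 0 or neg.size == 0:
        return float("nan")  # undefined when one group is empty
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((wins + 0.5 * ties) / (pos.size * neg.size))


# Synthetic example (not data from the study): 120 lines, 50 markers.
rng = np.random.default_rng(0)
X = rng.choice([0.0, 1.0, 2.0], size=(120, 50))
y = X @ rng.normal(size=50) + rng.normal(scale=0.5, size=120)
labels = categorize_trait(y, 15, 85)
train, test = np.arange(90), np.arange(90, 120)
proba = pnn_predict_proba(X[train], labels[train], X[test], sigma=3.0)
auc_upper = auc_one_vs_rest(proba[:, 2], labels[test] == 2)
```

In a setting like the one described, `sigma` would be the parameter tuned by optimizing the AUC for the class of interest (for example, the upper 15 % class), typically on held-out individuals.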
【 License 】
CC BY
© González-Camacho et al. 2016