BioMedical Engineering OnLine
Machine learning, medical diagnosis, and biomedical engineering research - commentary
Kenneth R Foster [1], Robert Koprowski [3], Joseph D Skufca [2]
[1] Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA
[2] Department of Mathematics & Computer Science, Clarkson University, Box 5815, Potsdam, NY 13699-5815, USA
[3] Department of Biomedical Computer Systems, University of Silesia, Faculty of Computer Science and Materials Science, Institute of Computer Science, ul. Będzińska 39, Sosnowiec 41-200, Poland
Keywords: Support vector machine; Machine learning; Image processing; Classifiers; Artificial intelligence
DOI: 10.1186/1475-925X-13-94
Received 2014-05-08; accepted 2014-06-26; published 2014
【 Abstract 】

A large number of papers are appearing in the biomedical engineering literature that describe the use of machine learning techniques to develop classifiers for detection or diagnosis of disease. However, the usefulness of this approach in developing clinically validated diagnostic techniques so far has been limited and the methods are prone to overfitting and other problems which may not be immediately apparent to the investigators. This commentary is intended to help sensitize investigators as well as readers and reviewers of papers to some potential pitfalls in the development of classifiers, and suggests steps that researchers can take to help avoid these problems. Building classifiers should be viewed not simply as an add-on statistical analysis, but as part and parcel of the experimental process. Validation of classifiers for diagnostic applications should be considered as part of a much larger process of establishing the clinical validity of the diagnostic technique.

【 License 】

2014 Foster et al.; licensee BioMed Central Ltd.

【 Figures 】

Figure 1.

Figure 2.
