Journal Article Details
Algorithms
A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification
Dario Piga [1], Michela Sperti [2], Marco A. Deriu [2], Francesca Venturini [3], Umberto Michelucci [3]
[1] IDSIA - Dalle Molle Institute for Artificial Intelligence, USI-SUPSI, Via la Santa 1, 6962 Lugano, Switzerland
[2] PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, 10129 Turin, Italy
[3] TOELT LLC, Machine Learning Research and Development, Birchlenstr. 25, 8600 Dübendorf, Switzerland
Keywords: machine learning; intrinsic limits; ROC curve; binary classification; area under the curve; Naïve Bayes classifier
DOI: 10.3390/a14110301
Source: DOAJ
[Abstract]

This paper presents the intrinsic limit determination algorithm (ILD algorithm), a novel technique for determining the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features, regardless of the model used. This limit, namely the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper, the algorithm is described in detail, its entire mathematical framework is presented, and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
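
The idea of a model-independent limit for categorical features can be illustrated with a short Python sketch. This is not the paper's pseudocode, which is not reproduced in this record; it is a minimal illustration under stated assumptions: labels are the integers 0 and 1, each row is a tuple of categorical values, and the limits are computed on the given sample itself. The function names `best_accuracy` and `best_auc` are hypothetical. Rows with identical feature values cannot be told apart by any model, so the best accuracy comes from predicting the majority class in each group, and the best AUC comes from ranking groups by their fraction of positives.

```python
from collections import Counter, defaultdict


def _group_counts(rows, labels):
    """Count class labels for each distinct combination of feature values."""
    counts = defaultdict(Counter)
    for features, label in zip(rows, labels):
        counts[tuple(features)][label] += 1
    return counts


def best_accuracy(rows, labels):
    """Highest accuracy any classifier can reach on this sample (assumption:
    the classifier sees only the categorical features in `rows`)."""
    counts = _group_counts(rows, labels)
    # Within a group, the best choice is the majority class.
    correct = sum(max(c.values()) for c in counts.values())
    return correct / len(labels)


def best_auc(rows, labels):
    """Highest AUC any scoring function of the features can reach on this sample.
    Assumes labels are 0/1 and both classes are present."""
    counts = _group_counts(rows, labels)
    # Rank groups by their fraction of positives, lowest first.
    groups = sorted(counts.values(), key=lambda c: c[1] / (c[0] + c[1]))
    n_pos = sum(c[1] for c in groups)
    n_neg = sum(c[0] for c in groups)
    neg_seen = 0
    u = 0.0
    for c in groups:
        # Positives here outrank negatives in earlier groups and tie
        # (half credit) with negatives in the same group.
        u += c[1] * (neg_seen + 0.5 * c[0])
        neg_seen += c[0]
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    rows = [("a",), ("a",), ("a",), ("b",), ("b",), ("b",)]
    labels = [1, 1, 0, 0, 0, 1]
    print(best_accuracy(rows, labels), best_auc(rows, labels))
```

In this toy sample each feature value contains observations of both classes, so no model can score perfectly; the sketch reports the ceiling imposed by that overlap.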

[License]

Unknown   
