Journal of Big Data
Autoencoder-kNN meta-model based data characterization approach for an automated selection of AI algorithms
Research
Mohamed Hamlich [1], Mourad Bouneffa [2], Adeel Ahmad [2], Moncef Garouani [3]
[1] CCPS Laboratory, ENSAM, University of Hassan II, Casablanca, Morocco
[2] Univ. Littoral Côte d’Opale, UR 4491, LISIC, Laboratoire d’Informatique Signal et Image de la Côte d’Opale, 62100 Calais, France
[3] Study and Research Center for Engineering and Management (CERIM), HESTIM, Casablanca, Morocco
Keywords: Algorithm selection; AutoML; Meta-learning; Meta-features; Data representation; kNN; Autoencoder
DOI: 10.1186/s40537-023-00687-7
Received: 2021-12-09; Accepted: 2023-01-21; Published: 2023
Source: Springer
【 Abstract 】
The recent evolution of machine learning (ML) algorithms and the high level of expertise required to use them have fuelled the demand for non-expert solutions. The selection of an appropriate algorithm and the configuration of its hyperparameters are among the most complicated tasks when applying ML to new problems, as they require a thorough awareness and knowledge of ML algorithms. The algorithm selection problem (ASP) is defined as the process of identifying the algorithm(s) that can deliver top performance for a particular problem, task, and evaluation measure. In this context, meta-learning is one approach to achieve this objective: it uses prior learning experiences to assist the learning process on unseen problems and tasks. As a data-driven approach, appropriate data characterization is of vital importance for meta-learning. The recent literature offers a variety of data characterization techniques, including simple, statistical, and information-theoretic measures, yet their quality still needs to be improved. In this paper, a new Autoencoder-kNN (AeKNN) based meta-model with built-in latent feature extraction is proposed. The approach aims to extract new characterizations of the data, with lower dimensionality but more significant and meaningful features. AeKNN internally uses a deep autoencoder to extract latent features from a set of existing meta-features induced from the dataset. The distances computed from these new feature vectors are more significant, thus providing a way to accurately recommend top-performing pipelines for previously unseen datasets. In an application to a large-scale hyperparameter optimization task, framed as a meta-learning problem over 400 real-world datasets with varying schemas, we show that AeKNN offers considerable improvements over classical kNN as well as traditional meta-models in terms of performance.
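The abstract outlines the AeKNN idea at a high level: encode each dataset's meta-feature vector with a deep autoencoder, then run kNN over the latent representations and recommend the pipelines that performed best on the nearest prior datasets. The following Python sketch illustrates that idea only; the layer sizes, latent dimension, and synthetic meta-dataset are illustrative assumptions, not the authors' configuration.

```python
# Minimal AeKNN-style sketch: autoencoder-compressed meta-features + kNN
# recommendation. All sizes and data below are illustrative placeholders.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from tensorflow import keras
from tensorflow.keras import layers

n_meta, latent_dim = 50, 8  # assumed meta-feature count and latent size

# Symmetric deep autoencoder over meta-feature vectors (functional API,
# so the encoder half can be reused on its own after training).
inputs = keras.Input(shape=(n_meta,))
z = layers.Dense(32, activation="relu")(inputs)
z = layers.Dense(latent_dim, activation="relu")(z)
out = layers.Dense(32, activation="relu")(z)
out = layers.Dense(n_meta, activation="linear")(out)
autoencoder = keras.Model(inputs, out)
encoder = keras.Model(inputs, z)
autoencoder.compile(optimizer="adam", loss="mse")

# Synthetic meta-dataset: one meta-feature vector per prior dataset, plus
# the id of the pipeline that performed best on it (stand-in labels).
rng = np.random.default_rng(0)
X_meta = rng.normal(size=(400, n_meta)).astype("float32")
best_pipeline = rng.integers(0, 10, size=400)

# Train the autoencoder to reconstruct the meta-features, then index the
# latent representations with a kNN structure.
autoencoder.fit(X_meta, X_meta, epochs=50, batch_size=32, verbose=0)
Z = encoder.predict(X_meta, verbose=0)
knn = NearestNeighbors(n_neighbors=3).fit(Z)

def recommend_pipelines(meta_vec):
    """Return best-known pipeline ids of the k nearest prior datasets,
    where nearness is measured in the autoencoder's latent space."""
    z_new = encoder.predict(meta_vec[None, :], verbose=0)
    _, idx = knn.kneighbors(z_new)
    return best_pipeline[idx[0]]

print(recommend_pipelines(X_meta[0]))  # nearest neighbours include dataset 0
```

The key design point the abstract emphasizes is that distances are computed in the learned latent space rather than over the raw meta-features, so the neighbourhood used for recommendation reflects a lower-dimensional, more meaningful characterization of the datasets.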
【 授权许可】
CC BY
© The Author(s) 2023
【 Preview 】
Files | Size | Format | View
---|---|---|---
RO202305151743283ZK.pdf | 1713KB | PDF | download