| PATTERN RECOGNITION | Volume: 118 |
| A review of methods for imbalanced multi-label classification | |
| Review | |
| Tarekegn, Adane Nega1,4  Giacobini, Mario2  Michalak, Krzysztof3  | |
| [1] Univ Turin, Dept Math, Modelling & Data Sci Program, Turin, Italy | |
| [2] Univ Turin, Dept Vet Sci, Data Anal & Modeling Unit, Turin, Italy | |
| [3] Wroclaw Univ Econ, Dept Informat Technol, Wroclaw, Poland | |
| [4] Bahir Dar Univ, Bahir Inst Technol, Fac Comp, Bahir, Ethiopia | |
| Keywords: Imbalanced Data; Multi-label Classification; Imbalanced Classification; Machine learning; Imbalanced Approaches; Review on Imbalanced Classification; | |
| DOI : 10.1016/j.patcog.2021.107965 | |
| Source: Elsevier | |
【 Abstract 】
Contents: 1. Introduction; 2. Methods and statistical trends (2.1. Data sources and search strategy; 2.2. Selection of studies; 2.3. Statistical trends); 3. Characteristics of imbalanced multi-label datasets (3.1. Imbalance problems in MLD; 3.2. Characterization measures); 4. Approaches for imbalanced multi-label classification

Multi-Label Classification (MLC) is an extension of the standard single-label classification where each data instance is associated with several labels simultaneously.
MLC has gained much importance in recent years due to its wide range of application domains. However, the class imbalance problem has become an inherent characteristic of many multi-label datasets, where the samples and their corresponding labels are non-uniformly distributed over the data space. The imbalance problem in MLC poses challenges to multi-label data analytics, which can be viewed from three perspectives: imbalance within labels, among labels, and among label-sets. In this paper, we provide a review of the approaches for handling the imbalance problem in multi-label data by collecting the existing research work. As the first systematic study of approaches addressing the imbalance problem in MLC, this paper provides a comprehensive survey of the state-of-the-art methods for imbalanced MLC, including the characteristics of imbalanced multi-label datasets, evaluation measures, and a comparative analysis of the proposed methods. The study also discusses important results reported so far in the literature and highlights some of their strengths and limitations to guide future research. © 2021 Elsevier Ltd. All rights reserved.
【 License 】
Free
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| 10_1016_j_patcog_2021_107965.pdf | 1073KB | PDF | |