Journal Article Details
EPJ Data Science
Leveraging augmentation techniques for tasks with unbalancedness within the financial domain: a two-level ensemble approach
Regular Article
Golshid Ranjbaran [1], Gianfranco Lombardo [2], Diego Reforgiato Recupero [3], Sergio Consoli [4]
[1] Department of Electrical and Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran
[2] Department of Engineering and Architecture, University of Parma, Parco Area delle Scienze, 43125, Parma, Italy
[3] Department of Mathematics and Computer Science, University of Cagliari, via Ospedale 72, 09121, Cagliari, Italy
[4] Joint Research Centre (DG JRC), European Commission, Via E. Fermi 2749, 21027, Ispra (VA), Italy
Keywords: Augmentation techniques; Ensemble method; Financial sector; Machine learning; Unbalanced data
DOI: 10.1140/epjds/s13688-023-00402-9
Received 2022-11-14, accepted 2023-06-26, published 2023
Source: Springer
【 Abstract 】

Modern financial markets produce massive datasets that need to be analysed using new modelling techniques, such as those from (deep) Machine Learning and Artificial Intelligence. The common goal of these techniques is to forecast the behaviour of the market, which can be translated into various classification tasks, for instance predicting the likelihood of companies’ bankruptcy or detecting fraud. However, real-world financial data are often unbalanced, meaning that the classes are not equally represented in such datasets. This is the main issue, since a Machine Learning model trained on such data learns mainly from the majority class, leading to inaccurate predictions on the minority class. In this paper, we explore different data augmentation techniques to deal with very unbalanced financial data. We consider a number of publicly available datasets, apply state-of-the-art augmentation strategies to them, and evaluate the results for several Machine Learning models trained on the sampled data. The performance of the various approaches is evaluated according to their accuracy and their micro and macro F1 scores, and by analysing the precision and recall over the minority class. We show that a consistent improvement is achieved when data augmentation is employed. The obtained classification results look promising and indicate the effectiveness of augmentation strategies on financial tasks. On the basis of these results, we present an approach focused on classification tasks within the financial domain that takes a dataset as input, identifies what kind of augmentation technique to use, and then applies an ensemble of all the augmentation techniques of the identified type to the input dataset, along with an ensemble of different methods to tackle the underlying classification.
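
The point about majority-class bias and the choice of evaluation metrics can be illustrated with a small sketch. It is not the authors' pipeline: the toy labels, the helper names (`random_oversample`, `precision_recall_f1`, `macro_f1`), and the use of plain random oversampling as a stand-in for the augmentation strategies the paper evaluates are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class is as large as the largest one (a simple stand-in for
    the augmentation techniques discussed in the abstract)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def accuracy(y_true, y_pred):
    """Fraction of correct predictions. For single-label classification
    this coincides with the micro-averaged F1 score."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true))
    return sum(precision_recall_f1(y_true, y_pred, l)[2] for l in labels) / len(labels)


# Hypothetical toy data: 95 majority samples (0) and 5 minority samples (1).
y_true = [0] * 95 + [1] * 5
X = [[float(i)] for i in range(100)]

# A degenerate model that always predicts the majority class.
y_majority = [0] * 100
print("accuracy:", accuracy(y_true, y_majority))
print("macro F1:", round(macro_f1(y_true, y_majority), 4))
print("minority (P, R, F1):", precision_recall_f1(y_true, y_majority, 1))

# Balancing the training set before fitting any classifier.
X_bal, y_bal = random_oversample(X, y_true)
print("class counts after oversampling:", dict(Counter(y_bal)))
```

On this toy data the always-majority predictor scores high accuracy while its minority-class recall is zero, which is why the abstract reports macro F1 and minority-class precision and recall alongside accuracy. In a real experiment the oversampling would be applied to the training split only, so that duplicated minority samples do not leak into the test set.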

【 License 】

CC BY   
© The Author(s) 2023
