Conference Paper Details
International Conference on Informatics, Engineering, Science and Technology
Classification Consumer Credit for Missing Value Dataset
Computer science; Industrial technology
Noviandi, I.^1 ; Sumitra, I.D.^1
Universitas Komputer Indonesia (UNIKOM), Bandung, Indonesia^1
Keywords: Categorical data; Classification algorithm; Classification and regression tree; Classification rates; Consumer credits; Customer profiles; Decision-tree algorithm; Logistic regressions
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/407/1/012173/pdf
DOI  :  10.1088/1757-899X/407/1/012173
Source: IOP
【 Abstract 】

The objective of this study is to find the best method for constructing a model that predicts future failure as a function of variables obtained from the customer profile. Decision Tree and Logistic Regression are classification algorithms. One Decision Tree algorithm is Classification and Regression Tree (CART), which can be used to analyze both numeric and categorical data. Logistic Regression is more accurate than Decision Tree. In practice, datasets contain some missing values. Amelia II is the best method for estimating missing values in numeric and categorical data. This study combines Amelia II to estimate missing values, Decision Tree to screen and re-categorize variables, and Logistic Regression to classify debtors into 'good' and 'bad' risk classes. We found that the accuracy of this combined method remains constant up to 40% missing values: the Correct Classification Rate (CCR) for 10% to 40% missing values is the same as the CCR for the dataset without missing values. Above 40% missing values, accuracy decreases. The method is therefore effective when the share of missing values in the dataset is below 40%. We recommend that banks apply this method to classify debtor risk when the share of missing values is below 40%.
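
The evaluation described in the abstract rests on two quantities: the share of missing values in the dataset and the Correct Classification Rate (CCR). The following is a minimal sketch of both, not the authors' code. It assumes the usual definition of CCR (correctly classified cases divided by all cases) and assumes values are removed completely at random, since the abstract does not say how missing values were generated. The function names, the sample profiles, and the labels are hypothetical.

```python
import random


def correct_classification_rate(actual, predicted):
    """Fraction of cases whose predicted class equals the actual class.

    Assumed definition: CCR = correct predictions / total cases.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def mask_completely_at_random(rows, rate, seed=0):
    """Return a copy of rows with round(rate * cells) cells set to None.

    Assumption: cells are chosen uniformly at random, independent of
    their values (the abstract does not describe the mechanism).
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    n_missing = round(rate * len(cells))
    masked = [list(row) for row in rows]
    for i, j in rng.sample(cells, n_missing):
        masked[i][j] = None
    return masked


# Hypothetical debtor labels: 4 of the 5 predictions match.
actual = ["good", "good", "bad", "good", "bad"]
predicted = ["good", "bad", "bad", "good", "bad"]
ccr = correct_classification_rate(actual, predicted)

# Hypothetical customer profiles: 5 rows x 4 columns = 20 cells.
profiles = [
    [35, "married", 4.2, "owner"],
    [22, "single", 1.8, "renter"],
    [47, "married", 6.5, "owner"],
    [51, "single", 3.1, "renter"],
    [29, "married", 2.7, "owner"],
]
masked = mask_completely_at_random(profiles, rate=0.40)
n_missing = sum(cell is None for row in masked for cell in row)
```

In a full experiment, the masked table would be passed to an imputation step (Amelia II in the study), and the CCR of the classifier fitted on the imputed data would be compared with the CCR on the complete data at each missing-value rate.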
