Journal Article Details
Applied Sciences
A Study on High-Speed Outlier Detection Method of Network Abnormal Behavior Data Using Heterogeneous Multiple Classifiers
Seonghyeon Gong [1], Jaeik Cho [2], Ken Choi [2]
[1] Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea
[2] Illinois Institute of Technology, Chicago, IL 60616, USA
Keywords: noise reduction; outlier detection; intrusion detection; machine learning for IDS
DOI: 10.3390/app12031011
Source: DOAJ
【 Abstract 】

As the complexity and scale of network environments continue to grow, methods that detect attacks and intrusions by classifying network behavior as normal or abnormal are showing their limitations. The number of network traffic signatures is increasing exponentially, to the extent that semi-real-time detection is no longer possible. Machine learning-based intrusion detection, however, provides only simple guidelines about the contents of security events. This is because security data for a specific environment cannot be configured, owing to data noise, diversification, and continuous changes in system and network environments. Even when machine learning is trained and evaluated on a generalized dataset, its performance can be expected to be similar only in that specific network environment. In this study, we propose a high-speed outlier detection method that customizes a network dataset in real time for a continuously changing network environment. The proposed method uses an ensemble-based noise filtering model built on the voting results of six classifiers (decision tree, random forest, support vector machine, naive Bayes, k-nearest neighbors, and logistic regression) to reflect the distribution and environmental characteristics of the datasets. To evaluate the proposed method, we measured attack detection accuracy while gradually reducing the noise data in a time-series dataset. In the experiment, the proposed method maintained a training dataset small enough for semi-real-time learning, 10% of the total training dataset, while showing the same level of accuracy as a detection model trained on the large training dataset. These results could serve as a basis for automatic tuning of network datasets and machine learning for special-purpose environments and devices such as ICS environments.
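
The voting step of such an ensemble noise filter can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the voting rule, so the `min_agree` threshold, the function name `vote_filter`, the dictionary keys, and the toy labels and predictions are all assumptions. Each classifier's predictions are assumed to have been produced beforehand on held-out data.

```python
from typing import Dict, List, Sequence, Tuple

# Toy ground-truth labels for five traffic records (0 = normal, 1 = attack).
# Hypothetical data, for illustration only.
labels: List[int] = [0, 1, 1, 0, 1]

# Hypothetical predictions from the six classifier families named in the abstract.
predictions: Dict[str, List[int]] = {
    "decision_tree":       [0, 1, 0, 0, 1],
    "random_forest":       [0, 1, 0, 0, 1],
    "svm":                 [0, 1, 0, 1, 1],
    "naive_bayes":         [0, 0, 0, 1, 1],
    "knn":                 [0, 1, 1, 1, 0],
    "logistic_regression": [0, 1, 0, 0, 1],
}


def vote_filter(
    labels: Sequence[int],
    predictions: Dict[str, Sequence[int]],
    min_agree: int = 4,  # assumed threshold; the source does not specify one
) -> Tuple[List[int], List[int]]:
    """Split sample indices into (kept, noisy) by classifier agreement.

    A sample is kept when at least `min_agree` classifiers predict the
    label it carries; otherwise it is flagged as noise.
    """
    kept: List[int] = []
    noisy: List[int] = []
    for i, label in enumerate(labels):
        # Count how many classifiers agree with the recorded label.
        agree = sum(1 for preds in predictions.values() if preds[i] == label)
        if agree >= min_agree:
            kept.append(i)
        else:
            noisy.append(i)
    return kept, noisy


kept, noisy = vote_filter(labels, predictions, min_agree=4)
print("kept:", kept)
print("noisy:", noisy)
```

Raising `min_agree` filters more aggressively and shrinks the retained training set, while lowering it keeps more borderline samples. Which setting reaches a target such as 10% of the original data depends on the dataset.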

【 License 】

Unknown   
