| IEEE Access | |
| Online Feature Selection for Streaming Features Using Self-Adaption Sliding-Window Sampling | |
| Song Deng1  Xindong Wu2  Zhen Chen3  Chuan Ma3  Limin Shen3  Dianlong You3  Qiusheng Lian3  | |
| [1] Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing, China; [2] School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA, USA; [3] School of Information Science and Engineering, Yanshan University, Qinhuangdao, China | |
| Keywords: Feature selection; Markov blanket; online learning; sliding-window; streaming feature | |
| DOI: 10.1109/ACCESS.2019.2894121 | |
| Source: DOAJ | |
【 Abstract 】
In recent years, online feature selection has been a research topic in streaming feature mining, as it can reduce the dimensionality of streaming features by removing irrelevant and redundant features in real time. Representative research efforts on online feature selection with streaming features include alpha-investing, online streaming feature selection (OSFS), and the scalable and accurate online approach (SAOLA) for feature selection. Among these, alpha-investing has limited prediction accuracy and selects a large number of features. SAOLA sometimes offers outstanding running time and prediction accuracy, but it also selects a large number of features. OSFS offers high prediction accuracy on many datasets, but its running time grows exponentially as the number of features with low redundancy and high relevance increases. To address these limitations, we propose an online learning algorithm named OSFAS, which samples streaming features in real time with a self-adaption sliding window and discards irrelevant and redundant features using conditional independence. OSFAS obtains an approximate Markov blanket with high prediction accuracy while reducing the number of selected features. The efficiency of the proposed OSFASW algorithm was validated in a performance test on widely used datasets, e.g., NIPS2003 and Causality Workbench. Extensive experimental results demonstrate that OSFAS significantly improves prediction accuracy and selects fewer features than alpha-investing, OSFS, and SAOLA.
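The two filtering steps named in the abstract (dropping irrelevant features, then dropping redundant ones via conditional independence) can be illustrated with a small sketch. This is not the authors' algorithm. The fixed `window_size`, the Fisher z-test on partial correlations, the restriction to conditioning on a single feature, and the helper names `pearson`, `independent`, and `select_streaming` are all assumptions made for illustration. The self-adaption rule for the window is not described in the abstract and is not modeled here.

```python
import itertools
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def independent(x, y, z=None, alpha=0.01):
    """Fisher z-test (assumed test): True if x and y look independent, given z if set."""
    r, k = pearson(x, y), 0
    if z is not None:
        # First-order partial correlation of x and y given z.
        rxz, ryz = pearson(x, z), pearson(y, z)
        denom = math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
        r = (r - rxz * ryz) / denom if denom > 1e-12 else 0.0
        k = 1
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    stat = math.sqrt(len(x) - k - 3) * abs(math.atanh(r))
    p_value = math.erfc(stat / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def select_streaming(stream, target, window_size=3, alpha=0.01):
    """Process (name, column) pairs in fixed-size windows (hypothetical simplification)."""
    selected = {}  # name -> column
    window = []

    def flush():
        for name, col in window:
            if independent(col, target, alpha=alpha):
                continue  # irrelevant to the target
            if any(independent(col, target, s, alpha) for s in selected.values()):
                continue  # redundant given an already selected feature
            # The newcomer may make earlier selections redundant.
            stale = [n for n, s in selected.items() if independent(s, target, col, alpha)]
            for n in stale:
                del selected[n]
            selected[name] = col
        window.clear()

    for item in stream:
        window.append(item)
        if len(window) == window_size:
            flush()
    flush()  # handle a final, partially filled window
    return list(selected)


# Toy data: a balanced +/-1 design so unrelated factors are exactly uncorrelated.
rows = list(itertools.product([-1.0, 1.0], repeat=4)) * 32
cause = [a for a, b, c, d in rows]
target = [a + 0.5 * b for a, b, c, d in rows]
copy = [a + c for a, b, c, d in rows]   # noisy proxy of `cause`
noise = [d for a, b, c, d in rows]      # unrelated to the target

chosen = select_streaming(
    [("noise", noise), ("copy", copy), ("cause", cause)], target, window_size=2
)
print(chosen)
```

In this toy stream, `noise` fails the relevance check, `copy` is accepted at first and then removed once `cause` arrives, since `copy` carries no information about the target beyond what `cause` provides. Conditioning on only one feature at a time keeps the sketch short. A faithful Markov blanket search would also consider larger conditioning sets.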
【 License 】
Unknown