Frequent pattern mining is the task of finding patterns that occur frequently in a dataset. The problem has various applications, including market basket analysis, bioinformatics, and index selection. FP-growth is one of the best-known algorithms for frequent pattern mining. It finds frequent patterns efficiently using a tree-based data structure called the FP-tree. However, the memory usage of FP-growth is high because of the many pointers stored in the tree. In this thesis, we propose a memory-efficient array-based data structure called Array-set, together with DFS mining, an algorithm that supports the mining operation on it in reasonable time. Through an experimental study, we compare the two algorithms and discuss how the two data structures relate to memory and time efficiency for the frequent pattern mining problem. Our experimental results confirm the following observations: the construction time and the memory usage of the Array-set are smaller than those of the FP-tree, while the mining time of the Array-set is greater than that of the FP-tree. Our approach of replacing a tree with an array can also be applied to other research that aims to improve the memory efficiency of tree-based algorithms.
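To make the problem statement concrete, the following is a minimal brute-force sketch of frequent pattern mining on a toy market-basket dataset. It is not FP-growth, and it is not the Array-set or DFS mining proposed in the thesis; it only illustrates what "a pattern that occurs frequently" means. The function name `frequent_patterns`, the `min_support` parameter, and the sample baskets are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(transactions, min_support):
    """Return every itemset contained in at least `min_support` transactions.

    Hypothetical helper for illustration only: it enumerates all subsets of
    each transaction, which is exponential in transaction length.
    """
    counts = Counter()
    for transaction in transactions:
        # Deduplicate and sort so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # Keep only the itemsets that meet the support threshold.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


# Toy dataset (assumed, not taken from the thesis).
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

result = frequent_patterns(baskets, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

With a threshold of 2, every single item and every pair is frequent, while the triple of all three items appears in only one basket and is dropped. Algorithms such as FP-growth avoid this exhaustive enumeration by compressing the transactions into a compact structure first.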
DFS mining: array-based solution for frequent pattern mining