International Conference on Computing and Applied Informatics 2016 | |
Comparison between BIDE, PrefixSpan, and TRuleGrowth for Mining of Indonesian Text | |
Physics; Computer Science | |
Sa'Adillah Maylawati, Dian^1 ; Irfan, Mohamad^1 ; Budiawan Zulfikar, Wildan^1 | |
Department of Informatics, State Islamic University of Bandung, Indonesia^1 | |
Keywords: Bag of words; Indonesian languages; Indonesians; Prefix spans; Sequential patterns; | |
Others : https://iopscience.iop.org/article/10.1088/1742-6596/801/1/012067/pdf DOI : 10.1088/1742-6596/801/1/012067 |
Subject classification: Computer Science (General) | |
Source: IOP | |
[Abstract]
Text mining for the Indonesian language remains an active research topic. Representing text as sequences of multiple words has been claimed to preserve meaning better than a bag-of-words representation. In this paper, we compare several sequential pattern mining algorithms: BIDE (BIDirectional Extension), PrefixSpan, and TRuleGrowth. All of these algorithms produce frequent word sequences that preserve the meaning of the text. However, experimental results on 14,006 Indonesian tweets from Twitter show that BIDE produces a more efficient set of frequent word sequences than PrefixSpan and TRuleGrowth without losing the meaning of the text. PrefixSpan has a faster average processing time than BIDE and TRuleGrowth. On the other hand, PrefixSpan and TRuleGrowth use memory more efficiently than BIDE.
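To make the idea of a frequent word sequence concrete, the following is a minimal Python sketch of PrefixSpan-style mining over tokenized text. It is not the authors' implementation: the sample tweets, the function names (`prefixspan`, `closed_only`, `is_subsequence`), and the support threshold are all assumptions chosen for illustration. The closedness filter is a naive post-processing step, not the BIDE algorithm itself, which checks closure during the search; it is included only to show why closed patterns give a smaller result set. TRuleGrowth, which mines sequential rules, is not covered here.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent word subsequence.

    A pattern is a tuple of words that appear in that order (gaps allowed)
    in at least `min_support` sequences.
    """
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start position) pairs: the
        # suffixes that remain after matching `prefix` as early as possible.
        occurrences = defaultdict(list)
        for sid, start in projected:
            first_seen = {}
            for pos in range(start, len(sequences[sid])):
                first_seen.setdefault(sequences[sid][pos], pos)
            for word, pos in first_seen.items():
                occurrences[word].append((sid, pos + 1))
        for word, next_projected in occurrences.items():
            # Each sequence contributes at most one entry, so the list
            # length equals the support of the extended pattern.
            if len(next_projected) >= min_support:
                pattern = prefix + (word,)
                results[pattern] = len(next_projected)
                mine(pattern, next_projected)

    mine((), [(sid, 0) for sid in range(len(sequences))])
    return results


def is_subsequence(small, big):
    """True if the words of `small` occur in `big` in the same order."""
    remaining = iter(big)
    return all(word in remaining for word in small)


def closed_only(patterns):
    """Keep patterns that have no longer super-pattern with equal support.

    Naive quadratic filter for illustration only (not BIDE).
    """
    return {
        p: s
        for p, s in patterns.items()
        if not any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in patterns.items()
        )
    }


# Hypothetical tokenized tweets, invented for this example.
tweets = [
    ["saya", "suka", "makan", "nasi"],
    ["saya", "makan", "nasi", "goreng"],
    ["dia", "suka", "nasi"],
]

patterns = prefixspan(tweets, min_support=2)
closed = closed_only(patterns)
```

On this toy input, mining all frequent sequences yields nine patterns, while the closed subset contains three: `("nasi",)`, `("suka", "nasi")`, and `("saya", "makan", "nasi")`. This is one plausible reading of the abstract's claim that BIDE yields a more efficient set of word sequences, since every dropped pattern is contained in a kept pattern with the same support.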