Conference Paper Details
International Conference on Computing and Applied Informatics 2016
Set of Frequent Word Item sets as Feature Representation for Text with Indonesian Slang
Physics; Computer Science
Sa'Adillah Maylawati, Dian^1 ; Putri Saptawati, G.A.^2
Department of Informatics, State Islamic University of Bandung, Indonesia^1
Informatics, Bandung Institute of Technology, Indonesia^2
Keywords: Feature representation; FP-growth algorithm; Indonesians; Item sets; Social media; Text data; Text mining; Text representation
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/801/1/012066/pdf
DOI  :  10.1088/1742-6596/801/1/012066
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

Indonesian slang is commonly used in social media. Because its syntax is unstructured, it is difficult to extract features for text mining based on Indonesian grammar. To address this, we propose the Set of Frequent Word Item sets (SFWI) as a text representation that is considered a good match for Indonesian slang. In addition, SFWI is able to keep the meaning of Indonesian slang with regard to the order of appearance of sentences. We use the FP-Growth algorithm, extended with a sentence separation function, to extract the SFWI features. The experiments were conducted on text data from social media such as Facebook and Twitter, and from personal websites. The experimental results show that Indonesian slang was interpreted more correctly based on SFWI.
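
The idea of mining frequent word item sets sentence by sentence can be illustrated with a minimal Python sketch. It is not the authors' implementation: it replaces FP-Growth with brute-force enumeration of small word combinations, and the function names (`split_sentences`, `tokenize`, `sfwi`), the punctuation-based sentence splitter, the `min_support` and `max_size` parameters, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations


def split_sentences(text):
    """Split raw text on '.', '!' or '?' (a simplifying assumption)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Return the distinct lowercase words of a sentence, sorted alphabetically."""
    return sorted(set(re.findall(r"\w+", sentence.lower())))


def sfwi(text, min_support=2, max_size=2):
    """For each sentence, in order of appearance, list its frequent word item sets.

    An item set is frequent if it occurs in at least `min_support` sentences.
    Brute-force enumeration stands in for FP-Growth here (illustrative only).
    """
    # Enumerate candidate item sets (as sorted tuples) for every sentence.
    candidates = []
    for sentence in split_sentences(text):
        words = tokenize(sentence)
        candidates.append([
            itemset
            for size in range(1, max_size + 1)
            for itemset in combinations(words, size)
        ])

    # Support = number of sentences that contain the item set.
    support = Counter(itemset for sent in candidates for itemset in sent)

    # Keep only frequent item sets, preserving sentence order.
    return [
        [itemset for itemset in sent if support[itemset] >= min_support]
        for sent in candidates
    ]


# Hypothetical sample text written in informal Indonesian.
sample = "Gue lagi makan nih. Gue lagi di jalan. Lo lagi apa?"
result = sfwi(sample, min_support=2, max_size=2)
for itemsets in result:
    print(itemsets)
```

Each entry of `result` corresponds to one sentence, so the representation retains the order in which sentences appear. For realistic corpora, a proper FP-Growth implementation would be needed to avoid the combinatorial cost of enumerating candidates.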
