Conference paper details
2017 1st International Conference on Engineering and Applied Technology
Named entity recognition model for Indonesian tweet using CRF classifier
Munarko, Y.^1 ; Sutrisno, M.S.^1 ; Mahardika, W.A.I.^1 ; Nuryasin, I.^1 ; Azhar, Y.^1
Teknik Informatika, Universitas Muhammadiyah Malang, Indonesia^1
Keywords: Conditional random field; Data-sources; Indonesians; Named entity recognition; Recall and precision; Social media; Test data; Training data
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/403/1/012067/pdf
DOI  :  10.1088/1757-899X/403/1/012067
Source: IOP
【 Abstract 】
Named Entity Recognition (NER) is a Natural Language Processing (NLP) task that identifies the named entities mentioned in a document. NER supports applications such as information extraction and text summarization. One data source for NLP is tweets, which are posted in real time and in high volume but are limited in the number of words per tweet. In Indonesia, Twitter is one of the most popular social media platforms and covers a wide range of topics, so models, training data, and test data for Indonesian tweets are needed. In this study, the models were built using Conditional Random Field (CRF) classification from 8,000 tweets grouped into formal and informal tweets. Testing the models on 2,000 training data gave recall and precision of 62% and 87%, respectively, for formal tweets; 36% and 90% for informal tweets; and 60% and 86% for mixed tweets. These results indicate that the resulting Indonesian tweet models can be used for automatic NER.
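The recall and precision figures above can be read more easily with a small worked example. The sketch below is not the authors' evaluation code: it assumes entity-level exact matching, and the function name `precision_recall`, the tuple layout, and the entity labels are all hypothetical. It reproduces the pattern reported for informal tweets, where most predicted entities are correct (high precision) but many true entities are missed (low recall).

```python
def precision_recall(gold, predicted):
    """Compute entity-level precision and recall.

    Each entity is a tuple (tweet_id, start_token, end_token, label).
    A prediction counts as correct only on an exact match (an assumption;
    the abstract does not state the matching rule used in the paper).
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations: four gold entities, of which the model finds two.
gold = {
    (0, 0, 1, "PERSON"),
    (0, 4, 5, "LOCATION"),
    (1, 2, 3, "ORGANIZATION"),
    (2, 0, 1, "PERSON"),
}
predicted = {
    (0, 0, 1, "PERSON"),
    (1, 2, 3, "ORGANIZATION"),
}

precision, recall = precision_recall(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here every predicted entity is correct, so precision is 1.0, while only two of the four gold entities are recovered, so recall is 0.5.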