Conference Paper Details
2017 International Conference on Control Engineering and Artificial Intelligence
Proposal: A Hybrid Dictionary Modelling Approach for Malay Tweet Normalization
Computer Science
Binti Muhamad, Nor Azlizawati^1 ; Idris, Norisma^1 ; Saloot, Mohammad Arshi^1
University of Malaya, Kuala Lumpur 50603, Malaysia^1
Keywords: Dictionary modelling; Language model; Malay languages; N-grams
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/806/1/012008/pdf
DOI  :  10.1088/1742-6596/806/1/012008
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

Malay Twitter messages deviate markedly from the standard language. Malay Tweet language is now widely used by Twitter users, especially in the Malay Archipelago. It is therefore important to build a normalization system that can translate Malay Tweet language into standard Malay. Research in natural language processing has mainly focused on normalizing English Twitter messages, while few studies have addressed the normalization of Malay Tweets. This paper proposes an approach to normalizing Malay Twitter messages based on hybrid dictionary modelling methods. The approach normalizes noisy Malay Twitter messages, which contain colloquial language, novel words, and interjections, into standard Malay. The research will use a language model and an N-gram model.
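
The abstract does not describe how the dictionary and the N-gram language model are combined. As a rough illustration only, the sketch below shows one common way such a hybrid could work: a dictionary proposes candidate standard forms for each noisy token, and a bigram model with add-one smoothing picks the candidate that best fits the preceding word. The corpus, the dictionary entries, and all function names (`train_bigrams`, `bigram_logprob`, `normalize`) are hypothetical and are not taken from the paper.

```python
from collections import Counter
import math

# Toy standard-Malay corpus (illustrative assumption, not data from the paper).
CORPUS = [
    "saya hendak makan nasi",
    "saya tidak mahu makan",
    "dia hendak minum air",
]

# Hypothetical dictionary: noisy token -> candidate standard forms.
DICTIONARY = {
    "sy": ["saya"],
    "nk": ["hendak", "nenek"],
    "mkn": ["makan", "makin"],
    "x": ["tidak"],
}


def train_bigrams(sentences):
    """Count unigrams and bigrams, with a sentence-start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_logprob(prev, word, unigrams, bigrams):
    """Log P(word | prev) with add-one (Laplace) smoothing."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def normalize(tweet, unigrams, bigrams):
    """Greedy left-to-right normalization: dictionary lookup, then LM ranking."""
    prev, output = "<s>", []
    for token in tweet.lower().split():
        # Tokens missing from the dictionary are passed through unchanged.
        candidates = DICTIONARY.get(token, [token])
        best = max(
            candidates,
            key=lambda c: bigram_logprob(prev, c, unigrams, bigrams),
        )
        output.append(best)
        prev = best
    return " ".join(output)


UNIGRAMS, BIGRAMS = train_bigrams(CORPUS)
print(normalize("sy nk mkn nasi", UNIGRAMS, BIGRAMS))
```

The greedy choice keeps the sketch short; a fuller system would more likely search over whole candidate sequences and use a much larger corpus and dictionary.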
