Journal Article Details
Electronics
Using Machine Learning to Detect Events on the Basis of Bengali and Banglish Facebook Posts
A. S. M. Sanwar Hosen [1], Motahara Sabah Mredula [2], Noyon Dey [2], Md. Sazzadur Rahman [2], In-Ho Ra [3]
[1] Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54896, Korea
[2] Institute of Information Technology, Jahangirnagar University, Dhaka 1342, Bangladesh
[3] School of Computer, Information and Communication Engineering, Kunsan National University, Gunsan 54150, Korea
Keywords: Banglish; Bengali; Bernoulli Naïve Bayes; decision tree; event detection; social media
DOI: 10.3390/electronics10192367
Source: DOAJ
[Abstract]

Ensuring the security of society has become a prime concern for security administrators. The widespread and recurrent use of social media sites poses a serious risk to the general public, as these sites are frequently used to organize immoral events of various kinds. Protecting society from these dangers requires an early detection system that can effectively detect events by analyzing social media data. However, automating event detection has been difficult, because existing processes must account for diverse writing styles, languages, dialects, post lengths, and so on. To overcome these difficulties, we developed a model that detects events in Bengali and Banglish Facebook posts and classifies each as protesting, celebrating, religious, or neutral. First, the text of each collected post was processed for language detection; the detected posts were then pre-processed using stopword removal and tokenization. Features were then extracted from the pre-processed text using three sub-processes: filtering, phrase matching of specific events, and sentiment analysis. The collected features were used to train our Bernoulli Naïve Bayes classification model, which detected events with 90.41% accuracy for Bengali-language posts and 70% accuracy for Banglish posts. To evaluate the effectiveness of the proposed model more precisely, we compared it with two other classifiers: Support Vector Machine and Decision Tree.
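
The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain word-presence features in place of the paper's filtering, phrase-matching, and sentiment features, and the stopword list, the English placeholder posts, and the names `tokenize` and `BernoulliNB` are all hypothetical. It shows only how a Bernoulli Naïve Bayes model scores a post from the presence and absence of vocabulary words.

```python
import math

# Hypothetical stopword list (assumption, not taken from the paper).
STOPWORDS = {"the", "a", "is", "to", "at", "in", "we", "for",
             "and", "with", "this", "against"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [t for t in text.lower().split() if t not in STOPWORDS]


class BernoulliNB:
    """Bernoulli Naive Bayes over binary word-presence features."""

    def fit(self, docs, labels, alpha=1.0):
        self.vocab = sorted({t for d in docs for t in tokenize(d)})
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.p_word = {}
        for c in self.classes:
            class_docs = [set(tokenize(d))
                          for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(docs))
            # Laplace-smoothed probability that word w occurs in a post of class c.
            self.p_word[c] = {
                w: (sum(w in d for d in class_docs) + alpha)
                   / (len(class_docs) + 2 * alpha)
                for w in self.vocab
            }
        return self

    def predict(self, text):
        present = set(tokenize(text))
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for w in self.vocab:
                p = self.p_word[c][w]
                # Absent words contribute evidence too (the Bernoulli model).
                score += math.log(p if w in present else 1.0 - p)
            scores[c] = score
        return max(scores, key=scores.get)


# Toy English placeholder posts, invented for illustration only.
docs = [
    "join the protest march at the square",
    "strike and protest against the new fee",
    "happy birthday party celebration tonight",
    "festival celebration with music and fireworks",
    "prayer gathering at the mosque",
    "temple prayer ceremony this morning",
    "traffic is slow today",
    "weather is nice today",
]
labels = ["protesting", "protesting", "celebrating", "celebrating",
          "religious", "religious", "neutral", "neutral"]

model = BernoulliNB().fit(docs, labels)
print(model.predict("protest march tomorrow"))  # prints "protesting"
```

Words outside the training vocabulary (here, "tomorrow") are simply ignored at prediction time. A real pipeline for Bengali and Banglish text would also need the language detection and feature extraction stages that the abstract describes.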

[License]

Unknown   
