Thesis details
Detecting protest repression incidents from tweets
T Technology (General)
Elsafoury, Fatma ; Jensen, Bjorn Sand
University: University of Glasgow
Department: School of Computing Science
Keywords: Protests, Violence, Protest repression, Twitter, Machine learning, Text classification, Support vector machine (SVM), Naive Bayes (NB), Crowdsourcing, Figure-Eight
Others: http://theses.gla.ac.uk/75160/1/2019ElsafouryMScR.pdf
Source: University of Glasgow
【 Abstract 】

Protests are considered a threat to governments and political elites, which is why protesters are likely to face repression. To study protest repression, social scientists need protest repression datasets. Currently, they depend on news reports to build protest datasets and political conflict datasets. Although news reports give access to historical and international events, they have limitations, such as their coverage of small protest events and delays in reporting incidents. This research explores the use of social media posts, especially from Twitter, to build a protest repression dataset and to overcome the limitations of news reports. We use supervised machine learning models with a dataset of tweets sent during the Turkish Gezi Park protest in 2013 to detect tweets that report protest repression events. To accomplish this, we run a crowdsourcing experiment to build a training dataset of tweets and their corresponding labels: protest-related or not, and violent or not. We then use this dataset to train two baseline machine learning models, Support Vector Machine (SVM) and Multinomial Naive Bayes (MNB), with different text representation models: Bag of Words (BOW), TF-IDF, and Word Embeddings (WE). The empirical results show that crowdsourcing with the right settings and quality measures provides a fast and cheap way to hand-label datasets for training machine learning models. The results also show that baseline machine learning models perform well in tweet classification tasks, achieving good AUC scores (high true-positive rate and low false-positive rate).
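
To make the pipeline in the abstract more concrete, the following is a minimal sketch of one of the named combinations: a Bag of Words representation fed to a Multinomial Naive Bayes classifier and scored by AUC. It is not the thesis's implementation. It uses only the Python standard library, the tokenizer is a crude whitespace split, and the example tweets and their labels are invented for illustration (1 = reports repression, 0 = does not). The function names `tokenize`, `train_mnb`, `log_score`, `positive_margin`, and `auc` are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace: a deliberately crude bag-of-words tokenizer."""
    return text.lower().split()

def train_mnb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with additive (Laplace) smoothing on tokenized docs."""
    vocab = {tok for doc in docs for tok in doc}
    model = {"vocab": vocab, "alpha": alpha, "prior": {}, "counts": {}, "total": {}}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        counts = Counter(tok for d in class_docs for tok in d)
        model["prior"][c] = math.log(len(class_docs) / len(docs))
        model["counts"][c] = counts
        model["total"][c] = sum(counts.values())
    return model

def log_score(model, doc, c):
    """log P(c) + sum of log P(token | c); tokens outside the training vocabulary are skipped."""
    v, a = len(model["vocab"]), model["alpha"]
    score = model["prior"][c]
    for tok in doc:
        if tok in model["vocab"]:
            score += math.log((model["counts"][c][tok] + a) / (model["total"][c] + a * v))
    return score

def positive_margin(model, doc):
    """Higher values mean the model leans towards class 1 over class 0."""
    return log_score(model, doc, 1) - log_score(model, doc, 0)

def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented toy data, NOT taken from the thesis dataset.
train_texts = [
    "police fired tear gas at protesters in the park",
    "riot police used water cannon on the crowd",
    "protesters beaten and detained by police",
    "lovely sunny day in the park today",
    "watching football with friends tonight",
    "new coffee shop opened near the square",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["police used tear gas on the crowd", "sunny day with friends near the park"]
test_labels = [1, 0]

model = train_mnb([tokenize(t) for t in train_texts], train_labels)
test_scores = [positive_margin(model, tokenize(t)) for t in test_texts]
print("margins:", test_scores)
print("AUC:", auc(test_scores, test_labels))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve, which is why a high score corresponds to a high true-positive rate at a low false-positive rate. On a dataset of realistic size, a library implementation with TF-IDF weighting or an SVM would be the more natural choice.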
