Dissertation Details
Opinion Topic, Holder and Polarity in texts: exploration and automatic identification from cross-lingual data
Kim, Kyoung-Young
Keywords: Opinion mining; Sentiment analysis; English and Korean; Opinion extraction
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/24482/Kim_Kyoung-Young.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

People express their opinions in various ways in different domains. With the growing interest in what other people think, mining opinions in texts has become a focus of attention for researchers in many fields. Also, with the rapid development of technology and the internet, more and more multilingual and multicultural information has become available on the web. The objective of the present dissertation is to explore and automatically extract opinions from multilingual corpora. In pursuing this objective, a bilingual opinion-annotated corpus of editorial texts was constructed, focusing on detailed opinion factors. Annotated opinion factors include the holder of an opinion (Holder) and the topic of an opinion with its polarity (Positive Topic, Negative Topic). Factors used to express opinions, as well as opinions across languages, were investigated with the annotated corpus. The main contribution of this dissertation is the proposal of a multilingual sentiment analysis system for identifying opinion factors using a novel method that explores the linguistic structures used to express opinions. Without using pre-labeled opinion words, this multilingual sentiment analysis system directly identifies opinion factors using syntactic analysis, predicate-argument structure and pragmatic analysis. In place of pre-labeled opinion words for each language, a clustered lexicon was constructed from bilingual dictionaries. Lexical features crucial for identifying the polarity were learned automatically. In addition to the lexical features, syntactic, morphological and contextual features were used in the learning algorithm. The syntactic structure of the sentence, as well as predicate-argument structures extracted from the PropBank database, were investigated and used to assign appropriate features to the target chunk. The experimental results show that the proposed system is significantly more successful than a baseline system. Experiments focusing on each novel method verify that both the clustered lexical dictionary and the incorporation of more linguistic structures benefit the accuracy of opinion factor extraction. The proposed system was also tested with an existing English monolingual corpus (MPQA corpus) composed of news articles, and yielded results consistent with those on the annotated corpus. With the experimental set-up of multilingual analysis, the way that opinions are expressed across languages was investigated and utilized to improve the results of the analysis. Experiments with cross-lingual features extracted from parallel sentences show further improved results, which suggests cross-lingual reinforcement in identifying opinion factors with the proposed system.

Attachments
File: Opinion Topic, Holder and Polarity in texts: exploration and automatic identification from cross-lingual data
Size: 1329KB
Format: PDF