Dissertation

【Abstract】

The development of automatic text correction systems has a long history in natural language processing research. This thesis considers the problem of correcting writing mistakes made by non-native English speakers. We address several types of errors commonly made by non-native English writers (misuse of articles, prepositions, noun number, and verb properties) and build a robust, state-of-the-art system that combines machine learning methods and linguistic knowledge.

The proposed approach is distinguished from related work in several respects. First, several machine learning methods are compared to determine which are most effective for this problem. Because earlier evaluations were based on incomparable data sets, their conclusions are questionable. Our results reverse these conclusions and pave the way for the next contribution.

Using the important observation that mistakes made by non-native writers are systematic, we develop models that exploit knowledge about error regularities with minimal annotation costs. This approach differs from earlier ones, which either built models with no knowledge of error regularities or required large amounts of annotated data.

Next, we develop special strategies for correcting errors on open-class words. These errors, although very prevalent among non-native English speakers, are the least studied and are not well understood linguistically. We address the challenges they present with a linguistically informed approach.

Finally, we propose a novel global approach to error correction that considers grammatical dependencies among error types and addresses them via joint learning and joint inference.

The systems and techniques described in this thesis are evaluated empirically and competitively in the context of several shared tasks, where they have demonstrated superior performance. In particular, our system ranked first in the CoNLL-2013 shared task on text correction, the most prestigious competition in the natural language processing field. Based on an analysis of this system, we identify four design principles that are crucial for building a state-of-the-art error correction system.

【Preview】

Attachments
Files	Size	Format
Automated methods for text correction	867KB	PDF


Automated methods for text correction
Rozovskaya, Alla
Keywords: text correction; grammatical error correction; English as a second language (ESL) error correction; automated methods for text correction
Others : https://www.ideals.illinois.edu/bitstream/handle/2142/46875/Alla_Rozovskaya.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


