The International Arab Journal of Information Technology
New Language Models for Spelling Correction
article
Saida Laaroussi [1], Si Lhoussain Aouragh [2], Abdellah Yousfi [3], Mohamed Nejja [4], Hicham Geddah [5], Said Ouatik El Alaoui [1]
[1] IT, Logistics and Mathematics, Ibn Tofail University; [2] IT and Decision Support System, Mohamed V University; [3] Department of Economics and Management, Mohamed V University; [4] Department of Software Engineering, Mohamed V University; [5] Department of Computer Science, Mohamed V University
Keywords: spelling correction; contextual correction; n-gram language models; edit distance; NLP
DOI: 10.34028/iajit/19/6/12
Subject classification: Computer Science (General)
Source: Zarqa University
【 Abstract 】

Correcting spelling errors based on context is a significant problem in Natural Language Processing (NLP) applications. Most work that introduces context into spelling correction uses n-gram language models. However, these models fail in several cases to assign adequate probabilities to the suggested corrections of a misspelled word in a given context. To resolve this issue, we propose two new language models inspired by stochastic language models combined with edit distance. A first phase finds the lexicon words that are orthographically close to the erroneous word, and a second phase ranks and limits these suggestions. We applied the new approach to the Arabic language, taking into account its specificity of having strong contextual connections between distant words in a sentence. To evaluate our approach, we developed textual data processing applications, namely the extraction of distant transition dictionaries. The correction accuracy obtained exceeds 98% for the first 10 suggestions. Our approach has the advantage of simplifying the parameters to be estimated while achieving higher correction accuracy than n-gram language models. Hence the need for such an approach.
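
The two-phase scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the two proposed language models, so a plain bigram model with add-one smoothing stands in for the ranking step, and the toy corpus and all function names (edit_distance, candidates, train_bigrams, rank) are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def candidates(word, lexicon, max_dist=2):
    """Phase 1: lexicon words within max_dist edits of the erroneous word."""
    return [w for w in lexicon if edit_distance(word, w) <= max_dist]


def train_bigrams(sentences):
    """Count unigrams and adjacent-word bigrams from tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def rank(prev_word, word, lexicon, unigrams, bigrams, top_k=10, max_dist=2):
    """Phase 2: rank candidates by smoothed P(candidate | prev_word), keep top_k."""
    vocab_size = len(lexicon)

    def score(c):
        # Add-one smoothing (an assumption; a stand-in for the paper's models).
        return (bigrams[(prev_word, c)] + 1) / (unigrams[prev_word] + vocab_size)

    cands = candidates(word, lexicon, max_dist)
    # Highest probability first; ties broken by edit distance, then alphabetically.
    cands.sort(key=lambda c: (-score(c), edit_distance(word, c), c))
    return cands[:top_k]


# Toy data, invented purely for illustration.
corpus = [["the", "cat", "sat"], ["the", "car", "stopped"], ["the", "cat", "slept"]]
lexicon = sorted({w for s in corpus for w in s})
unigrams, bigrams = train_bigrams(corpus)
suggestions = rank("the", "cax", lexicon, unigrams, bigrams)
```

Here the misspelling "cax" after "the" yields the candidates within two edits, ordered by how often each followed "the" in the toy corpus. A model that captures connections between distant words, as the abstract describes for Arabic, would replace the adjacent-word bigram counts with counts over non-adjacent word pairs.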

【 License 】

Unknown   
