2017 International Symposium on Application of Materials Science and Energy Materials
Natural-Annotation-based Unsupervised Construction of Korean-Chinese Domain Dictionary
Materials Science; Energy Science
Liu, Wuying^1; Wang, Lin^2
Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510420, China^1
Xianda College of Economics and Humanities, Shanghai International Studies University, Shanghai 200083, China^2
Keywords: Automatic construction; Chinese characters; Construction method; Statistical learning; Word-pairs
Others: https://iopscience.iop.org/article/10.1088/1757-899X/322/5/052054/pdf
DOI: 10.1088/1757-899X/322/5/052054
Subject classification: Materials Science (General)
Source: IOP
【 Abstract 】
Large-scale bilingual parallel resources are important to statistical learning and deep learning in natural language processing. This paper addresses the automatic construction of Korean-Chinese domain dictionaries and presents a novel unsupervised construction method based on the natural annotations found in raw corpora. We first extract all Korean-Chinese word pairs from Korean texts according to their natural annotations, then convert the traditional Chinese characters into simplified ones, and finally distill a bilingual domain dictionary by looking up the simplified Chinese words in an additional Chinese domain dictionary. The experimental results show that our method can automatically and efficiently build multiple Korean-Chinese domain dictionaries.
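The three steps named in the abstract (pair extraction, traditional-to-simplified conversion, and lookup in a Chinese domain dictionary) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a "natural annotation" takes the form of a Korean word followed directly by its Chinese characters in parentheses, for example 경제(經濟), and it uses a tiny hand-written character table in place of a full conversion resource. The names `PAIR_PATTERN`, `TRAD_TO_SIMP`, `extract_pairs`, `to_simplified`, and `build_domain_dictionary` are hypothetical.

```python
import re

# Assumption: a natural annotation is a Hangul word immediately followed by
# Chinese characters in (half-width or full-width) parentheses.
PAIR_PATTERN = re.compile(r"([\uac00-\ud7a3]+)\s*[(（]([\u4e00-\u9fff]+)[)）]")

# Toy traditional-to-simplified table, for illustration only.
# A real system would need a complete conversion table.
TRAD_TO_SIMP = str.maketrans({"經": "经", "濟": "济", "學": "学", "銀": "银"})


def extract_pairs(korean_text):
    """Step 1: collect (Korean word, Chinese annotation) pairs."""
    return PAIR_PATTERN.findall(korean_text)


def to_simplified(word):
    """Step 2: map traditional characters to simplified ones."""
    return word.translate(TRAD_TO_SIMP)


def build_domain_dictionary(korean_text, chinese_domain_words):
    """Step 3: keep only pairs whose simplified form is a known domain word."""
    dictionary = {}
    for korean, chinese in extract_pairs(korean_text):
        simplified = to_simplified(chinese)
        if simplified in chinese_domain_words:
            # Keep the first translation seen for each Korean word.
            dictionary.setdefault(korean, simplified)
    return dictionary


text = "한국 경제(經濟)와 은행(銀行), 그리고 문학(文學)."
domain_words = {"经济", "银行"}  # hypothetical Chinese finance dictionary
result = build_domain_dictionary(text, domain_words)
```

With the sample sentence and this two-word finance dictionary, the pair for 문학 (文学, "literature") is extracted but filtered out in the last step because it is not a domain word. Swapping in a different Chinese domain dictionary yields a different bilingual dictionary from the same corpus, which is consistent with the abstract's claim that multiple domain dictionaries can be built.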