期刊论文详细信息
Applied Sciences
Towards a Universal Semantic Dictionary
Eszter Iklódi1  Gábor Borbély1  Gábor Recski1  MariaJose Castro-Bleda2 
[1] Budapest University of Technology and Economics, 1111 Budapest, Hungary;VRAIN Valencian Research Institute for Artificial Intelligence, Universitat Politècnica de València, 46022 Valencia, Spain;
关键词: natural language processing;    semantics;    word embeddings;    multilingual embeddings;    translation;    artificial neural networks;   
DOI  :  10.3390/app9194060
来源: DOAJ
【 摘 要 】

A novel method for finding linear mappings among word embeddings for several languages, taking as pivot a shared, multilingual embedding space, is proposed in this paper. Previous approaches learned translation matrices between two specific languages, while this method learns translation matrices between a given language and a shared, multilingual space. The system was first trained on bilingual, and later on multilingual corpora as well. In the first case, two different training data were applied: Dinu’s English−Italian benchmark data, and English−Italian translation pairs extracted from the PanLex database. In the second case, only the PanLex database was used. The system performs on English−Italian languages with the best setting significantly better than the baseline system given by Mikolov, and it provides a comparable performance with more sophisticated systems. Exploiting the richness of the PanLex database, the proposed method makes it possible to learn linear mappings among an arbitrary number of languages.

【 授权许可】

Unknown   

  文献评价指标  
  下载次数:0次 浏览次数:0次