期刊论文

【摘要】

A novel method for finding linear mappings among word embeddings for several languages, taking as pivot a shared, multilingual embedding space, is proposed in this paper. Previous approaches learned translation matrices between two specific languages, while this method learns translation matrices between a given language and a shared, multilingual space. The system was first trained on bilingual, and later on multilingual corpora as well. In the first case, two different training data were applied: Dinu’s English−Italian benchmark data, and English−Italian translation pairs extracted from the PanLex database. In the second case, only the PanLex database was used. The system performs on English−Italian languages with the best setting significantly better than the baseline system given by Mikolov, and it provides a comparable performance with more sophisticated systems. Exploiting the richness of the PanLex database, the proposed method makes it possible to learn linear mappings among an arbitrary number of languages.

【授权许可】

Unknown

Applied Sciences
Towards a Universal Semantic Dictionary

Eszter Iklódi¹ Gábor Borbély¹ Gábor Recski¹ MariaJose Castro-Bleda²
[1] Budapest University of Technology and Economics, 1111 Budapest, Hungary;VRAIN Valencian Research Institute for Artificial Intelligence, Universitat Politècnica de València, 46022 Valencia, Spain;
关键词: natural language processing; semantics; word embeddings; multilingual embeddings; translation; artificial neural networks;
DOI : 10.3390/app9194060
来源: DOAJ


	文献评价指标
	下载次数：0次	浏览次数：0次

【 摘 要 】

【 授权许可】

【摘要】

【授权许可】