Journal article details
NEUROCOMPUTING, Volume 326
WordNet2Vec: Corpora agnostic word vectorization method
Article
Bartusiak, Roman [1]; Augustyniak, Lukasz [1]; Kajdanowicz, Tomasz [1]; Kazienko, Przemyslaw [1]; Piasecki, Maciej [1]
[1] Wroclaw Univ Sci & Technol, Dept Computat Intelligence, Wroclaw, Poland
Keywords: Natural language structuring; WordNet; WordNet2Vec; Vectorization; Network transformation; Sentiment analysis; Transfer learning; Big data; Complex networks
DOI: 10.1016/j.neucom.2017.01.121
Source: Elsevier
【 Abstract 】

The complex nature of big data resources requires new structuring methods, especially for textual content. WordNet is a good knowledge source for the comprehensive abstraction of natural language as it offers good implementation for many languages. Since WordNet embeds natural language in the form of a complex network, a transformation mechanism, WordNet2Vec, is proposed in this paper. This creates vectors for each word from WordNet. These vectors encapsulate a general position - the role of a given word related to all other words in the given natural language. Any list or set of such vectors contains knowledge about the context of its components within the whole language. This type of word representation can be easily applied to many analytic tasks such as classification or clustering. The usefulness of the WordNet2Vec method is demonstrated in sentiment analysis including the classification of an Amazon opinion text dataset with transfer learning. (C) 2017 Elsevier B.V. All rights reserved.
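The abstract describes a word's vector as capturing its position relative to all other words in the WordNet network, but it does not spell out the transformation. As a rough illustration only, the sketch below assumes that each vector component is the shortest-path hop count from the word to another word. The toy graph, the function names `bfs_distances` and `word_vectors`, and the `-1` marker for unreachable words are all hypothetical and are not taken from the paper.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return hop counts from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def word_vectors(graph, unreachable=-1):
    """Build one vector per word: its distance to every word in the graph.

    Assumption: distance is the unweighted shortest-path length, and
    unreachable words get the placeholder value `unreachable`.
    """
    words = sorted(graph)  # fixed component order shared by all vectors
    vectors = {}
    for word in words:
        dist = bfs_distances(graph, word)
        vectors[word] = [dist.get(other, unreachable) for other in words]
    return words, vectors


# Hypothetical toy network standing in for WordNet (undirected edges).
TOY_GRAPH = {
    "animal": ["dog", "cat"],
    "dog": ["animal", "puppy"],
    "cat": ["animal"],
    "puppy": ["dog"],
    "stone": [],
}

words, vectors = word_vectors(TOY_GRAPH)
```

Every vector has one component per word in the network, so a list of such vectors can be passed to a standard classifier or clustering routine, which is the kind of downstream use the abstract mentions.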

【 License 】

Free   
