Thesis Details
Integrating Structure and Meaning: Using Holographic Reduced Representations to Improve Automatic Text Classification
Fishbein, Jonathan Michael
University of Waterloo
Keywords: Holographic Reduced Representations; Vector Space Model; Text Classification; Parts of Speech Tagging; Random Indexing; Support Vector Machines; Syntactic Structure; Semantics; System Design Engineering
Others: https://uwspace.uwaterloo.ca/bitstream/10012/3819/1/thesis.pdf
Switzerland|English
Source: UWSPACE Waterloo Institutional Repository
【 Abstract 】

Current representation schemes for automatic text classification treat documents as syntactically unstructured collections of words (Bag-of-Words) or "concepts" (Bag-of-Concepts). Past attempts to encode syntactic structure have treated part-of-speech information as another word-like feature, but have been shown to be less effective than non-structural approaches. We propose a new representation scheme using Holographic Reduced Representations (HRRs) as a technique to encode both semantic and syntactic structure, though in very different ways. This method is unique in the literature in that it encodes the structure across all features of the document vector while preserving text semantics. Our method does not increase the dimensionality of the document vectors, allowing for efficient computation and storage. We present the results of various Support Vector Machine classification experiments that demonstrate the superiority of this method over Bag-of-Concepts representations and improvement over Bag-of-Words in certain classification contexts.
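
As a rough illustration of why an HRR encoding leaves the dimensionality unchanged, the sketch below uses circular convolution, the standard HRR binding operation, to bind each word vector to a part-of-speech tag vector and sums the results into one document vector. It is a minimal sketch under stated assumptions and not the thesis's actual pipeline: the vocabulary, the tag set, the dimension `DIM`, and all helper names (`random_vector`, `circular_convolution`, `involution`) are hypothetical, and the thesis may construct and combine its vectors differently.

```python
import math
import random


def random_vector(dim, rng):
    """Draw a vector with i.i.d. N(0, 1/dim) elements, a common HRR choice."""
    sigma = 1.0 / math.sqrt(dim)
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


def circular_convolution(a, b):
    """Bind two vectors; the result has the same length as the inputs."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]


def involution(a):
    """Approximate inverse used for unbinding: element j becomes a[-j mod n]."""
    n = len(a)
    return [a[(-j) % n] for j in range(n)]


def add(a, b):
    """Element-wise sum (superposition) of two vectors."""
    return [x + y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


DIM = 512
rng = random.Random(0)

# Hypothetical vocabulary and tag set, for illustration only.
words = {w: random_vector(DIM, rng) for w in ("dog", "bites", "man")}
tags = {t: random_vector(DIM, rng) for t in ("NOUN", "VERB")}

# Bind each word to its tag and superpose everything into one vector.
tagged = [("dog", "NOUN"), ("bites", "VERB"), ("man", "NOUN")]
doc = [0.0] * DIM
for word, tag in tagged:
    doc = add(doc, circular_convolution(tags[tag], words[word]))

# Unbinding with the VERB tag should give a noisy copy of "bites".
probe = circular_convolution(involution(tags["VERB"]), doc)
similarities = {w: cosine(probe, v) for w, v in words.items()}
best_match = max(similarities, key=similarities.get)
```

The document vector `doc` keeps the length `DIM` however many word and tag pairs are added, which is the property the abstract highlights. The quadratic-time convolution is used here only to keep the sketch dependency-free; an FFT-based version would be the usual choice at realistic dimensions.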

【 Preview 】
Attachments
Files | Size | Format | View
Integrating Structure and Meaning: Using Holographic Reduced Representations to Improve Automatic Text Classification | 816KB | PDF | download
Document Metrics
Downloads: 14    Views: 41