Journal article details
NEUROCOMPUTING Volume: 175
Squeezing bottlenecks: Exploring the limits of autoencoder semantic representation capabilities
Article; Proceedings Paper
Gupta, Parth [1]; Banchs, Rafael E. [2]; Rosso, Paolo [1]
[1] Univ Politecn Valencia, PRHLT Res Ctr, E-46022 Valencia, Spain
[2] Inst Infocomm Res, Singapore, Singapore
Keywords: Text representation; Deep autoencoder
DOI: 10.1016/j.neucom.2015.06.091
Source: Elsevier
【 Abstract 】

We present a comprehensive study on the use of autoencoders for modelling text data, in which (in contrast to previous studies) we focus our attention on the following issues. We explore the suitability of two different models, binary deep autoencoders (bDA) and replicated-softmax deep autoencoders (rsDA), for constructing deep autoencoders for text data at the sentence level. We propose and evaluate two novel metrics for better assessing the text-reconstruction capabilities of autoencoders. We propose an automatic method to find the critical bottleneck dimensionality for text representations (below which structural information is lost). Finally, we conduct a comparative evaluation across different languages, exploring the regions of critical bottleneck dimensionality and its relationship to language perplexity. © 2015 Elsevier B.V. All rights reserved.

【 License 】

Free   
