Journal Article Details
Acoustical Science and Technology
Generative adversarial networks: Foundations and applications
Takuhiro Kaneko [1]
[1] NTT Communication Science Laboratories, NTT Corporation
Keywords: Generative adversarial networks; Deep generative models; Image generation; Speech synthesis; Voice conversion
DOI: 10.1250/ast.39.189
Subject classification: Acoustics and ultrasonics
Source: Acoustical Society of Japan
【 Abstract 】

In statistical signal processing and machine learning, an open issue has been how to obtain a generative model that can produce samples from high-dimensional data distributions such as images and speech. Generative adversarial networks (GANs) have emerged as a powerful framework that provides clues to solving this problem. A GAN is composed of two networks: a generator that transforms noise variables into the data space and a discriminator that distinguishes real data from generated data. These two networks are optimized using a min-max game: the generator attempts to deceive the discriminator by generating data indistinguishable from the real data, while the discriminator attempts to avoid being deceived by finding the best discrimination between real and generated data. This framework enables the implicit estimation of a data distribution and allows the generator to produce high-fidelity data that are almost indistinguishable from real data. This property has attracted a great deal of attention, and a wide range of research, from basic research to practical applications, has recently been conducted. In this paper, I summarize these studies and explain the foundations and applications of GANs. Specifically, I first clarify the relation between GANs and other deep generative models and then present the theory of GANs with mathematical formulas. Next, I introduce recent advances in GANs and describe applications that are closely related to acoustic and speech signal processing. Finally, I conclude this paper by mentioning future directions.
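
The abstract describes the min-max game only in words. As a hedged illustration, the block below writes out the standard GAN objective as it is commonly stated in the literature; the abstract itself does not give a formula, so the notation here is an assumption: G is the generator, D the discriminator, p_data the real data distribution, p_z the noise prior, and p_g the distribution of generated samples G(z).

```latex
% Standard GAN value function (notation assumed, not taken from the abstract).
% The discriminator D maximizes V; the generator G minimizes it.
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \bigl[ \log D(x) \bigr]
  + \mathbb{E}_{z \sim p_{z}(z)} \bigl[ \log \bigl( 1 - D(G(z)) \bigr) \bigr]

% For a fixed G, the maximizing discriminator is
D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}
```

Under this formulation, the best discriminator depends on the data only through the ratio of p_data to p_g, which is one way to read the abstract's statement that the framework estimates the data distribution implicitly: the generator is pushed toward p_g = p_data without either density being written down explicitly.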

【 License 】

Unknown   
