Journal Article Details
Mathematical statistics and learning
Adversarial examples in random neural networks with general activations
article
Andrea Montanari1  Yuchen Wu1 
[1] Stanford University
Keywords: adversarial example; neural network
DOI: 10.4171/msl/41
Source: European Mathematical Society
【 Abstract 】

A substantial body of empirical work documents the lack of robustness of deep learning models to adversarial examples. Recent theoretical work proved that adversarial examples are ubiquitous in two-layer networks with sub-exponential width and ReLU or smooth activations, and in multi-layer ReLU networks with sub-exponential width. We present a result of the same type, with no restriction on width and for general locally Lipschitz continuous activations. More precisely, given a neural network $f(\,\cdot\,;\boldsymbol{\theta})$ with random weights $\boldsymbol{\theta}$, and a feature vector $\mathbf{x}$, we show that an adversarial example $\mathbf{x}'$ can be found with high probability along the direction of the gradient $\nabla_{\mathbf{x}} f(\mathbf{x};\boldsymbol{\theta})$. Our proof is based on a Gaussian conditioning technique. Instead of proving that $f$ is approximately linear in a neighborhood of $\mathbf{x}$, we characterize the joint distribution of $f(\mathbf{x};\boldsymbol{\theta})$ and $f(\mathbf{x}';\boldsymbol{\theta})$ for $\mathbf{x}' = \mathbf{x} - s(\mathbf{x})\nabla_{\mathbf{x}} f(\mathbf{x};\boldsymbol{\theta})$, where $s(\mathbf{x}) = \operatorname{sign}(f(\mathbf{x};\boldsymbol{\theta})) \cdot s_d$ for some positive step size $s_d$.
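The construction described in the abstract can be illustrated numerically. The sketch below (an assumed setup for illustration, not the paper's exact model or step-size scaling) draws a random two-layer ReLU network $f(\mathbf{x};\boldsymbol{\theta}) = \mathbf{a}^\top \mathrm{relu}(W\mathbf{x})/\sqrt{m}$ and perturbs a feature vector along $-\operatorname{sign}(f(\mathbf{x}))\,\nabla_{\mathbf{x}} f(\mathbf{x};\boldsymbol{\theta})$, which drives $f$ toward zero and hence toward a sign flip:

```python
# Illustrative sketch (hypothetical model choice): a random two-layer
# ReLU network and the gradient-direction perturbation
#   x' = x - s(x) * grad_x f(x; theta),  s(x) = sign(f(x; theta)) * s_d.
import numpy as np

rng = np.random.default_rng(0)
d, m = 100, 200                        # input dimension, hidden width
W = rng.standard_normal((m, d)) / np.sqrt(d)
a = rng.standard_normal(m)

def f(x):
    """f(x; theta) = a^T relu(W x) / sqrt(m)."""
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(m)

def grad_f(x):
    """Gradient of f at x: W^T (a * 1{W x > 0}) / sqrt(m)."""
    active = (W @ x > 0).astype(float)
    return W.T @ (a * active) / np.sqrt(m)

x = rng.standard_normal(d) / np.sqrt(d)
g = grad_f(x)
# Small step chosen so x' stays close to x's linear region; to first
# order f(x') = f(x) - s_d * sign(f(x)) * ||g||^2, i.e. |f| shrinks.
s_d = 0.5 * abs(f(x)) / (g @ g)
x_adv = x - np.sign(f(x)) * s_d * g

print(f(x), f(x_adv))
```

Iterating this step (or taking a larger $s_d$) pushes $f$ past zero and flips the network's sign, which is the adversarial example the theorem guarantees with high probability over the random weights.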

【 License 】

Unknown   

【 Preview 】
Attachment list
Files Size Format View
RO202307150000700ZK.pdf 586KB PDF download