Neural networks have become popular tools for many inference tasks. However, these networks are functions derived from their training data, and thorough analysis of a network can reveal information about its training dataset. This could be dire in many scenarios: network log anomaly classifiers leaking data about the network they were trained on, disease detectors revealing information about participants such as genomic markers, or facial recognition classifiers revealing data about the faces they were trained on, to name a few alarming cases. As different industries embrace this technology, it would be wise to understand the privacy impact of openly shared classifiers. To that end, we perform the first study of property inference attacks specifically on deep neural networks, deriving properties of the training dataset with no information apart from the classifier's parameters and architecture. We implement and compare different techniques to improve the effectiveness of the attack on deep neural networks. We show how different interpretations and representations of the same classifier, 1) as a sorted and normalized vector, 2) as a graph, and 3) as a group of sets, can increase the leakage of data from the classifier, with the last being the most effective. We compare effectiveness on a synthetic dataset, the US Census dataset, and the MNIST image recognition dataset, showing that critical properties such as training conditions or bias in the dataset can be inferred by the attack.
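The abstract describes the three classifier representations only at a high level. As one illustration of the set-based view, the sketch below shows how a trained fully connected layer might be re-encoded as a set of per-neuron feature vectors and fed to a permutation-invariant (DeepSets-style) meta-classifier that predicts a property of the training data. This is a minimal sketch under our own assumptions, not the authors' implementation; the names `layer_to_set` and `SetMetaClassifier` are illustrative, only the first hidden layer of a toy target network is represented, and in practice the meta-classifier would be trained on many shadow models.

```python
# Hedged sketch: set-based representation of a trained classifier for
# property inference. Assumed, illustrative names throughout.
import torch
import torch.nn as nn


def layer_to_set(linear: nn.Linear) -> torch.Tensor:
    """Represent a linear layer as a set of neuron feature vectors.

    Each row concatenates one neuron's incoming weights and its bias, so
    permuting the neurons merely reorders rows and carries no information.
    """
    return torch.cat([linear.weight, linear.bias.unsqueeze(1)], dim=1)


class SetMetaClassifier(nn.Module):
    """DeepSets-style meta-classifier: phi is applied per neuron, the
    results are summed (a permutation-invariant pooling), and rho maps the
    pooled summary to a logit for the target property."""

    def __init__(self, in_features: int, hidden: int = 64):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        self.rho = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, neuron_set: torch.Tensor) -> torch.Tensor:
        pooled = self.phi(neuron_set).sum(dim=0)  # invariant to neuron order
        return self.rho(pooled)                   # property logit


# Toy target network: first hidden layer has 16 neurons, each with 10
# incoming weights plus a bias -> a set of 16 vectors of length 11.
target = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
neuron_set = layer_to_set(target[0])      # shape: (16, 11)
meta = SetMetaClassifier(in_features=11)
property_logit = meta(neuron_set)         # would be trained on shadow models
```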