
Membership Inference Attacks on Machine Learning: A Survey

Published in ACM Computing Surveys (2021-03-14)


TL;DR

This article provides taxonomies for both membership inference attacks and defenses based on their characterizations, discusses their pros and cons, and points out several promising future research directions to inspire researchers who wish to follow this area.

Abstract

Machine learning (ML) models have been widely applied to various applications, including image classification, text generation, audio recognition, and graph data analysis. However, recent studies have shown that ML models are vulnerable to membership inference attacks (MIAs), which aim to infer whether a data record was used to train a target model or not. MIAs on ML models can directly lead to a privacy breach. For example, by identifying that a clinical record was used to train a model associated with a certain disease, an attacker can infer with high confidence that the owner of that record has the disease. In recent years, MIAs have been shown to be effective on various ML models, e.g., classification models and generative models. Meanwhile, many defense methods have been proposed to mitigate MIAs. Although MIAs on ML models form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this article, we conduct the first comprehensive survey on membership inference attacks and defenses. We provide taxonomies for both attacks and defenses based on their characterizations, and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions to inspire researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this domain. To further help researchers, we have created an online resource repository, which we will keep updated with future relevant work. Interested readers can find the repository at https://github.com/HongshengHu/membership-inference-machine-learning-literature.
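
To make the attack setting concrete, the following is a minimal, self-contained sketch (not taken from the survey) of the simplest attack family it covers: a confidence-thresholding membership inference attack against an overfit classifier. The scikit-learn model, the synthetic data, and the 0.9 threshold are all illustrative assumptions, not the survey's experimental setup.

```python
# Minimal confidence-thresholding membership inference attack (illustrative sketch).
# Assumes scikit-learn and numpy are available; model, data, and threshold are toy choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic "private" dataset: half is used to train the target model (members),
# half is held out (non-members).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           n_classes=2, random_state=0)
X_member, X_nonmember, y_member, y_nonmember = train_test_split(
    X, y, test_size=0.5, random_state=0)

# An intentionally over-parameterized, unregularized target model: overfitting widens
# the confidence gap between members and non-members, which is what the attack exploits.
target = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=2000,
                       alpha=0.0, random_state=0).fit(X_member, y_member)

def confidence(model, records):
    """Target model's confidence in its own prediction for each record."""
    return model.predict_proba(records).max(axis=1)

# Attack: guess "member" whenever the prediction confidence exceeds a threshold.
# In practice the threshold would be calibrated, e.g. with shadow models;
# here it is simply fixed for illustration.
threshold = 0.9
pred_member = confidence(target, X_member) > threshold        # ideally mostly True
pred_nonmember = confidence(target, X_nonmember) > threshold  # ideally mostly False

# Balanced attack accuracy: 0.5 means the attacker does no better than random guessing.
attack_accuracy = 0.5 * (pred_member.mean() + (1 - pred_nonmember.mean()))
print(f"Membership inference attack accuracy: {attack_accuracy:.2f}")
```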

Authors

  • Hongsheng Hu
  • Z. Salcic
  • Lichao Sun
  • G. Dobbie
  • P. Yu
  • Xuyun Zhang

Datasets

  • Foursquare
  • MIMIC-III (Medical Information Mart for Intensive Care III)
  • Fashion-MNIST
  • BookCorpus
  • Pubmed
  • MNIST
  • ChestX-ray8
  • CIFAR-10 (Canadian Institute for Advanced Research, 10 classes)
  • CIFAR-100
  • CelebA (CelebFaces Attributes Dataset)
  • Cityscapes
  • Colored MNIST
  • SVHN (Street View House Numbers)
  • RCV1 (Reuters Corpus Volume 1)
  • UTKFace
  • ImageNet

Research Impact

  • Citations: 333
  • References: 267
  • Datasets: 16
  • Authors: 6




Field of Study

  • Computer Science

Journal Information

  • Name: ACM Computing Surveys (CSUR)
  • Volume: 54

Venue Information

  • Name: ACM Computing Surveys
  • Type: journal
  • URL: http://www.acm.org/pubs/surveys/
  • Alternate Names: ACM Comput Surv