A large body of research has shown that machine learning models are vulnerable to membership inference (MI) attacks that violate the privacy of the participants in the training data. Most MI research focuses on the case of a single standalone model, while production machine-learning platforms often update models over time on data whose distribution shifts, giving the attacker more information. This paper proposes new attacks that take advantage of one or more model updates to improve MI. A key part of our approach is to leverage rich information from standalone MI attacks mounted separately against the original and updated models, and to combine this information in specific ways to improve attack effectiveness. We propose a set of combination functions and tuning methods for each, and present both analytical and quantitative justification for various options. Our results on four public datasets show that our attacks are effective at using update information to give the adversary a significant advantage not only over attacks on standalone models, but also over a prior MI attack that takes advantage of model updates in a related machine-unlearning setting. We perform the first measurements of the impact of distribution shift on MI attacks with model updates, and show that a more drastic distribution shift results in significantly higher MI risk than a gradual shift. Our code is available on GitHub.

In practice, updated models are released repeatedly, for example to mobile devices or to serve predictions around the world. This paper investigates the threat of repeated model updates for an attacker who monitors their releases and wishes to infer membership of specific samples in each update dataset. We formalize the problem of membership inference under repeated model updates in a way that supports a wide range of model update procedures, sizes of update batches, and distribution shift in the new data (Section 3).
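As a rough illustration of the setting being formalized (the names and structure below are assumptions for exposition, not the paper's notation): the attacker has black-box query access to the original model, trained on some dataset, and to an updated model trained after an update batch is incorporated, and must decide whether a target sample belongs to that update batch.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class UpdateMISetting:
    """Toy sketch of MI under a single model update (assumed names).

    query_orig: black-box access to the original model M0 (trained on D).
    query_upd:  black-box access to the updated model M1 (trained on D
                plus the update batch U).
    """
    query_orig: Callable[[Any], float]
    query_upd: Callable[[Any], float]

    def attacker_view(self, x: Any) -> tuple:
        # The attacker sees only the two models' outputs on the target
        # sample; the attack must decide membership in U from these alone.
        return self.query_orig(x), self.query_upd(x)

# Stand-in models: the updated model is more confident on the target,
# which is exactly the signal an update-aware attack can exploit.
setting = UpdateMISetting(query_orig=lambda x: 0.2, query_upd=lambda x: 0.95)
print(setting.attacker_view("target-sample"))
```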
Geared toward this problem, we develop the first black-box MI attack algorithms that combine information from previously known standalone MI attacks—such as the state-of-the-art LiRA attack [5]—to let the adversary take advantage of access to both the original model and one or more updated models to improve MI on the update set (Section 4). Our algorithms compute the standalone attack's confidence scores separately against the original model and against the updated model(s), and combine them to obtain a confidence score for membership in the update set. We justify the need to use detailed confidence-score information by showing that combining only the binary membership decisions does not increase the attacker's power. We consider two different methods for combining scores, each motivated analytically by the study of a simple example. Our analysis and experiments demonstrate that the best choice of score will depend on the specific learning algorithm being attacked. We evaluate our attacks on four datasets—FMNIST, CIFAR-10, Purchase100, and IMDb—using neural networks.
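The score-combination step can be sketched as follows. The two combination rules here (difference and ratio) are illustrative assumptions rather than the paper's exact functions, and the inputs stand in for the confidence scores of any standalone attack, such as LiRA, evaluated against each model.

```python
def combine_scores(score_orig: float, score_upd: float,
                   method: str = "diff") -> float:
    """Combine standalone MI confidence scores computed against the
    original and updated models into a single score for membership in
    the update set. Both rules below are illustrative assumptions."""
    if method == "diff":
        # Update-set members should score low against the original model
        # (they were absent from its training data) and high against the
        # updated model, so the difference is large for members.
        return score_upd - score_orig
    if method == "ratio":
        # Multiplicative variant; the small constant avoids division by zero.
        return score_upd / (score_orig + 1e-12)
    raise ValueError(f"unknown method: {method}")

# A sample scoring low on the original model and high on the updated one
# receives a high combined score, i.e. is flagged as an update-set member.
print(combine_scores(0.1, 0.9))
print(combine_scores(0.1, 0.9, method="ratio"))
```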