Towards Efficient Data Valuation Based on the Shapley Value (2019-02-27)

TL;DR

This paper proposes a repertoire of efficient algorithms for approximating the Shapley value, a popular notion of value that originated in cooperative game theory, and demonstrates the value of each training instance on various benchmark datasets.

Abstract

"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value that originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, computing the Shapley value often requires exponential time. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.
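To make the computational issue concrete, the following is a minimal Python sketch, not one of the paper's algorithms. It contrasts exact Shapley computation, which enumerates all n! orderings of the data points, with the standard permutation-sampling Monte Carlo baseline. The function names and the three-point `toy_utility` are hypothetical and chosen only for illustration; in data valuation the utility would typically be the performance of a model trained on the given subset.

```python
import itertools
import math
import random


def exact_shapley(n, utility):
    """Exact Shapley values by averaging marginal gains over all n! orderings."""
    values = [0.0] * n
    for perm in itertools.permutations(range(n)):
        coalition = set()
        prev = utility(frozenset(coalition))
        for i in perm:
            coalition.add(i)
            cur = utility(frozenset(coalition))
            values[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [v / math.factorial(n) for v in values]


def monte_carlo_shapley(n, utility, num_permutations, seed=0):
    """Permutation-sampling estimate: average marginal gains over random orderings."""
    rng = random.Random(seed)
    values = [0.0] * n
    order = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = set()
        prev = utility(frozenset(coalition))
        for i in order:
            coalition.add(i)
            cur = utility(frozenset(coalition))
            values[i] += cur - prev
            prev = cur
    return [v / num_permutations for v in values]


def toy_utility(coalition):
    """Hypothetical utility: additive weights plus a bonus when points 0 and 1 appear together."""
    weights = [1.0, 2.0, 3.0]
    total = sum(weights[i] for i in coalition)
    bonus = 1.5 if frozenset({0, 1}) <= coalition else 0.0
    return total + bonus


if __name__ == "__main__":
    print("exact:      ", exact_shapley(3, toy_utility))
    print("monte carlo:", monte_carlo_shapley(3, toy_utility, num_permutations=2000))
```

In this toy game the bonus is shared equally by points 0 and 1, and the values sum to the utility of the full dataset. The exact routine calls the utility n · n! times, which is why approximation becomes necessary as soon as n grows beyond a handful of points.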

Authors

D. Song

Boxin Wang

Bo Li
