Empirical observation of negligible fairness–accuracy trade-offs in machine learning for public policy (2020-12-05)

TL;DR

An empirical study examines the trade-off between improving fairness and preserving model accuracy across several social policy areas and finds it to be negligible in practice.

Abstract

The growing use of machine learning in policy and social impact settings has raised concerns over fairness implications, especially for racial minorities. These concerns have generated considerable interest among machine learning and artificial intelligence researchers, who have developed new methods and established theoretical bounds for improving fairness, focusing on the source data, regularization and model training, or post-hoc adjustments to model scores. However, few studies have examined the practical trade-offs between fairness and accuracy in real-world settings to understand how these bounds and methods translate into policy choices and impact on society. Our empirical study fills this gap by investigating the impact of mitigating disparities on accuracy, focusing on the common context of using machine learning to inform benefit allocation in resource-constrained programmes across education, mental health, criminal justice and housing safety. Here we describe applied work in which we find fairness–accuracy trade-offs to be negligible in practice. In each setting studied, explicitly focusing on equity and applying our proposed post-hoc disparity mitigation methods substantially improved fairness without sacrificing accuracy. This observation was robust across the policy contexts studied, the scale of resources available for intervention, time and the relative size of the protected groups. These empirical results challenge a commonly held assumption that reducing disparities requires either accepting an appreciable drop in accuracy or developing novel, complex methods, which makes reducing disparities in these applications more practical.

A growing number of researchers are developing approaches to improve fairness in machine learning applications in areas such as healthcare, employment and social services, to avoid propagating and amplifying racial and other inequities.
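The abstract refers to post-hoc adjustments to model scores under a fixed intervention budget but does not spell out a procedure. The following is a minimal sketch of one possible adjustment, not necessarily the one used in the study: it fills a fixed number of intervention slots by repeatedly taking the next highest-scoring individual from whichever group currently has the lowest recall. The function names, the record layout, the choice of recall as the disparity metric and the toy data are all assumptions made for illustration. Labels are assumed to come from a held-out historical cohort in which outcomes are already known. The toy numbers are invented and say nothing about the study's results.

```python
from collections import defaultdict

# Hypothetical toy cohort: model score, protected group, known outcome label.
records = [
    {"score": 0.90, "group": "A", "label": 1},
    {"score": 0.80, "group": "A", "label": 1},
    {"score": 0.70, "group": "A", "label": 1},
    {"score": 0.60, "group": "A", "label": 0},
    {"score": 0.50, "group": "A", "label": 1},
    {"score": 0.40, "group": "A", "label": 0},
    {"score": 0.65, "group": "B", "label": 1},
    {"score": 0.55, "group": "B", "label": 1},
    {"score": 0.45, "group": "B", "label": 0},
    {"score": 0.35, "group": "B", "label": 1},
    {"score": 0.30, "group": "B", "label": 0},
    {"score": 0.20, "group": "B", "label": 0},
]


def top_k(records, k):
    """Baseline: select the k highest-scoring individuals, ignoring group."""
    return sorted(records, key=lambda r: r["score"], reverse=True)[:k]


def precision(selected):
    """Share of selected individuals whose label is positive."""
    return sum(r["label"] for r in selected) / len(selected)


def recall_by_group(selected, records):
    """Per-group share of positives that were selected."""
    positives, hits = defaultdict(int), defaultdict(int)
    for r in records:
        positives[r["group"]] += r["label"]
    for r in selected:
        hits[r["group"]] += r["label"]
    return {g: hits[g] / n for g, n in positives.items() if n > 0}


def equalized_recall_selection(records, k):
    """Greedy post-hoc selection (illustrative assumption, see lead-in):
    fill k slots, each time drawing the next highest-scoring individual
    from the group whose recall is currently lowest."""
    queues, positives = defaultdict(list), defaultdict(int)
    for r in sorted(records, key=lambda r: r["score"], reverse=True):
        queues[r["group"]].append(r)
        positives[r["group"]] += r["label"]
    hits = {g: 0 for g in queues}
    position = {g: 0 for g in queues}

    def current_recall(g):
        # Groups with no positives are never prioritized.
        return hits[g] / positives[g] if positives[g] else 1.0

    selected = []
    while len(selected) < k:
        candidates = [g for g in queues if position[g] < len(queues[g])]
        if not candidates:
            break  # fewer than k individuals available
        g = min(candidates, key=current_recall)
        r = queues[g][position[g]]
        position[g] += 1
        hits[g] += r["label"]
        selected.append(r)
    return selected


if __name__ == "__main__":
    k = 4  # hypothetical intervention budget
    for name, chosen in [("top-k", top_k(records, k)),
                         ("equalized recall", equalized_recall_selection(records, k))]:
        print(name, precision(chosen), recall_by_group(chosen, records))
```

Comparing the precision and per-group recall of the two selections on a held-out cohort is one way to check, for a given application, whether narrowing the recall gap costs any accuracy at the chosen budget.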
