FedDANE: A Federated Newton-Type Method (2019-11-01)

TL;DR

This work proposes FedDANE, an optimization method that is adapted from DANE, a method for classical distributed optimization, to handle the practical constraints of federated learning, and provides convergence guarantees for this method when learning over both convex and non-convex functions.

Abstract

Federated learning aims to jointly learn statistical models over massively distributed remote devices. In this work, we propose FedDANE, an optimization method that we adapt from DANE [8], [9], a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions. Despite encouraging theoretical results, we find that the method has underwhelming performance empirically. In particular, through empirical simulations on both synthetic and real-world datasets, FedDANE consistently underperforms baselines of FedAvg [7] and FedProx [4] in realistic federated settings. We identify low device participation and statistical device heterogeneity as two underlying causes of this underwhelming performance, and conclude by suggesting several directions of future work.
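The abstract does not spell out the update rule, so the following is only a minimal sketch of a DANE-style federated round, assuming the usual gradient-corrected proximal subproblem: each selected device k minimizes F_k(w) + (g - grad F_k(w_prev)) * (w - w_prev) + (mu/2) * (w - w_prev)^2, where g is an average of sampled device gradients at w_prev. The scalar quadratic losses F_k(w) = 0.5 * a_k * (w - b_k)^2, the function name `feddane_scalar`, and all parameter names are hypothetical choices made for illustration. With these losses the subproblem has the closed form w_prev - g / (a_k + mu).

```python
import random


def feddane_scalar(a, b, mu, rounds, devices_per_round, seed=0):
    """Toy DANE-style federated loop on scalar quadratics (illustrative only).

    Device k holds F_k(w) = 0.5 * a[k] * (w - b[k])**2. These losses and the
    two-sample-per-round protocol are assumptions, not taken from the abstract.
    """
    rng = random.Random(seed)
    n = len(a)
    w = 0.0
    for _ in range(rounds):
        # Step 1: sample devices and average their local gradients at w.
        grad_devices = rng.sample(range(n), devices_per_round)
        g = sum(a[k] * (w - b[k]) for k in grad_devices) / devices_per_round

        # Step 2: sample devices again; each solves its corrected proximal
        # subproblem, which for a quadratic reduces to a Newton-like step.
        update_devices = rng.sample(range(n), devices_per_round)
        local_solutions = [w - g / (a[k] + mu) for k in update_devices]

        # Step 3: the server averages the local solutions.
        w = sum(local_solutions) / devices_per_round
    return w


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0]
    b = [0.0, 1.0, 2.0]
    # Full participation: all three devices take part in every round.
    print(feddane_scalar(a, b, mu=1.0, rounds=50, devices_per_round=3))
    # Low participation: one device per step, so g and the update can come
    # from different devices with different local optima.
    print(feddane_scalar(a, b, mu=1.0, rounds=50, devices_per_round=1))
```

With full participation, the iterate in this toy setup approaches the global minimizer sum(a_k * b_k) / sum(a_k) = 4/3. With one device per step, the gradient estimate and the local solve may come from devices with different optima, which gives a small-scale view of why low participation and heterogeneity can hurt such a method.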

Authors

M. Zaheer

Ameet Talwalkar

Tian Li

References (19 items)

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning

Federated Learning: Challenges, Methods, and Future Directions

On the Convergence of FedAvg on Non-IID Data

Federated Optimization in Heterogeneous Networks

LEAF: A Benchmark for Federated Settings

Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms

Parallel Restarted SGD for Non-Convex Optimization with Faster Convergence and Less Communication

Local SGD Converges Fast and Communicates Little

Adaptive Federated Learning in Resource Constrained Edge Computing Systems

On the convergence properties of a K-step averaging stochastic gradient descent algorithm for nonconvex optimization

Federated Multi-Task Learning

CoCoA: A General Framework for Communication-Efficient Distributed Optimization

AIDE: Fast and Communication Efficient Distributed Optimization

Communication-Efficient Learning of Deep Networks from Decentralized Data

Deep learning with Elastic Averaging SGD

Communication-Efficient Distributed Optimization using an Approximate Newton-type Method

Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming

TensorFlow: A System for Large-Scale Machine Learning (Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, OSDI '16)
