Research Connect

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Published in ACM Computing Surveys (2021-06-16)

On This Page

  • TL;DR
  • Abstract
  • Authors
  • Datasets
  • References

TL;DR

This is the first comprehensive survey of the efficient deep learning space: it covers the landscape of model efficiency, from modeling techniques to hardware support, along with the seminal work in each area.

Abstract

Deep learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, information retrieval, and more. However, with the progressive improvements in deep learning models, their number of parameters, latency, and resources required to train, among others, have all increased significantly. Consequently, it has become important to pay attention to these footprint metrics of a model as well, not just its quality. We present and motivate the problem of efficiency in deep learning, followed by a thorough survey of the five core areas of model efficiency (spanning modeling techniques, infrastructure, and hardware) and the seminal work there. We also present an experiment-based guide along with code for practitioners to optimize their model training and deployment. We believe this is the first comprehensive survey in the efficient deep learning space that covers the landscape of model efficiency from modeling techniques to hardware support. It is our hope that this survey would provide readers with the mental model and the necessary understanding of the field to apply generic efficiency techniques to immediately get significant improvements, and also equip them with ideas for further research and experimentation to achieve additional gains.
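The experiment-based guide and code mentioned in the abstract are not reproduced on this page. Purely as an illustration of the kind of generic efficiency technique the abstract refers to, the sketch below applies post-training dynamic quantization with PyTorch to a small stand-in model; the model architecture, the quantized layer set, and the int8 dtype are assumptions made for this example, not the paper's released code.

```python
import io

import torch
import torch.nn as nn

# Stand-in float32 model; any trained model with nn.Linear layers works the same way.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Post-training dynamic quantization: weights of the listed module types are stored
# as int8 and dequantized on the fly at inference time; activations stay in float.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

def serialized_size_mb(m: nn.Module) -> float:
    """Rough footprint check: size of the serialized state_dict in megabytes."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"float32 model:         {serialized_size_mb(model):.2f} MB")
print(f"dynamically quantized: {serialized_size_mb(quantized):.2f} MB")
```

For the quantized linear layers, storing weights as int8 shrinks their footprint by roughly 4x relative to float32 and typically speeds up CPU inference with little quality loss, which is the sort of immediate gain from generic techniques that the abstract describes.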

Authors

Gaurav Menghani

1 Paper

Datasets

CIFAR-10

Canadian Institute for Advanced Research, 10 classes

ImageNet

References (180 items)

1

PyTorch: An Imperative Style, High-Performance Deep Learning Library

2

Deep Residual Learning for Image Recognition

3

ImageNet: A large-scale hierarchical image database

4

TensorFlow: A System for Large-Scale Machine Learning (OSDI '16)


Research Impact: 273 citations, 180 references, 2 datasets

5

Xception: Deep Learning with Depthwise Separable Convolutions

6

Going deeper with convolutions

7

ImageNet classification with deep convolutional neural networks

8

Gradient-based learning applied to document recognition

9

High-Performance Neural Networks for Visual Object Classification

10

Learning both Weights and Connections for Efficient Neural Network

11

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

12

Big Self-Supervised Models are Strong Semi-Supervised Learners

13

Language Models are Few-Shot Learners

14

A Simple Framework for Contrastive Learning of Visual Representations

15

Self-Training With Noisy Student Improves ImageNet Classification

16

Unsupervised Representation Learning by Predicting Image Rotations

17

mixup: Beyond Empirical Risk Minimization

18

Revisiting Unreasonable Effectiveness of Data in Deep Learning Era

19

Unsupervised Visual Representation Learning by Context Prediction

20

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

21

On Bayesian Methods for Seeking the Extremum

22

Attention is All you Need

23

SMOTE: Synthetic Minority Over-sampling Technique

24

Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

25

Random Search for Hyper-Parameter Optimization

26

Multiple Classifier Systems

27

MnasNet: Platform-Aware Neural Architecture Search for Mobile

28

Learning Multiple Layers of Features from Tiny Images

29

GloVe: Global Vectors for Word Representation

30

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations

31

Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding

32

Efficient Neural Architecture Search via Parameter Sharing

33

Large-scale deep unsupervised learning using graphics processors

34

Very Deep Convolutional Networks for Large-Scale Image Recognition

35

DARTS: Differentiable Architecture Search

36

The Bottleneck

37

Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges

38

Distilling the Knowledge in a Neural Network

39

Randaugment: Practical automated data augmentation with a reduced search space

40

Learning Transferable Architectures for Scalable Image Recognition

41

AutoAugment: Learning Augmentation Strategies From Data

42

Data Augmentation by Pairing Samples for Images Classification

43

An Attentive Survey of Attention Models

44

Neural Machine Translation by Jointly Learning to Align and Translate

45

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

46

Optimal Brain Damage

47

Efficient Transformers: A Survey

48

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

49

Neural Architecture Search with Reinforcement Learning

50

MobileNetV2: Inverted Residuals and Linear Bottlenecks

51

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

52

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

53

Hierarchical Text-Conditional Image Generation with CLIP Latents

54

PaLM: Scaling Language Modeling with Pathways

55

The Efficiency Misnomer

56

MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer

57

A comprehensive survey on optimizing deep learning models by metaheuristics

58

Distilling Large Language Models into Tiny and Effective Students using pQRNN

59

Amazon SageMaker Automatic Model Tuning: Scalable Black-box Optimization

60

Characterising Bias in Compressed Models

61

A Survey on Deep Neural Network Compression: Challenges, Overview, and Solutions

62

Towards Accurate Post-training Network Quantization via Bit-Split and Stitching

63

Neural Structured Learning: Training Neural Networks with Structured Signals

64

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

65

Exploring Bayesian Optimization

66

Training with Quantization Noise for Extreme Model Compression

67

ProFormer: Towards On-Device LSH Projection Based Transformers

68

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

69

Hyper-Parameter Optimization: A Review of Algorithms and Applications

70

Multi-modal Self-Supervision from Generalized Data Transformations

71

TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers

72

Rigging the Lottery: Making All Tickets Winners

73

Fast Sparse ConvNets

74

Mastering Atari, Go, chess and shogi by planning with a learned model

75

Learning from a Teacher using Unlabeled Data

76

PRADO: Projection Attention Networks for Document Classification On-Device

77

Sparse Networks from Scratch: Faster Training without Losing Performance

78

Transferable Neural Projection Representations

79

Searching for MobileNetV3

80

Billion-scale semi-supervised learning for image classification

81

MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning

82

The State of Sparsity in Deep Neural Networks

83

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

84

Microsoft Research

85

Rethinking the Value of Network Pruning

86

SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation

87

Tune: A Research Platform for Distributed Model Selection and Training

88

MONAS: Multi-Objective Neural Architecture Search using Reinforcement Learning

89

Quantizing deep convolutional networks for efficient inference: A whitepaper

90

Glow: Graph Lowering Compiler Techniques for Neural Networks

91

The Lottery Ticket Hypothesis: Training Pruned Neural Networks

92

Model compression via distillation and quantization

93

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

94

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

95

Regularized Evolution for Image Classifier Architecture Search

96

Universal Language Model Fine-tuning for Text Classification

97

Advances in Pre-Training Distributed Word Representations

98

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

99

Progressive Neural Architecture Search

100

Population Based Training of Neural Networks

101

To prune, or not to prune: exploring the efficacy of pruning for model compression

102

Google Vizier: A Service for Black-Box Optimization

103

ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections

104

On Compressing Deep Models by Low Rank and Sparse Decomposition

105

Learning to Prune Deep Neural Networks via Layer-wise Optimal Brain Surgeon

106

In-datacenter performance analysis of a tensor processing unit

107

Efficient Processing of Deep Neural Networks: A Tutorial and Survey

108

Pruning Convolutional Neural Networks for Resource Efficient Transfer Learning

109

Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer

110

Quasi-Recurrent Neural Networks

111

Pruning Convolutional Neural Networks for Resource Efficient Inference

112

Adaptive data augmentation for image classification

113

Pruning Filters for Efficient ConvNets

114

Ternary Weight Networks

115

Do Deep Convolutional Nets Really Need to be Deep and Convolutional?

116

XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks

117

Structured Pruning of Deep Convolutional Neural Networks

118

Non-stochastic Best Arm Identification and Hyperparameter Optimization

119

Deep Speech: Scaling up end-to-end speech recognition

120

Sequence to Sequence Learning with Neural Networks

121

Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation

122

Model compression

123

Best practices for convolutional neural networks applied to visual document analysis

124

Similarity estimation techniques from rounding algorithms

125

Torch

126

Optimal Brain Surgeon and general network pruning

127

Neural Network Ensembles

128

Estimating PaLM's training cost (blog.heim)

129

Automatic Mixed Precision examples — PyTorch 1.8.1 documentation

130

TensorFlow models on the Edge TPU | Coral

131

XNNPACK backend for TensorFlow

132

Contributors to Wikimedia projects

133

advisor

134

Cloud TPU | Google Cloud

135

What makes TPUs fine-tuned for deep learning? | Google Cloud Blog

136

XNNPACK Authors

137

Edge TPU performance benchmarks | Coral

138

BFloat16: The Secret to High Performance on Cloud TPUs | Google Cloud Blog

139

Matrix Compression Operator

140

TensorFlow 2 MLPerf submissions demonstrate best-in-class performance on Google Cloud

141

Neural Networks API | Android NDK | Android Developers

142

The Illustrated Transformer

143

Performance Tuning Guide — PyTorch Tutorials 1.8.1+cu102 documentation

144

Accelerate | Apple Developer Documentation

145

Setting the learning rate of your neural network. Jeremy Jordan (Aug 2020)

146

GTC 2020: Accelerating Sparsity in the NVIDIA Ampere Architecture

147

Automating data augmentation: Practice, theory and new direction

148

Inside Volta: The World’s Most Advanced Data Center GPU | NVIDIA Developer Blog

149

Training Neural Networks with Tensor Cores

150

QNNPACK: Open Source Library for Optimized Mobile Deep Learning—Facebook Engineering

151

TensorFlow Model Optimization Toolkit — Post-Training Integer Quantization

152

Review: Xception — With Depthwise Separable Convolution, Better Than Inception-v3 (Image Classification)

153

Pixel 4 is here to help

154

Self-Governing Neural Networks for On-Device Short Text Classification

155

Yann LeCun @EPFL - "Self-supervised learning: could machines learn like humans?"

156

Binarized Neural Networks

157

Neural Machine Translation Systems for WMT 16

158

Ran El-Yaniv, and Yoshua Bengio

159

TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems

160

Google for Research

161

Improving the speed of neural networks on CPUs

162

google,我,萨娜

163

Neural Network Ensembles, Cross Validation, and Active Learning

164

Why systolic architectures?

165

Algorithms for VLSI Processor Arrays

166

Introduction to VLSI systems

167

Towards Unsupervised Speech Processing (The 11th International Conference on Information Sciences, Signal Processing and their Applications: Main Tracks)

168

AVX-512 (Wikipedia)

169

TensorFlow Lite | ML for Mobile and Edge Devices

170

Multiply-Accumulate Operation (Wikipedia)

171

NVIDIA Embedded Systems for Next-Gen Autonomous Machines

172

The Keras Blog

173

Model Optimization | TensorFlow Lite

174

Blazingly Fast Computer Vision Training with the Mosaic ResNet and Composer

175

XLA: Optimizing Compiler for Machine Learning | TensorFlow

176

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

177

Evaluation Strategy: This defines how we evaluate a model for fitness. It can simply be a conventional metric like validation loss or accuracy, or it can be a compound metric,

178

Gaurav

179

FC layers ignore the spatial information of the input pixels

180

SIMD ISAs | Neon-Arm Developer


Field of Study

Computer Science

Journal Information

Name

ACM Computing Surveys

Volume

55

Venue Information

Name

ACM Computing Surveys

Type

journal

URL

http://www.acm.org/pubs/surveys/

Alternate Names

  • ACM Comput Surv