Data Free Quantization is a technique for producing a highly accurate quantized model without accessing any training data. Source: Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples
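At its core, quantization maps floating-point tensors to low-bit integers through a scale and zero-point; what makes the data-free setting hard is that the value ranges needed for this mapping, especially for activations, must be estimated without any real inputs. Below is a minimal sketch of uniform affine quantization in PyTorch; the helper names are illustrative, not taken from any particular paper.

```python
import torch

def quantize_affine(x: torch.Tensor, num_bits: int = 8):
    """Uniform affine quantization: map x onto integers in [0, 2^b - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = (qmin - x.min() / scale).round().clamp(qmin, qmax)
    q = (x / scale + zero_point).round().clamp(qmin, qmax)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    return (q - zero_point) * scale

# Weights can be quantized from the checkpoint alone; the data-free
# challenge is calibrating activation ranges without real inputs.
w = torch.randn(64, 128)
q, s, z = quantize_affine(w, num_bits=4)
w_hat = dequantize_affine(q, s, z)
print((w - w_hat).abs().max())  # quantization error shrinks as bits grow
```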
This work introduces a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection, and achieves near-original model performance on common computer vision architectures and tasks.
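One well-known ingredient of such fine-tuning-free approaches is cross-layer range equalization, which rescales adjacent layers without changing the network function so that per-tensor quantization loses less accuracy. A minimal sketch, assuming two linear layers joined by a ReLU (an illustration of the general idea, not necessarily this paper's exact procedure):

```python
import torch

def equalize_pair(w1, b1, w2):
    """Cross-layer range equalization for y = w2 @ relu(w1 @ x + b1).

    ReLU is positively homogeneous, so scaling output channel i of
    layer 1 by 1/s_i and the matching input column of layer 2 by s_i
    leaves the function unchanged while balancing per-channel ranges.
    """
    r1 = w1.abs().amax(dim=1)                 # output-channel ranges, layer 1
    r2 = w2.abs().amax(dim=0)                 # input-channel ranges, layer 2
    s = torch.sqrt(r1 / r2.clamp(min=1e-8)).clamp(min=1e-8)
    return w1 / s[:, None], b1 / s, w2 * s[None, :]

# Sanity check: the equalized pair computes the same function.
w1, b1, w2 = torch.randn(16, 8), torch.randn(16), torch.randn(4, 16)
x = torch.randn(8)
y_ref = w2 @ torch.relu(w1 @ x + b1)
w1e, b1e, w2e = equalize_pair(w1, b1, w2)
y_eq = w2e @ torch.relu(w1e @ x + b1e)
print(torch.allclose(y_ref, y_eq, atol=1e-5))  # True
```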
The proposed method enables mixed-precision quantization without any access to the training or validation data, and it can finish the entire quantization process in under 30 seconds, incurring very low computational overhead.
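A common way this family of methods sidesteps the missing data is to synthesize inputs whose batch statistics match the running statistics stored in the pre-trained model's BatchNorm layers. A minimal sketch of that statistic-matching recipe, using torchvision's pretrained resnet18 as a stand-in model (this illustrates the general idea, not this paper's exact algorithm):

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)

# Hook every BN layer to capture the batch statistics of its input.
stats = []
def bn_hook(module, inputs, output):
    x_in = inputs[0]
    stats.append((module,
                  x_in.mean(dim=(0, 2, 3)),
                  x_in.var(dim=(0, 2, 3), unbiased=False)))

hooks = [m.register_forward_hook(bn_hook)
         for m in model.modules() if isinstance(m, nn.BatchNorm2d)]

x = torch.randn(8, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.1)
for step in range(200):
    stats.clear()
    opt.zero_grad()
    model(x)
    # Pull each layer's observed batch stats toward the stored
    # running_mean / running_var collected during training.
    loss = sum((m - bn.running_mean).pow(2).mean()
               + (v - bn.running_var).pow(2).mean()
               for bn, m, v in stats)
    loss.backward()
    opt.step()

for h in hooks:
    h.remove()
# x now serves as synthetic calibration data for quantization.
```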
This paper proposes a knowledge-matching generator that produces meaningful fake data by exploiting the classification boundary knowledge and distribution information in the pre-trained model, achieving much higher accuracy on 4-bit quantization than existing data-free quantization methods.
This paper proposes Qimera, a method that uses superposed latent embeddings to generate synthetic boundary supporting samples, together with an additional disentanglement mapping layer that extracts information from the full-precision model to better reflect the original distribution.
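The superposition idea can be pictured as mixing the latent embeddings of two classes so the generator produces samples near the decision boundary between them. A toy sketch (the embedding and generator shapes here are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

num_classes, latent_dim = 10, 128
class_emb = nn.Embedding(num_classes, latent_dim)
generator = nn.Sequential(            # stand-in for a real image generator
    nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 3 * 32 * 32))

def boundary_latents(y1, y2, lam):
    """Superpose two class embeddings to target the decision boundary."""
    return lam * class_emb(y1) + (1.0 - lam) * class_emb(y2)

y1 = torch.randint(0, num_classes, (16,))
y2 = torch.randint(0, num_classes, (16,))
lam = torch.rand(16, 1)               # per-sample mixing coefficients
z = boundary_latents(y1, y2, lam)
fake = generator(z + 0.1 * torch.randn_like(z)).view(16, 3, 32, 32)
```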
This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free quantization that mitigates detrimental homogenization by strengthening the loss impact of specific BN layers for different samples and inhibiting the correlation among samples during generation.
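The correlation-inhibiting part can be approximated by directly penalizing pairwise similarity among the samples of a generated batch. A simplified sketch of such a decorrelation loss (a stand-in illustration, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def sample_decorrelation_loss(batch: torch.Tensor) -> torch.Tensor:
    """Penalize pairwise cosine similarity among generated samples
    to counteract homogenization (simplified illustration)."""
    flat = F.normalize(batch.flatten(1), dim=1)      # (N, D), unit norm
    sim = flat @ flat.t()                            # (N, N) cosine matrix
    off_diag = sim - torch.eye(sim.size(0), device=sim.device)
    return off_diag.pow(2).mean()

fake = torch.randn(32, 3, 32, 32, requires_grad=True)
sample_decorrelation_loss(fake).backward()
```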
This paper proposes SQuant, an on-the-fly DFQ framework with sub-second quantization time that can quantize networks on inference-only devices with low computation and memory requirements, and introduces a novel data-free optimization objective in the discrete domain that minimizes the Constrained Absolute Sum of Error (CASE).
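The flavor of that objective can be shown on a single weight vector: start from round-to-nearest, then flip a few roundings so the signed errors cancel and the absolute sum of error shrinks. The toy sketch below is a greatly simplified stand-in for the actual SQuant algorithm:

```python
import numpy as np

def round_with_small_error_sum(w: np.ndarray, scale: float) -> np.ndarray:
    """Round w/scale to integers while driving the *sum* of signed
    rounding errors toward zero (toy CASE-style objective)."""
    v = w / scale
    q = np.round(v)
    err = q - v                        # per-element error in [-0.5, 0.5]
    total = err.sum()
    # Flip the cheapest roundings (error nearest +/-0.5 with the same
    # sign as the total); each flip moves the total by exactly 1.
    n_flips = int(abs(round(total)))
    order = np.argsort(-err * np.sign(total))
    q[order[:n_flips]] -= np.sign(total)
    return q.astype(np.int64)

w = np.random.randn(256)
q = round_with_small_error_sum(w, scale=0.05)
print(abs((q - w / 0.05).sum()))       # <= 0.5 by construction
```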
This paper proposes PSAQ-ViT, a Patch Similarity Aware data-free Quantization framework for Vision Transformers, to enable the generation of "realistic" samples based on the vision transformer's unique properties for calibrating the quantization parameters.
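The patch-similarity signal can be illustrated as the pairwise cosine similarity among a ViT's patch tokens; its spread is then used to judge how "realistic" a synthetic input looks. A simplified sketch of such a metric (a crude proxy; the paper's actual objective is more involved):

```python
import torch
import torch.nn.functional as F

def patch_similarity(tokens: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarity among patch tokens.
    tokens: (N, P, D) patch embeddings from a ViT block."""
    t = F.normalize(tokens, dim=-1)
    return t @ t.transpose(1, 2)                 # (N, P, P)

def similarity_spread(tokens: torch.Tensor) -> torch.Tensor:
    """Crude diversity proxy: std of off-diagonal patch similarities.
    Structured images tend to produce a broader spread than noise."""
    sim = patch_similarity(tokens)
    mask = ~torch.eye(sim.size(-1), dtype=torch.bool, device=sim.device)
    return sim[:, mask].std()

tokens = torch.randn(4, 197, 768)   # e.g. ViT-B/16 token shape at 224x224
print(similarity_spread(tokens))
```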
This paper proposes AIT, a simple yet powerful technique for zero-shot quantization that addresses two problems of prior approaches: it uses a KL distance loss alone, without a cross-entropy loss, and it manipulates gradients to guarantee that a certain portion of weights are properly updated after crossing the rounding thresholds.
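The KL-only part is straightforward to write down: the quantized student matches the full-precision teacher's output distribution, with no cross-entropy term on synthetic labels. A minimal sketch of that loss (the gradient-manipulation step for crossing rounding thresholds is omitted):

```python
import torch
import torch.nn.functional as F

def kl_only_loss(student_logits, teacher_logits, temperature: float = 1.0):
    """KL(teacher || student) on softened logits; no cross-entropy term."""
    log_p = F.log_softmax(student_logits / temperature, dim=-1)
    p_t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(log_p, p_t, reduction="batchmean") * temperature ** 2

student_logits = torch.randn(32, 1000, requires_grad=True)
teacher_logits = torch.randn(32, 1000)
kl_only_loss(student_logits, teacher_logits).backward()
```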
This paper introduces a post-training quantization scheme for zero-shot quantization that produces high-quality quantized networks within a few hours, together with a post-training quantization algorithm that enhances the performance of quantized models; the combination bridges the gap between zero-shot and few-shot quantization while significantly improving quantization performance over existing approaches.