3260 papers • 126 benchmarks • 313 datasets
The generation of tabular data by any means possible.
This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.
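As a rough illustration of the WGAN training signal summarized above, the sketch below computes a critic loss as the difference between the mean critic score on generated samples and on real samples, then clips the critic weights to a small interval. The toy linear critic, the function names, and the clip value of 0.01 are assumptions made for illustration; a real critic would be a neural network trained with an autograd framework.

```python
from typing import List, Sequence


def critic_score(weights: Sequence[float], x: Sequence[float]) -> float:
    """Toy linear critic f(x) = w . x (an assumption; real critics are networks)."""
    return sum(w * xi for w, xi in zip(weights, x))


def wgan_critic_loss(
    weights: Sequence[float],
    real_batch: Sequence[Sequence[float]],
    fake_batch: Sequence[Sequence[float]],
) -> float:
    """Loss the critic minimizes: mean f(fake) - mean f(real)."""
    real_mean = sum(critic_score(weights, x) for x in real_batch) / len(real_batch)
    fake_mean = sum(critic_score(weights, x) for x in fake_batch) / len(fake_batch)
    return fake_mean - real_mean


def clip_weights(weights: Sequence[float], clip: float = 0.01) -> List[float]:
    """Clamp every weight to [-clip, clip] after a critic update."""
    return [max(-clip, min(clip, w)) for w in weights]


# Usage: clip the weights, then score one real and one generated sample.
weights = clip_weights([0.5, -0.3, 0.005])
loss = wgan_critic_loss(weights, [[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]])
```

Because the loss is an unbounded score difference rather than a saturating classification loss, its value can be plotted during training as a learning curve.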
This work proposes an alternative to clipping weights: penalizing the norm of the gradient of the critic with respect to its input. The approach performs better than standard WGAN and enables stable training of a wide variety of GAN architectures with almost no hyperparameter tuning.
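The gradient penalty described above can be sketched in a few lines. The version below is a minimal illustration under stated assumptions: gradients are estimated by central differences instead of autograd, the penalty is evaluated at random interpolations between paired real and generated samples, the target norm is 1, and the coefficient `lam` defaults to 10. All function names are hypothetical.

```python
import math
import random
from typing import Callable, List, Optional, Sequence

Critic = Callable[[Sequence[float]], float]


def numerical_gradient(critic: Critic, x: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Central-difference estimate of the critic's gradient w.r.t. its input."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((critic(hi) - critic(lo)) / (2 * eps))
    return grad


def gradient_penalty(
    critic: Critic,
    real_batch: Sequence[Sequence[float]],
    fake_batch: Sequence[Sequence[float]],
    lam: float = 10.0,
    rng: Optional[random.Random] = None,
) -> float:
    """lam * mean((||grad f(x_hat)||_2 - 1)^2) over interpolated points x_hat."""
    rng = rng or random.Random(0)
    pairs = list(zip(real_batch, fake_batch))
    total = 0.0
    for real, fake in pairs:
        t = rng.random()  # random interpolation coefficient in [0, 1)
        x_hat = [t * r + (1 - t) * f for r, f in zip(real, fake)]
        grad = numerical_gradient(critic, x_hat)
        norm = math.sqrt(sum(g * g for g in grad))
        total += (norm - 1.0) ** 2
    return lam * total / len(pairs)


# Usage: a linear critic whose input gradient has norm 5 is penalized heavily.
steep = lambda x: 3.0 * x[0] + 4.0 * x[1]
penalty = gradient_penalty(steep, [[1.0, 2.0]], [[0.0, -1.0]])
```

In practice the penalty term is added to the critic loss, and the gradient is obtained from the training framework's automatic differentiation rather than finite differences.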
A novel method of generating synthetic question answering corpora is introduced by combining models of question generation and answer extraction, and by filtering the results to ensure roundtrip consistency, establishing a new state-of-the-art on SQuAD2 and NQ.
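The roundtrip-consistency filter mentioned above can be sketched as follows: a generated (context, question, answer) triple is kept only if a question answering model, given the generated question, returns the answer the question was generated from. The `answer_question` callable, the normalized exact-match comparison, and the toy lookup model are assumptions for illustration, not the paper's implementation.

```python
from typing import Callable, Iterable, List, Tuple

Example = Tuple[str, str, str]  # (context, question, answer)


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(text.lower().split())


def roundtrip_filter(
    candidates: Iterable[Example],
    answer_question: Callable[[str, str], str],
) -> List[Example]:
    """Keep triples whose generated question leads back to the original answer."""
    kept = []
    for context, question, answer in candidates:
        predicted = answer_question(context, question)
        if _normalize(predicted) == _normalize(answer):
            kept.append((context, question, answer))
    return kept


# Usage with a toy lookup "model" (hypothetical stand-in for a real QA system).
_toy_answers = {"What is the capital of France?": "Paris"}


def toy_qa(context: str, question: str) -> str:
    return _toy_answers.get(question, "")


context = "Paris is the capital of France."
candidates = [
    (context, "What is the capital of France?", "paris"),
    (context, "Who wrote Hamlet?", "Shakespeare"),
]
consistent = roundtrip_filter(candidates, toy_qa)  # keeps only the first triple
```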
This work explores whether and how generative adversarial networks can encourage data sharing by providing a generic framework for sharing synthetic datasets with minimal expert knowledge. It designs a custom workflow called DoppelGANger, which achieves up to 43% better fidelity than baseline models.
This study presents SinGAN-Seg, a novel synthetic data generation pipeline that produces synthetic medical images with corresponding masks from a single training image. Unlike traditional generative adversarial networks (GANs), the model needs only a single image and its corresponding ground truth to train.
This work presents Clugen, a modular procedure for synthetic data generation that creates multidimensional clusters supported by line segments using arbitrary distributions, and that has the potential to become a widely used framework in diverse clustering-related research tasks.
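To make the idea of a cluster "supported by a line segment" concrete, the sketch below places points uniformly along a segment and then perturbs each coordinate with Gaussian noise. This is a simplified illustration rather than the procedure from the paper: the uniform placement, the isotropic noise, and every name and parameter here are assumptions.

```python
import math
import random
from typing import List, Sequence


def line_cluster(
    center: Sequence[float],
    direction: Sequence[float],
    length: float,
    noise_std: float,
    n_points: int,
    rng: random.Random,
) -> List[List[float]]:
    """Sample points around a segment of the given length centered at `center`."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    points = []
    for _ in range(n_points):
        # Position along the supporting segment.
        t = rng.uniform(-length / 2, length / 2)
        on_line = [c + t * u for c, u in zip(center, unit)]
        # Isotropic Gaussian scatter around that position.
        points.append([p + rng.gauss(0.0, noise_std) for p in on_line])
    return points


# Usage: one elongated 3-D cluster of 200 points.
cluster = line_cluster(
    center=[0.0, 0.0, 0.0],
    direction=[1.0, 1.0, 0.0],
    length=10.0,
    noise_std=0.5,
    n_points=200,
    rng=random.Random(42),
)
```

Swapping `rng.uniform` or `rng.gauss` for other samplers changes the point distribution along or around the segment, which loosely mirrors the "arbitrary distributions" mentioned above.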
This work proposes a novel sequence-to-sequence model for probabilistic human motion prediction, trained with a modified version of improved Wasserstein generative adversarial networks (WGAN-GP), in which the model learns a probability density function of future human poses conditioned on previous poses.
This work presents a visual localization system that learns to estimate camera poses in the real world with the help of synthetic data, and introduces CrossLoc, a cross-modal visual representation learning approach to pose estimation that makes full use of the scene coordinate ground truth via self-supervision.
This work proposes a convolutional neural network architecture that predicts spatially varying kernels to both align and denoise frames, a synthetic data generation approach based on a realistic noise formation model, and an optimization guided by an annealed loss function to avoid undesirable local minima.
This study proposes two occlusion generation techniques: Naturalistic Occlusion Generation (NatOcc), which produces high-quality naturalistic synthetic occluded faces, and Random Occlusions Generation (RandOcc), a more general method for generating synthetic occluded data.
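A generic random-occlusion augmentation can be sketched as below: a rectangle of random size and position is painted over the image, and the same pixels are cleared in the segmentation mask so that the label stays consistent with the occluded input. This is an illustrative assumption about how such augmentation might look, not the RandOcc method itself; the function name, the rectangle shape, and the `max_frac` and `fill` parameters are hypothetical.

```python
import random
from typing import List, Tuple

Grid = List[List[int]]


def add_random_occlusion(
    image: Grid,
    mask: Grid,
    rng: random.Random,
    max_frac: float = 0.5,
    fill: int = 0,
) -> Tuple[Grid, Grid]:
    """Return copies of image and mask with one random rectangle occluded."""
    height, width = len(image), len(image[0])
    occ_h = rng.randint(1, max(1, int(height * max_frac)))
    occ_w = rng.randint(1, max(1, int(width * max_frac)))
    top = rng.randint(0, height - occ_h)
    left = rng.randint(0, width - occ_w)

    out_image = [row[:] for row in image]
    out_mask = [row[:] for row in mask]
    for r in range(top, top + occ_h):
        for c in range(left, left + occ_w):
            out_image[r][c] = fill  # paint the occluder
            out_mask[r][c] = 0      # occluded pixels no longer count as foreground
    return out_image, out_mask


# Usage: occlude an 8x8 grayscale image whose mask marks every pixel as foreground.
image = [[255] * 8 for _ in range(8)]
mask = [[1] * 8 for _ in range(8)]
occluded_image, occluded_mask = add_random_occlusion(image, mask, random.Random(0))
```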