3260 papers • 126 benchmarks • 313 datasets
A task in which an agent plays the $DE$ role and generates a text response to a $P$ message.
These leaderboards are used to track progress in Response Generation.
Use these libraries to find Response Generation models and implementations.
A neural network-based generative architecture with stochastic latent variables spanning a variable number of time steps is proposed; it improves upon recently proposed models, and its latent variables facilitate the generation of meaningful, long, and diverse responses while maintaining dialogue state.
This work proposes using Maximum Mutual Information (MMI) as the objective function in neural models, and demonstrates that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
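The MMI idea above can be illustrated with a reranking sketch: candidate responses are scored by log p(T|S) minus a penalty proportional to log p(T), so generic replies with high unconditional likelihood are demoted. This is a minimal, hedged sketch; the toy log-probability tables stand in for real seq2seq and language-model scores, and the weight `lam` is an illustrative hyperparameter.

```python
def mmi_rerank(candidates, log_p_t_given_s, log_p_t, lam=0.5):
    """Rerank candidate responses by an MMI-style objective:
    score(T) = log p(T|S) - lam * log p(T).

    candidates: list of response strings
    log_p_t_given_s: dict mapping response -> log p(T|S) (seq2seq model score)
    log_p_t: dict mapping response -> log p(T) (language-model score)
    lam: weight on the anti-language-model penalty (illustrative value)
    """
    scored = [(c, log_p_t_given_s[c] - lam * log_p_t[c]) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy example: the generic reply has high p(T|S) but also high unconditional
# p(T), so the MMI penalty lets the specific reply win the reranking.
cands = ["i don't know", "the meeting is at noon"]
lp_ts = {"i don't know": -2.0, "the meeting is at noon": -3.0}
lp_t = {"i don't know": -1.0, "the meeting is at noon": -6.0}

ranking = mmi_rerank(cands, lp_ts, lp_t, lam=0.5)
```

With these toy scores, the specific response ranks first because its anti-language-model bonus (-0.5 × -6.0) outweighs its lower conditional likelihood.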
A new Unified pre-trained Language Model (UniLM) is presented that can be fine-tuned for both natural language understanding and generation tasks and that compares favorably with BERT on the GLUE benchmark and on the SQuAD 2.0 and CoQA question answering tasks.
This work proposes MAsked Sequence to Sequence pre-training (MASS) for encoder-decoder based language generation tasks, which achieves state-of-the-art accuracy on unsupervised English-French translation, even beating the early attention-based supervised model.
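The masked sequence-to-sequence setup can be sketched in a few lines: one contiguous fragment of the input is replaced by mask tokens on the encoder side, and the decoder is trained to predict only that fragment. This is an assumed, simplified view of the masking step, not the full MASS training procedure; the `[MASK]` token string and the function name are illustrative.

```python
MASK = "[MASK]"  # illustrative mask token string

def mass_mask(tokens, start, length):
    """Mask tokens[start:start+length] in the encoder input.

    Returns (encoder_input, decoder_target): the encoder sees the sentence
    with a contiguous span replaced by mask tokens; the decoder target is
    exactly that masked span.
    """
    target = tokens[start:start + length]
    encoder_input = tokens[:start] + [MASK] * length + tokens[start + length:]
    return encoder_input, target

enc, tgt = mass_mask(["the", "cat", "sat", "on", "the", "mat"], start=1, length=3)
# enc == ["the", "[MASK]", "[MASK]", "[MASK]", "the", "mat"]
# tgt == ["cat", "sat", "on"]
```

Predicting a contiguous span (rather than scattered tokens) forces the decoder to model dependencies within the generated fragment, which is the motivation for span masking in seq2seq pre-training.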
It is shown that automatic metrics provide better guidance than humans in discriminating system-level performance on Text Summarization and Controlled Generation tasks, and that a multi-aspect human-aligned metric (UniEval) is not necessarily dominant over single-aspect human-aligned metrics (CTC, CtrlEval) and task-agnostic metrics (BLEU, BERTScore), particularly on Controlled Generation tasks.
A new class of models called multiresolution recurrent neural networks, which explicitly model natural language generation at multiple levels of abstraction, is introduced; these models outperform competing models by a substantial margin and generate more fluent, relevant, and goal-oriented responses.
This work introduces the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains, and presents a schema-guided paradigm for task-oriented dialogue in which predictions are made over a dynamic set of intents and slots provided as input.
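The schema-guided paradigm can be made concrete with a small example: instead of hard-coding intents and slots, the model receives a service schema as input and makes predictions over whatever intents and slots it declares. The schema below is a hypothetical, trimmed-down example in the spirit of SGD; the field names and the helper function are illustrative, not the dataset's exact format.

```python
# Hypothetical service schema: intents and slots arrive as input data,
# so the same model can serve new services without retraining.
restaurant_schema = {
    "service_name": "Restaurants",
    "intents": [
        {"name": "ReserveRestaurant", "required_slots": ["restaurant_name", "time"]},
        {"name": "FindRestaurants", "required_slots": ["city"]},
    ],
    "slots": [
        {"name": "restaurant_name", "description": "Name of the restaurant"},
        {"name": "city", "description": "City where the restaurant is located"},
        {"name": "time", "description": "Reservation time"},
    ],
}

def required_slots(schema, intent_name):
    """Look up an intent's required slots from the dynamic schema."""
    for intent in schema["intents"]:
        if intent["name"] == intent_name:
            return intent["required_slots"]
    return []
```

Because the schema is plain data, adding a new domain means supplying a new schema rather than changing the model's output space.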