Given a story prefix and two candidate endings, the task is to determine which one is the correct (coherent) ending of the story.
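As a concrete illustration, one simple and deliberately generic baseline scores each candidate ending by its language-model likelihood given the prefix. The sketch below assumes the Hugging Face transformers library and an arbitrary choice of GPT-2 as the scorer; it is not the method of any particular paper listed here.

```python
# Minimal sketch: pick the candidate ending the language model finds
# more probable given the story prefix. GPT-2 is an arbitrary choice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def ending_log_prob(prefix: str, ending: str) -> float:
    """Sum of token log-probabilities of `ending` conditioned on `prefix`."""
    prefix_ids = tokenizer.encode(prefix)
    ending_ids = tokenizer.encode(" " + ending)
    input_ids = torch.tensor([prefix_ids + ending_ids])
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids).logits, dim=-1)
    # logits at position i predict the token at position i + 1
    return sum(
        log_probs[0, len(prefix_ids) + i - 1, tok].item()
        for i, tok in enumerate(ending_ids)
    )

def choose_ending(prefix: str, endings: list[str]) -> str:
    return max(endings, key=lambda e: ending_log_prob(prefix, e))

story = "Anna studied all week for the exam. On Friday she walked into the hall."
print(choose_ending(story, ["She passed with ease.", "She forgot her name."]))
```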
Visual storytelling and story comprehension are uniquely human skills that play a central role in how we learn about and experience the world. Despite remarkable progress in recent years in synthesis of visual and textual content in isolation and learning effective joint visual-linguistic representations, existing systems still operate only at a superficial, factual level. With the goal of developing systems that are able to comprehend rich human-generated narratives, and co-create new stories, we introduce AESOP: a new dataset that captures the creative process associated with visual storytelling. Visual panels are composed of clip-art objects with specific attributes enabling a broad range of creative expression. Using AESOP, we propose foundational storytelling tasks that are generative variants of story cloze tests, to better measure the creative and causal reasoning ability required for visual storytelling. We further develop a generalized story completion framework that models stories as the co-evolution of visual and textual concepts. We benchmark the proposed approach with human baselines and evaluate using comprehensive qualitative and quantitative metrics. Our results highlight key insights related to the dataset, modelling and evaluation of visual storytelling for future research in this promising field of study.
This paper presents a novel Transformer-based conditional variational autoencoder for missing-plot generation that produces story plots better than those of state-of-the-art models in terms of readability, diversity, and coherence.
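For orientation (the paper's exact objective may differ), a conditional VAE of this kind maximizes the conditional evidence lower bound $\log p_\theta(x \mid c) \ge \mathbb{E}_{q_\phi(z \mid x, c)}\left[\log p_\theta(x \mid z, c)\right] - \mathrm{KL}\left(q_\phi(z \mid x, c) \,\|\, p_\theta(z \mid c)\right)$, where $c$ is the observed story context, $x$ the missing plot to generate, and $z$ a latent variable capturing plot-level variation. The KL term keeps the approximate posterior close to a context-conditioned prior, which is what allows diverse plots to be sampled at test time, when $x$ is unavailable.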
This paper uses the recently introduced entmax transformation to train and sample from a natively sparse language model, avoiding the mismatch between training and testing conditions, and proposes three new metrics for comparing sparse or truncated distributions: $\epsilon$-perplexity, sparsemax score, and Jensen-Shannon divergence.
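The core effect is easy to see in isolation. The sketch below assumes the open-source entmax package (pip install entmax) and illustrates the transformation itself, not the paper's training code: unlike softmax, entmax can assign exactly zero probability to low-scoring tokens, so the model trains and samples over the same sparse support.

```python
# Softmax vs. sparse alternatives on the same logits.
import torch
from entmax import entmax15, sparsemax

logits = torch.tensor([4.0, 2.0, 1.0, -1.0, -3.0])

print(torch.softmax(logits, dim=-1))  # dense: every token gets mass > 0
print(entmax15(logits, dim=-1))       # sparse: tail tokens get exactly 0
print(sparsemax(logits, dim=-1))      # sparser still, in general
```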
COINS, a recursive inference framework that iteratively reads context sentences, dynamically generates contextualized inference rules, encodes them, and uses them to guide task-specific output generation, produces better story sentences than state-of-the-art baselines, especially in terms of coherence.
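Schematically, the recursive loop this summary describes looks like the sketch below; every helper here is a hypothetical placeholder, not the COINS implementation.

```python
# Schematic of the recursive read -> generate-rules -> generate-sentence loop.
# `generate_rules` and `generate_sentence` stand in for learned components.
def complete_story(context_sentences, n_missing, generate_rules, generate_sentence):
    story = list(context_sentences)
    for _ in range(n_missing):
        # 1. read the current context and produce contextualized inference rules
        rules = generate_rules(story)
        # 2. use the rules to guide generation of the next story sentence
        next_sentence = generate_sentence(story, rules)
        # 3. recurse: the new sentence becomes part of the context
        story.append(next_sentence)
    return story
```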