Visual Entailment (VE) is a task over image-sentence pairs in which the premise is an image rather than a natural-language sentence, as in traditional Textual Entailment. The goal is to predict whether the image semantically entails the text.
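As a rough illustration of the task interface, the sketch below classifies an (image premise, text hypothesis) pair into the three labels commonly used in VE benchmarks such as SNLI-VE (entailment, neutral, contradiction). The encoders and the classifier head here are random placeholders, not a real trained model; real systems substitute learned vision-language encoders.

```python
import numpy as np

# Toy 3-way visual-entailment classifier. All weights below are random
# placeholders standing in for learned image/text encoders and a trained head.

LABELS = ["entailment", "neutral", "contradiction"]
rng = np.random.default_rng(0)

def encode_image(image, dim=8):
    """Placeholder image encoder: mean-pool pixels, randomly project to `dim`."""
    pooled = image.mean(axis=(0, 1))                    # (channels,)
    proj = rng.standard_normal((pooled.size, dim))
    return pooled @ proj

def encode_text(tokens, dim=8):
    """Placeholder text encoder: hash tokens into a bag-of-words vector."""
    vec = np.zeros(dim)
    for t in tokens:
        vec[hash(t) % dim] += 1.0
    return vec

def classify(image, hypothesis):
    """Fuse the premise (image) and hypothesis (text), score the 3 VE labels."""
    fused = np.concatenate([encode_image(image), encode_text(hypothesis)])
    w = rng.standard_normal((fused.size, len(LABELS)))  # untrained head
    logits = fused @ w
    probs = np.exp(logits - logits.max())               # stable softmax
    probs /= probs.sum()
    return dict(zip(LABELS, probs))

scores = classify(np.ones((4, 4, 3)), ["a", "dog", "on", "grass"])
```

The output is a probability distribution over the three labels; with random weights the prediction itself is meaningless, and only the input/output shape of the task is being demonstrated.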
These leaderboards are used to track progress in Visual Entailment.
Use these libraries to find Visual Entailment models and implementations.
No datasets available.
No subtasks available.