3260 papers • 126 benchmarks • 313 datasets
Image-text retrieval is the task of finding images that match a textual description, or finding textual descriptions that match a given image. It is an interdisciplinary area that draws on computer vision, natural language processing (NLP), and machine learning. The aim is to bridge the semantic gap between the visual content of images and the language people use to describe and interpret them.
(Image credit: Papersgraph)
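The two retrieval directions described above can be illustrated with a minimal sketch. It assumes that images and captions have already been mapped into one shared embedding space, and that relevance is scored by cosine similarity; this is one common setup, not the only one. The three-dimensional vectors below are made up for illustration (a real system would obtain them from trained image and text encoders), and the helper names `cosine_similarity` and `rank_candidates` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query: Vector, candidates: Sequence[Vector]) -> List[int]:
    """Return candidate indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (assumed, not produced by any real model).
image_embeddings = [
    [0.90, 0.10, 0.00],  # image 0
    [0.00, 1.00, 0.10],  # image 1
    [0.10, 0.00, 0.95],  # image 2
]
text_embeddings = [
    [0.05, 0.90, 0.00],  # caption 0, written to lie near image 1
    [0.00, 0.10, 1.00],  # caption 1, written to lie near image 2
    [1.00, 0.00, 0.10],  # caption 2, written to lie near image 0
]

# Text-to-image retrieval: each caption is a query over the image set.
text_to_image = [rank_candidates(t, image_embeddings) for t in text_embeddings]

# Image-to-text retrieval: each image is a query over the caption set.
image_to_text = [rank_candidates(v, text_embeddings) for v in image_embeddings]

print("best image per caption:", [ranking[0] for ranking in text_to_image])
print("best caption per image:", [ranking[0] for ranking in image_to_text])
```

The same ranking function serves both directions because the query and the candidates live in the same space; only their roles are swapped. At larger scale, the exhaustive scoring loop would typically be replaced by an approximate nearest-neighbour index.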