Generate referring expressions
This work focuses on incorporating better measures of visual context into referring expression models and finds that visual comparison to other objects within an image significantly improves performance.
Kosmos-2, a Multimodal Large Language Model (MLLM), is introduced, enabling new capabilities of perceiving object descriptions and grounding text to the visual world, and shedding light on the convergence of language, multimodal perception, action, and world modeling.
This paper presents a new approach (NeuralREG), relying on deep neural networks, which makes decisions about form and content in one go without explicit feature extraction, using a delexicalized version of the WebNLG corpus.
The enrichment of the WebNLG corpus is described, with the aim of further extending its usefulness as a resource for evaluating common NLG tasks, including Discourse Ordering, Lexicalization, and Referring Expression Generation.
A profile-based deep neural network model, ProfileREG, is proposed; it encodes both the local context and an external profile of the entity to generate reference realizations, producing each token by learning to choose between generating a pronoun, generating from a fixed vocabulary, or copying a word from the profile.
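The choice between generating a pronoun, generating from a vocabulary, or copying from the profile can be read as a learned gate that mixes three token distributions at each decoding step. Below is a minimal illustrative sketch of such a three-way mixture; the function names and inputs are hypothetical, not ProfileREG's actual API, and the gate logits would come from the decoder state in a real model.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mix_distributions(gate_logits, p_pronoun, p_vocab, p_copy):
    """Mix three per-source token distributions with a learned gate.

    gate_logits: three scores (pronoun, vocab, copy), e.g. from the decoder.
    p_pronoun / p_vocab / p_copy: dicts mapping token -> probability,
    each summing to 1 within its own source.
    Returns one combined token -> probability distribution.
    """
    w_pro, w_voc, w_cop = softmax(gate_logits)
    tokens = set(p_pronoun) | set(p_vocab) | set(p_copy)
    return {t: w_pro * p_pronoun.get(t, 0.0)
             + w_voc * p_vocab.get(t, 0.0)
             + w_cop * p_copy.get(t, 0.0)
            for t in tokens}
```

Because the gate weights sum to 1 and each source distribution sums to 1, the mixture is itself a valid probability distribution, so the model can be trained end to end with ordinary cross-entropy over the mixed output.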
A trainable neural planning component is introduced that can generate effective plans several orders of magnitude faster than the original planner, together with a verification-by-reranking stage that substantially improves the faithfulness of the resulting texts.
Pento-DIARef is presented, a diagnostic dataset in a visual domain of puzzle pieces where referring expressions are generated by a well-known symbolic algorithm (the “Incremental Algorithm”), which itself is motivated by appeal to a hypothesised capability (eliminating distractors through application of Gricean maxims).
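The Incremental Algorithm referenced above selects attributes in a fixed preference order, keeping each one only if it rules out at least one remaining distractor, and stops once the target is uniquely identified. A simplified sketch (omitting details of the full Dale & Reiter formulation, such as always including the head noun's type attribute):

```python
def incremental_algorithm(target, distractors, preferred_attributes):
    """Simplified Incremental Algorithm for referring expression generation.

    target: dict mapping attribute name -> value for the intended referent.
    distractors: list of such dicts for the other objects in the scene.
    preferred_attributes: attribute names in fixed preference order.
    Returns the attribute-value pairs chosen for the description.
    """
    description = {}
    remaining = list(distractors)
    for attr in preferred_attributes:
        value = target.get(attr)
        if value is None:
            continue
        # Keep this attribute only if it rules out at least one distractor.
        if any(d.get(attr) != value for d in remaining):
            description[attr] = value
            remaining = [d for d in remaining if d.get(attr) == value]
        if not remaining:
            break  # target is uniquely identified
    return description  # may still be ambiguous if distractors remain
```

For example, with a red T-shaped target among a blue T and a red L, the algorithm selects both colour and shape; if colour alone eliminates every distractor, it stops after one attribute, mirroring the Gricean pressure against over-describing.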
An Interactive REG (IREG) model is proposed that can interact with a real REC model, using signals indicating whether the object has been located, together with the visual region returned by the REC model, to gradually refine the REs.
This work introduces a collaborative image ranking task, a grounded agreement game in which players are tasked with reaching agreement on how to rank a set of images given some sorting criterion, through largely unrestricted, role-symmetric dialogue.
This work presents Grounding LMM (GLaMM), the first model that can generate natural language responses seamlessly intertwined with corresponding object segmentation masks, and which is flexible enough to accept both textual and optional visual prompts (regions of interest) as input.