This work learns policies for navigation over long planning horizons from language input by using imitation learning to warm-start the policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning.
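A minimal sketch of this two-phase schedule is given below: behavior cloning on expert demonstrations to warm-start a policy, then on-policy reinforcement learning (plain REINFORCE here). The `policy`, `demos`, and `env` objects and the simplified gym-style interface are assumptions for illustration, not the paper's actual training code.

```python
import torch
import torch.nn.functional as F

# Sketch only: `policy` maps an observation to action logits, `demos` yields
# (obs, expert_action) pairs, and `env` follows a simplified gym-style API.

def warm_start(policy, demos, optimizer, epochs=5):
    """Phase 1: imitation learning (behavior cloning) on expert demonstrations."""
    for _ in range(epochs):
        for obs, expert_action in demos:
            loss = F.cross_entropy(policy(obs), expert_action)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

def reinforce_finetune(policy, env, optimizer, episodes=1000, gamma=0.99):
    """Phase 2: reinforcement learning starting from the warm-started policy."""
    for _ in range(episodes):
        obs, done, log_probs, rewards = env.reset(), False, [], []
        while not done:
            dist = torch.distributions.Categorical(logits=policy(obs))
            action = dist.sample()
            obs, reward, done, _ = env.step(action.item())
            log_probs.append(dist.log_prob(action))
            rewards.append(reward)
        # Discounted returns, then a vanilla policy-gradient update.
        returns, g = [], 0.0
        for r in reversed(rewards):
            g = r + gamma * g
            returns.insert(0, g)
        loss = -(torch.stack(log_probs) * torch.tensor(returns)).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```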
This work proposes NaviLLM, the first generalist model for embodied navigation, which adapts LLMs to embodied navigation by introducing schema-based instruction; it demonstrates strong generalizability and presents impressive results on unseen tasks, e.g., embodied question answering and 3D captioning.
CityEQA, a new task in which an embodied agent answers open-vocabulary questions through active exploration of dynamic city spaces, is introduced, along with Planner-Manager-Actor (PMA), a novel agent tailored for CityEQA that enables long-horizon planning and hierarchical task execution.
The VideoNavQA dataset, which contains pairs of questions and videos generated in the House3D environment, is built and used to establish an initial understanding of how well VQA-style methods can perform within this novel EQA paradigm.
This work presents a generalization of EQA, Multi-Target EQA (MT-EQA), and proposes a modular architecture composed of a program generator, a controller, a navigator, and a VQA module, which together outperform previous methods and strong baselines by a significant margin.
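The following is a hedged sketch of how such a modular pipeline could be composed. The module interfaces, the `EQAPipeline` name, and the simplified environment whose `step()` returns the next observation are all illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

# Illustrative composition of the four modules named above; every interface
# here is a hypothetical stand-in for the real architecture.

@dataclass
class EQAPipeline:
    program_generator: callable   # question -> list of sub-goals (a "program")
    controller: callable          # (sub_goal, obs) -> "navigate" or "answer"
    navigator: callable           # (sub_goal, obs) -> low-level action
    vqa: callable                 # (sub_goal, obs) -> answer

    def answer(self, question, env):
        program = self.program_generator(question)
        obs = env.reset()
        answers = []
        for sub_goal in program:
            # Navigate until the controller decides the target has been reached.
            while self.controller(sub_goal, obs) == "navigate":
                obs = env.step(self.navigator(sub_goal, obs))
            # Query the VQA module on the final observation for this sub-goal.
            answers.append(self.vqa(sub_goal, obs))
        return answers
```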
It is shown through experiments on the EQAv1 dataset that a simple question-only baseline achieves state-of-the-art results on the EmbodiedQA task in all cases except when the agent is spawned extremely close to the object.
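For illustration, a question-only baseline of this kind can be as simple as an LSTM over the question tokens feeding a linear answer classifier, with visual observations ignored entirely. The sketch below is an assumed minimal version, not the exact baseline from that study.

```python
import torch
import torch.nn as nn

# "Question-only" baseline sketch: the answer is predicted from question
# tokens alone, with no visual input. Layer sizes are placeholders.

class QuestionOnlyBaseline(nn.Module):
    def __init__(self, vocab_size, num_answers, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, question_tokens):           # (batch, seq_len) token ids
        _, (h_n, _) = self.encoder(self.embed(question_tokens))
        return self.classifier(h_n[-1])           # (batch, num_answers) logits
```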
The proposed VirtualHome2KG framework augments both synthetic video data of daily activities and the contextual semantic data corresponding to the video contents, based on a proposed event-centric schema and virtual-space simulation results. This makes context-aware data available for analysis and enables applications that have conventionally been difficult to develop due to the insufficient availability of relevant data and semantic information.
AllenAct, a modular and flexible learning framework designed around the unique requirements of Embodied AI research, is introduced; it provides first-class support for a growing collection of embodied environments, tasks, and algorithms.
This work proposes a novel Multimodal Environment Memory (MEM) module, facilitating the integration of embodied control with large models through the visual-language memory of scenes, and introduces the Multimodal Embodied Interactive Agent (MEIA), capable of translating high-level tasks expressed in natural language into a sequence of executable actions.
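As a rough illustration of the memory-augmented pattern described here, the sketch below stores scene observations as (visual embedding, caption) pairs and retrieves them by naive caption overlap before prompting a language model for actions. The `EnvironmentMemory` and `plan_actions` names, the retrieval heuristic, and the `llm` callable are hypothetical stand-ins, not MEIA's actual components.

```python
from dataclasses import dataclass, field

# Hypothetical visual-language scene memory; all interfaces are illustrative.

@dataclass
class MemoryEntry:
    image_embedding: list      # visual feature vector for the scene
    caption: str               # language description of the scene

@dataclass
class EnvironmentMemory:
    entries: list = field(default_factory=list)

    def write(self, image_embedding, caption):
        self.entries.append(MemoryEntry(image_embedding, caption))

    def read(self, query, top_k=3):
        # Naive keyword overlap as a stand-in for embedding similarity.
        scored = sorted(
            self.entries,
            key=lambda e: len(set(query.split()) & set(e.caption.split())),
            reverse=True,
        )
        return scored[:top_k]

def plan_actions(task, memory, llm):
    """Compose a prompt from the task and retrieved scene memory, then ask a
    (hypothetical) language-model wrapper for a sequence of executable actions."""
    context = "\n".join(e.caption for e in memory.read(task))
    return llm(f"Scene memory:\n{context}\nTask: {task}\nActions:")
```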