A baseline model for 3D-QA is proposed, called ScanQA, which learns a fused descriptor from 3D object proposals and encoded sentence embeddings. This descriptor correlates language expressions with the underlying geometric features of the 3D scan and facilitates the regression of 3D bounding boxes to localize the objects described in textual questions.
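The fusion-then-regression idea above can be sketched minimally. This is an illustrative toy, not the ScanQA implementation: the dimensions, weight matrices, and the simple additive-projection fusion are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes -- illustrative only, not the paper's configuration.
N_PROPOSALS, D_OBJ, D_TXT, D_FUSED = 32, 256, 300, 128

def fuse_and_regress(obj_feats, question_emb, w_obj, w_txt, w_box):
    """Fuse per-proposal 3D features with a question embedding,
    then regress a 3D box (center + size) for each proposal."""
    # Project both modalities into a shared space; the question
    # embedding broadcasts across all proposals.
    fused = np.tanh(obj_feats @ w_obj + question_emb @ w_txt)  # (N, D_FUSED)
    boxes = fused @ w_box  # (N, 6): cx, cy, cz, dx, dy, dz
    return fused, boxes

obj_feats = rng.normal(size=(N_PROPOSALS, D_OBJ))   # stand-in proposal features
question_emb = rng.normal(size=(D_TXT,))            # stand-in sentence embedding
w_obj = rng.normal(size=(D_OBJ, D_FUSED)) * 0.05
w_txt = rng.normal(size=(D_TXT, D_FUSED)) * 0.05
w_box = rng.normal(size=(D_FUSED, 6)) * 0.05

fused, boxes = fuse_and_regress(obj_feats, question_emb, w_obj, w_txt, w_box)
print(fused.shape, boxes.shape)  # (32, 128) (32, 6)
```

In the actual model the proposal features would come from a 3D detection backbone and the regression head would be trained end to end; here random weights only demonstrate the data flow.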
This work proposes to inject the 3D world into large language models and introduce a whole new family of 3D-LLMs that can take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks, including captioning, dense captioning, 3D question answering, task decomposition, 3D grounding, 3D-assisted dialog, navigation, and so on.
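Injecting 3D input into an LLM is commonly done by projecting scene features into the model's token-embedding space and prepending them as a soft prefix. The sketch below shows only that interface; the dimensions, projection, and prefix scheme are assumptions for illustration, not the 3D-LLM paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sizes -- illustrative only, not the paper's configuration.
N_SCENE_TOKENS, D_3D, D_LLM, N_TEXT_TOKENS = 64, 512, 768, 10

def build_llm_input(scene_feats, text_token_embs, w_proj):
    """Project 3D scene features into the LLM embedding space and
    prepend them to the text-token embeddings as a soft prefix."""
    scene_tokens = scene_feats @ w_proj  # (N_SCENE_TOKENS, D_LLM)
    return np.concatenate([scene_tokens, text_token_embs], axis=0)

scene_feats = rng.normal(size=(N_SCENE_TOKENS, D_3D))      # stand-in point-cloud features
text_token_embs = rng.normal(size=(N_TEXT_TOKENS, D_LLM))  # stand-in question tokens
w_proj = rng.normal(size=(D_3D, D_LLM)) * 0.02             # learned projection (random here)

llm_input = build_llm_input(scene_feats, text_token_embs, w_proj)
print(llm_input.shape)  # (74, 768)
```

The same prefix sequence can then serve any of the listed tasks (captioning, 3D-QA, grounding), with the task selected by the text prompt rather than by a separate head.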
This work proposes the first generalist model for embodied navigation, NaviLLM, which adapts LLMs to embodied navigation by introducing schema-based instruction. It demonstrates strong generalizability and presents impressive results on unseen tasks, e.g. embodied question answering and 3D captioning.