3260 papers • 126 benchmarks • 313 datasets
Generating textual description for human motion.
(Image credit: Papersgraph)
These leaderboards are used to track progress in Motion Captioning
Use these libraries to find Motion Captioning models and implementations
No subtasks available.
This work proposes MotionGPT, a unified, versatile, and user-friendly motion-language model that handles multiple motion-relevant tasks and achieves state-of-the-art performance on several of them, including text-driven motion generation, motion captioning, motion prediction, and motion in-between.
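As a rough illustration of the unified setup, the sketch below phrases each motion task as a text prompt over a shared vocabulary of text and discrete motion tokens; the `build_prompt` helper and its templates are hypothetical stand-ins for illustration, not MotionGPT's actual interface.

```python
# Minimal sketch of unified task prompting, assuming motion is encoded as
# discrete tokens that share a vocabulary with text (generic illustration;
# build_prompt and its templates are hypothetical, not MotionGPT's API).
def build_prompt(task: str, text: str = "", motion_tokens: list[int] | None = None) -> str:
    motion = " ".join(f"<motion_{t}>" for t in (motion_tokens or []))
    templates = {
        "text2motion": f"Generate a motion matching: {text}",
        "motion2text": f"Describe the motion: {motion}",
        "prediction": f"Continue the motion: {motion}",
        "inbetween": f"Fill in the motion between: {motion}",
    }
    return templates[task]

print(build_prompt("motion2text", motion_tokens=[17, 4, 256]))
# Describe the motion: <motion_17> <motion_4> <motion_256>
```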
This paper explores the generation of 3D human full-body motions from text and its reciprocal task, referred to as text2motion and motion2text, respectively, and proposes the motion token, a discrete and compact motion representation.
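To make the motion-token idea concrete, here is a minimal sketch of vector quantization, where per-frame motion features are mapped to the nearest entry of a learned codebook; this is a generic VQ-style illustration with made-up shapes and a hypothetical `quantize_motion` helper, not the paper's exact tokenizer.

```python
import torch

def quantize_motion(features: torch.Tensor, codebook: torch.Tensor) -> torch.Tensor:
    """Map per-frame motion features to discrete token ids by
    nearest-neighbor lookup in a learned codebook (generic VQ sketch).

    features: (T, D) motion features for T frames
    codebook: (K, D) learned code vectors
    returns:  (T,) integer token ids
    """
    dists = torch.cdist(features, codebook)  # pairwise distances, shape (T, K)
    return dists.argmin(dim=1)

# Hypothetical shapes: 60 frames of 256-dim features, 512-entry codebook.
tokens = quantize_motion(torch.randn(60, 256), torch.randn(512, 256))
print(tokens.shape)  # torch.Size([60])
```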
This paper introduces a novel architecture that improves text generation quality while emphasizing interpretability through spatio-temporal and adaptive attention mechanisms, and proposes methods for guiding attention during training that emphasize relevant skeleton areas over time and distinguish motion-related words.
It is found that the contributions of both the attention mechanism and the encoder architecture additively improve not only the quality of the generated text (BLEU and semantic equivalence) but also its synchronization with the motion.
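One common way to realize such attention guidance during training is an auxiliary loss that pulls the model's attention distribution toward an annotated relevance mask over skeleton joints. The sketch below shows this with a KL-divergence term; it is a generic assumption for illustration, not the paper's exact formulation, and the relevance mask is assumed to be provided.

```python
import torch
import torch.nn.functional as F

def attention_guidance_loss(attn: torch.Tensor, relevance: torch.Tensor) -> torch.Tensor:
    """Auxiliary loss nudging attention toward relevant skeleton joints
    (generic sketch; the relevance annotations are assumed to be given).

    attn:      (T, J) attention weights over J joints, rows sum to 1
    relevance: (T, J) binary mask of joints deemed relevant per frame
    """
    # Normalize the mask into a target distribution per frame.
    target = relevance / relevance.sum(dim=1, keepdim=True).clamp(min=1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(attn.clamp(min=1e-8).log(), target, reduction="batchmean")

# Hypothetical shapes: 60 frames, 22 joints.
attn = torch.softmax(torch.randn(60, 22), dim=1)
relevance = (torch.rand(60, 22) > 0.7).float()
print(attention_guidance_loss(attn, relevance))
```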
Adding a benchmark result helps the community track progress.