3260 papers • 126 benchmarks • 313 datasets
Dense captioning is the task of detecting salient regions or events in an image, video, or 3D scene and describing each of them with a short natural-language phrase.
These leaderboards are used to track progress in Dense Captioning.
Use these libraries to find Dense Captioning models and implementations.
This work proposes to inject the 3D world into large language models, introducing a new family of 3D-LLMs that take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks, including captioning, dense captioning, 3D question answering, task decomposition, 3D grounding, 3D-assisted dialog, and navigation.
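A minimal PyTorch sketch of the core idea (not the paper's code, with hypothetical class names and dimensions): per-point 3D features are linearly projected into the language model's embedding space and prepended to the text-token embeddings.

```python
import torch
import torch.nn as nn

class PointFeatureProjector(nn.Module):
    """Maps per-point 3D features into the LLM embedding space (sketch)."""
    def __init__(self, point_dim=256, llm_dim=4096):  # hypothetical sizes
        super().__init__()
        self.proj = nn.Linear(point_dim, llm_dim)

    def forward(self, point_feats):       # (batch, n_points, point_dim)
        return self.proj(point_feats)     # (batch, n_points, llm_dim)

proj = PointFeatureProjector()
point_tokens = proj(torch.randn(1, 1024, 256))   # 3D "tokens"
text_embeds = torch.randn(1, 32, 4096)           # stand-in for LLM text embeddings
llm_input = torch.cat([point_tokens, text_embeds], dim=1)  # joint input sequence
```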
This work proposes a new model that identifies all events in a single pass of the video while simultaneously describing them in natural language, and introduces a captioning module that uses contextual information from past and future events to jointly describe all events.
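The context idea can be sketched with self-attention over all detected events, so each event's caption conditions on both past and future events. This is an illustration under assumed feature sizes, not the paper's actual module.

```python
import torch
import torch.nn as nn

class EventContextFusion(nn.Module):
    """Self-attention across detected events: each event attends to every
    other event, including future ones, before captioning (sketch)."""
    def __init__(self, dim=512, heads=8):  # hypothetical sizes
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, event_feats):       # (batch, n_events, dim)
        ctx, _ = self.attn(event_feats, event_feats, event_feats)
        return ctx                        # context-enriched event features

fused = EventContextFusion()(torch.randn(2, 5, 512))  # 5 events per video
```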
A model that decomposes both images and paragraphs into their constituent parts is developed, detecting semantic regions in images and using a hierarchical recurrent neural network to reason about language.
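The hierarchical decoding can be sketched as follows, assuming PyTorch and hypothetical dimensions: a sentence-level RNN run over pooled region features emits one topic vector per sentence, and a word-level RNN seeded by each topic decodes that sentence. This is a simplified illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class HierarchicalParagraphRNN(nn.Module):
    """Sentence RNN emits one topic vector per sentence from pooled region
    features; a word RNN seeded by each topic decodes the words (sketch)."""
    def __init__(self, feat_dim=512, hidden=512, vocab=10000, max_sents=6):
        super().__init__()
        self.sent_rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.word_rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.word_out = nn.Linear(hidden, vocab)
        self.max_sents = max_sents

    def forward(self, region_feats, word_embeds):
        # region_feats: (batch, n_regions, feat_dim) from detected regions
        # word_embeds:  (batch, max_sents, n_words, hidden), teacher forcing
        pooled = region_feats.mean(dim=1, keepdim=True)        # crude pooling
        topics, _ = self.sent_rnn(pooled.repeat(1, self.max_sents, 1))
        logits = []
        for s in range(self.max_sents):
            h0 = topics[:, s].unsqueeze(0)       # topic seeds the word RNN
            out, _ = self.word_rnn(word_embeds[:, s], h0)
            logits.append(self.word_out(out))
        return torch.stack(logits, dim=1)  # (batch, max_sents, n_words, vocab)
```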
A Fully Convolutional Localization Network (FCLN) architecture is proposed that processes an image with a single, efficient forward pass, requires no external region proposals, and can be trained end-to-end with a single round of optimization.
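A rough sketch of the single-forward-pass structure, in PyTorch with made-up layer sizes: a convolutional backbone computes a feature map once, region features are pooled from it, and a recurrent head decodes captions. The real FCLN predicts its own region proposals with a differentiable localization layer; here proposals are passed in to keep the sketch short.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class DenseCapSketch(nn.Module):
    """One forward pass: backbone feature map -> pooled region features ->
    recurrent caption decoder, plus a per-location box-regression head."""
    def __init__(self, hidden=512, vocab=10000):  # hypothetical sizes
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 256, 3, stride=2, padding=1), nn.ReLU())
        self.box_head = nn.Conv2d(256, 4, 1)     # per-location box offsets
        self.caption_rnn = nn.LSTM(256, hidden, batch_first=True)
        self.word_out = nn.Linear(hidden, vocab)

    def forward(self, images, boxes):
        # boxes: list of (n_i, 4) tensors; the real FCLN predicts these
        # internally with its differentiable localization layer.
        feats = self.backbone(images)
        regions = roi_align(feats, boxes, output_size=(7, 7), spatial_scale=0.25)
        region_vecs = regions.mean(dim=(2, 3)).unsqueeze(1)   # (n_boxes, 1, 256)
        words, _ = self.caption_rnn(region_vecs)
        return self.box_head(feats), self.word_out(words)
```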
A new model pipeline based on two novel ideas, joint inference and context fusion, is proposed, which achieves state-of-the-art accuracy on Visual Genome for dense captioning with a relative gain of 73% compared to the previous best algorithm.
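Context fusion can be illustrated with a small gated unit that mixes each region descriptor with a global image feature; this is a hypothetical simplification, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ContextFusion(nn.Module):
    """Gated mix of a region descriptor with a global context vector (sketch)."""
    def __init__(self, dim=512):  # hypothetical size
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, region_feat, global_feat):   # both (batch, dim)
        g = torch.sigmoid(self.gate(torch.cat([region_feat, global_feat], dim=-1)))
        return g * region_feat + (1 - g) * global_feat  # fused descriptor
```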
The Joint Event Detection and Description Network (JEDDi-Net) is presented, which encodes the input video stream with three-dimensional convolutional layers, proposes variable-length temporal events based on pooled features, and then uses a two-level hierarchical LSTM module with context modeling to transcribe the event proposals into captions.
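A minimal PyTorch sketch of the encoder side, with hypothetical shapes: 3D convolutions encode the clip, and a proposed temporal segment is pooled from the resulting feature sequence. The proposal network and the two-level hierarchical LSTM captioner are omitted.

```python
import torch
import torch.nn as nn

class C3DEncoder(nn.Module):
    """3D-convolutional clip encoding followed by temporal pooling of one
    proposed segment (sketch; proposal net and caption LSTM omitted)."""
    def __init__(self, dim=256):  # hypothetical size
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(64, dim, kernel_size=3, padding=1), nn.ReLU())

    def forward(self, clip, start, end):
        # clip: (batch, 3, T, H, W); [start, end) is a proposed temporal span
        feats = self.conv(clip).mean(dim=(3, 4))   # spatial pooling -> (batch, dim, T)
        return feats[:, :, start:end].mean(dim=2)  # pooled proposal feature

enc = C3DEncoder()
proposal_feat = enc(torch.randn(1, 3, 16, 64, 64), start=4, end=12)
```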
This technical report briefly describes the submission of a multi-event captioning model to the dense video captioning task of the ActivityNet Challenge 2020, which achieves a 9.28 METEOR score on the test set.
This paper proposes MORE, a Multi-Order RElation mining model, to generate more descriptive and comprehensive captions in 3D dense captioning, outperforming the current state-of-the-art method.
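First-order relation mining of this kind can be sketched as a pairwise MLP over object features; this is a hypothetical module, not MORE's actual architecture, and higher orders would aggregate over triples and beyond.

```python
import torch
import torch.nn as nn

class PairwiseRelationModule(nn.Module):
    """First-order relation mining: an MLP over every ordered pair of object
    features; re-aggregating the result would give higher orders (sketch)."""
    def __init__(self, dim=256):  # hypothetical size
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))

    def forward(self, obj_feats):                  # (n_objects, dim)
        n = obj_feats.size(0)
        a = obj_feats.unsqueeze(1).expand(n, n, -1)
        b = obj_feats.unsqueeze(0).expand(n, n, -1)
        rel = self.mlp(torch.cat([a, b], dim=-1))  # (n, n, dim) pairwise relations
        return obj_feats + rel.mean(dim=1)         # relation-enhanced objects
```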