The benchmark evaluates a generative Video Conversational Model with respect to Temporal Understanding. We curate a test set based on the ActivityNet-200 dataset, whose videos have rich, dense descriptive captions and human-annotated question-answer pairs. We develop an evaluation pipeline using the GPT-3.5 model that assigns a relative score to each generated prediction on a scale of 1 to 5.
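A minimal sketch of how such a GPT-judge scoring pipeline can be wired up. The prompt wording, function names, and score handling here are illustrative assumptions, not the benchmark's exact implementation; the actual call to the GPT-3.5 API is omitted, and only the prompt construction and score parsing are shown.

```python
"""Sketch of a judge-model scoring pipeline for temporal understanding.
Assumed, not the benchmark's exact code: prompt text, helper names,
and the clamping behavior."""
import re


def build_judge_prompt(question, correct_answer, predicted_answer):
    # Ask the judge model to rate the prediction on a 1-5 scale.
    return (
        "Evaluate the temporal understanding of the predicted answer.\n"
        f"Question: {question}\n"
        f"Correct answer: {correct_answer}\n"
        f"Predicted answer: {predicted_answer}\n"
        "Reply with a single integer score from 1 (poor) to 5 (excellent)."
    )


def parse_score(judge_reply, lo=1, hi=5):
    # Extract the first integer in the judge's reply and clamp it
    # to the valid scoring range.
    match = re.search(r"\d+", judge_reply)
    if match is None:
        raise ValueError(f"no score found in: {judge_reply!r}")
    return min(hi, max(lo, int(match.group())))


def mean_score(judge_replies):
    # Averaging the per-pair scores yields a benchmark-level number.
    scores = [parse_score(r) for r in judge_replies]
    return sum(scores) / len(scores)
```

In practice the reply string would come from the judge model (e.g. a chat-completion call with `build_judge_prompt(...)` as the user message); parsing defensively and clamping guards against replies that stray from the requested format.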