Audio-visual zero-shot learning aims to recognize classes that were unseen during training, based on paired audio-visual sequences.
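As a concrete illustration, the sketch below shows the common embedding-matching recipe for this task: fuse per-clip audio and visual features into a joint space and assign each clip to the nearest class-label (text) embedding. The module names, feature dimensions, and the concatenation-based fusion are illustrative assumptions, not any particular published method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AVZeroShotClassifier(nn.Module):
    """Fuse audio and visual clip features into a joint space and match
    them against text embeddings of (unseen) class names."""

    def __init__(self, audio_dim=128, visual_dim=512, embed_dim=300):
        super().__init__()
        # Learned projection from concatenated A/V features into the
        # class-embedding space (dimensions are illustrative).
        self.proj = nn.Linear(audio_dim + visual_dim, embed_dim)

    def forward(self, audio_feat, visual_feat, class_embeds):
        fused = torch.cat([audio_feat, visual_feat], dim=-1)  # (B, Da+Dv)
        z = F.normalize(self.proj(fused), dim=-1)             # (B, D)
        w = F.normalize(class_embeds, dim=-1)                 # (C, D)
        return z @ w.t()                                      # cosine logits

# Toy usage: 4 clips, 10 unseen classes, random stand-in features.
model = AVZeroShotClassifier()
logits = model(torch.randn(4, 128), torch.randn(4, 512), torch.randn(10, 300))
preds = logits.argmax(dim=-1)  # index of the nearest class embedding
```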
The proposed framework, which ingests temporal features, is shown to yield state-of-the-art performance on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks for (generalised) zero-shot learning.
This paper first proposes to exploit the knowledge contained in large language models, generating numerous descriptive sentences that capture the distinguishing audio-visual features of event classes and thereby improving the understanding of unseen categories. It further proposes a knowledge-aware adaptive margin loss that helps to separate similar events.
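To make the margin idea concrete, here is a hedged sketch of an adaptive margin loss in PyTorch: the margin applied against each negative class grows with the similarity between that class's (LLM-generated) description embedding and the true class's, so that easily confused events must be separated by a larger gap. The function name, the additive-margin formulation, and the similarity scaling are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def adaptive_margin_loss(logits, targets, desc_embeds, base_margin=0.2):
    """Cross-entropy with a per-class-pair additive margin that grows
    with the semantic similarity of the class descriptions.

    logits:      (B, C) sample-to-class similarity scores
    targets:     (B,)   ground-truth class indices
    desc_embeds: (C, D) embeddings of the class description sentences
    """
    w = F.normalize(desc_embeds, dim=-1)
    class_sim = (w @ w.t()).clamp(min=0.0)      # (C, C) class similarity
    margins = base_margin * class_sim[targets]  # (B, C) per-pair margins
    # No margin against the true class itself.
    mask = F.one_hot(targets, num_classes=logits.size(1)).bool()
    margins = margins.masked_fill(mask, 0.0)
    # Inflating negative logits by their margin forces the true class to
    # win by a similarity-dependent gap.
    return F.cross_entropy(logits + margins, targets)

# Toy usage: 4 samples, 10 classes, 300-d description embeddings.
loss = adaptive_margin_loss(torch.randn(4, 10), torch.randint(0, 10, (4,)),
                            torch.randn(10, 300))
```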