The goal of the multi-label classification task is to predict the set of labels present in an image. As an extension of zero-shot learning (ZSL), multi-label zero-shot learning (ML-ZSL) aims to identify multiple seen and unseen labels in an image.
These leaderboards are used to track progress in Multi-Label Zero-Shot Learning.
Use these libraries to find Multi-Label Zero-Shot Learning models and implementations.
This work proposes to view attribute-based image classification as a label-embedding problem: each class is embedded in the space of attribute vectors, and a compatibility function is introduced that measures how well an image matches a label embedding.
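As a minimal sketch of that idea, the snippet below scores an image against every class via a bilinear compatibility F(x, y) = θ(x)ᵀ W φ(y); the feature dimensions and random tensors are illustrative assumptions, not the paper's exact setup.

```python
import torch

img_dim, attr_dim = 512, 85                 # assumed feature sizes
W = torch.randn(img_dim, attr_dim, requires_grad=True)  # learnable bilinear map

def compatibility(theta_x, phi_Y):
    """Score one image embedding against every class attribute vector.
    theta_x: (img_dim,) image feature; phi_Y: (num_classes, attr_dim)."""
    return phi_Y @ (W.t() @ theta_x)        # (num_classes,) scores

theta_x = torch.randn(img_dim)              # e.g. a CNN image feature
phi_Y = torch.randn(10, attr_dim)           # attribute vectors for 10 classes
pred = compatibility(theta_x, phi_Y).argmax().item()  # most compatible class
```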
This work proposes a novel deep learning architecture for multi-label zero-shot learning (ML-ZSL) that predicts multiple unseen class labels for each input instance, together with a framework that incorporates knowledge graphs to describe the relationships between labels.
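A rough sketch of the knowledge-graph ingredient, assuming a row-normalized label-relation matrix and a simple recurrent propagation rule; both are stand-ins rather than the paper's actual graph or update.

```python
import torch

num_labels, d = 6, 32
A = torch.rand(num_labels, num_labels)      # assumed label-relation adjacency
A = A / A.sum(dim=1, keepdim=True)          # row-normalize
U = torch.randn(d, d)                       # stand-in propagation weights

states = torch.randn(num_labels, d)         # initial per-label states
for _ in range(3):                          # a few propagation steps
    states = torch.tanh(A @ states @ U)     # each label absorbs its neighbors

logits = states @ torch.randn(d)            # illustrative per-label logits
```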
This work investigates zero-shot learning in the music domain and organizes two setups of side information: human-labeled attribute information based on the Free Music Archive and OpenMIC-2018 datasets, and general word semantic information from the Million Song Dataset and Last.fm tag annotations.
In this work, we develop a shared multi-attention model for multi-label zero-shot learning. We argue that designing an attention mechanism for recognizing multiple seen and unseen labels in an image is a non-trivial task, as there is no training signal to localize unseen labels, and an image contains only a few present labels that need attention out of thousands of possible labels. Therefore, instead of generating attentions for unseen labels, which have unknown behaviors and could focus on irrelevant regions due to the lack of any training sample, we let the unseen labels select among a set of shared attentions that are trained to be label-agnostic and to focus only on relevant/foreground regions through our novel loss. Finally, we learn a compatibility function to distinguish labels based on the selected attention. We further propose a novel loss function with three components that guide the attention to focus on diverse and relevant image regions while utilizing all attention features. Through extensive experiments, we show that our method improves the state of the art by 2.9% and 1.4% F1 score on the NUS-WIDE and the large-scale Open Images datasets, respectively.
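A minimal sketch of the shared-attention scoring described above, assuming a small set of shared attention queries over CNN region features; the dimensions and the softmax attention form are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

regions = torch.randn(49, 512)            # 7x7 region features from a CNN
label_emb = torch.randn(1000, 512)        # word embeddings for all labels
queries = torch.randn(10, 512)            # 10 shared, label-agnostic queries

attn = F.softmax(queries @ regions.t(), dim=1)   # (10, 49) attention maps
attended = attn @ regions                        # (10, 512) shared features

# Compatibility of every label with every shared attention feature;
# each label (seen or unseen) selects the attention that suits it best.
scores = (label_emb @ attended.t()).max(dim=1).values  # (1000,) label scores
```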
We study the problem of multi-label zero-shot recognition in which labels take the form of human-object interactions (combinations of actions on objects), each image may contain multiple interactions, and some interactions have no training images. We propose a novel compositional learning framework that decouples interaction labels into separate action and object scores incorporating the spatial compatibility between the two components. We combine these scores to efficiently recognize seen and unseen interactions. However, learning action-object spatial relations, in principle, requires bounding-box annotations, which are costly to gather. Moreover, it is not clear how to generalize spatial relations to unseen interactions. We address these challenges by developing a cross-attention mechanism that localizes objects from action locations and vice versa by predicting the displacements between them, referred to as relational directions. During training, we estimate the relational directions as those maximizing the scores of ground-truth interactions, which guides predictions toward compatible action-object regions. Through extensive experiments, we show the effectiveness of our framework, improving the state of the art by 2.6% mAP score and 5.8% recall score on the HICO and Visual Genome datasets, respectively.
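A hedged sketch of the compositional scoring idea, assuming additive fusion of per-action, per-object, and spatial-compatibility terms; the fusion rule and vocabulary sizes are assumptions, not the paper's exact formulation.

```python
import torch

num_actions, num_objects = 117, 80          # e.g. a HICO-style vocabulary
action_scores = torch.randn(num_actions)    # per-action logits for an image
object_scores = torch.randn(num_objects)    # per-object logits
spatial_compat = torch.randn(num_actions, num_objects)  # assumed spatial term

# Every (action, object) pair gets a composed score, so unseen combinations
# of seen actions and seen objects can be recognized without training images.
interaction = action_scores[:, None] + object_scores[None, :] + spatial_compat
a, o = divmod(interaction.flatten().argmax().item(), num_objects)
```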
This work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting with a cross-level fusion-based generative approach, which outperforms the state of the art on three zero-shot benchmarks: NUS-WIDE, Open Images, and MS COCO.
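A minimal sketch of generative feature synthesis for unseen labels, with a stand-in conditional generator rather than the paper's cross-level fusion model; all layer sizes are assumptions.

```python
import torch
import torch.nn as nn

# Stand-in conditional generator: label embedding + noise -> visual feature.
gen = nn.Sequential(nn.Linear(300 + 64, 1024), nn.ReLU(),
                    nn.Linear(1024, 2048))

unseen_emb = torch.randn(20, 300)           # embeddings of 20 unseen labels
noise = torch.randn(20, 64)
fake_feats = gen(torch.cat([unseen_emb, noise], dim=1))  # (20, 2048)
# fake_feats can then supervise an ordinary classifier for unseen labels.
```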
This study introduces end-to-end model training for multi-label zero-shot learning that supports the semantic diversity of images and labels, and proposes an embedding matrix whose principal embedding vectors are trained using a tailored loss function.
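One possible reading of the principal-embedding idea, sketched below with a few projection heads per image and max-fusion over heads; both choices are assumptions rather than the paper's construction.

```python
import torch

img_feat = torch.randn(2048)                    # global image feature
proj = torch.randn(4, 300, 2048)                # 4 principal embedding heads
principal = proj @ img_feat                     # (4, 300) principal vectors

label_emb = torch.randn(1000, 300)              # label word embeddings
scores = (label_emb @ principal.t()).max(dim=1).values  # best head per label
```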
This work proposes an alternative approach to region-based, discriminability-preserving multi-label zero-shot classification that maintains spatial resolution to preserve region-level characteristics and uses a bi-level attention module (BiAM) to enrich the features by incorporating both region and scene context information.
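A rough sketch of bi-level feature enrichment, combining region-level self-attention with pooled scene context before per-label scoring; the dimensions and the additive fusion are assumptions, not BiAM's exact design.

```python
import torch
import torch.nn.functional as F

regions = torch.randn(196, 512)                 # 14x14 grid of region features
attn = F.softmax(regions @ regions.t() / 512 ** 0.5, dim=1)
region_ctx = attn @ regions                     # region-contextualized features
scene_ctx = regions.mean(dim=0, keepdim=True)   # pooled scene context

enriched = regions + region_ctx + scene_ctx     # fuse both context levels
label_emb = torch.randn(1000, 512)
scores = (label_emb @ enriched.t()).max(dim=1).values   # per-label scores
```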
Adding a benchmark result helps the community track progress.