Affordance Transfer Learning for Human-Object Interaction Detection (2021-04-07)

TL;DR

An affordance transfer learning approach is introduced to jointly detect HOIs with novel objects and recognize affordances; it can also infer the affordances of novel objects from known affordance representations.

Abstract

Reasoning about human-object interactions (HOIs) is essential for deeper scene understanding, and object affordances (or functionalities) are of great importance for humans to discover unseen HOIs with novel objects. Inspired by this, we introduce an affordance transfer learning approach to jointly detect HOIs with novel objects and recognize affordances. Specifically, HOI representations can be decoupled into a combination of affordance and object representations, making it possible to compose novel interactions by combining affordance representations with novel object representations from additional images, i.e., transferring the affordance to novel objects. With the proposed affordance transfer learning, the model is also capable of inferring the affordances of novel objects from known affordance representations. The proposed method can thus be used to 1) improve the performance of HOI detection, especially for HOIs with unseen objects; and 2) infer the affordances of novel objects. Experimental results on two datasets, HICO-DET and HOI-COCO (from V-COCO), demonstrate significant improvements over recent state-of-the-art methods for HOI detection and object affordance detection. Code is available at https://github.com/zhihou7/HOI-CL.
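
The decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that); it only assumes that an HOI feature is the concatenation of an affordance (verb) feature and an object feature. The function names `compose_hoi_features` and `infer_affordances`, the `classifier` callable, and the averaging rule are all hypothetical choices made for illustration.

```python
import numpy as np

def compose_hoi_features(affordance_feats, object_feats):
    """Pair every affordance feature with every object feature.

    Assumption: an HOI feature is the concatenation [affordance, object].
    Returns an array of shape (n_aff * n_obj, d_aff + d_obj), ordered so
    that all objects for affordance 0 come first, then affordance 1, etc.
    """
    n_aff, n_obj = affordance_feats.shape[0], object_feats.shape[0]
    aff = np.repeat(affordance_feats, n_obj, axis=0)
    obj = np.tile(object_feats, (n_aff, 1))
    return np.concatenate([aff, obj], axis=1)

def infer_affordances(object_feat, affordance_bank, bank_verbs, classifier, num_verbs):
    """Score each known affordance for a single (possibly novel) object.

    Each stored affordance feature is combined with the object feature and
    passed to a hypothetical HOI classifier returning per-verb probabilities
    of shape (n, num_verbs). For every verb, the probability the classifier
    assigns to that verb is averaged over the bank entries labeled with it.
    """
    composed = compose_hoi_features(affordance_bank, object_feat[None, :])
    probs = classifier(composed)
    scores = np.zeros(num_verbs)
    counts = np.zeros(num_verbs)
    for p, v in zip(probs, bank_verbs):
        scores[v] += p[v]
        counts[v] += 1
    # Verbs with no bank entry keep a score of 0.
    return scores / np.maximum(counts, 1)
```

In this sketch, the object features may come from additional images that carry no HOI annotation, which is the sense in which an affordance is "transferred" to a novel object; how the composed features are labeled and trained on is left out here.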

Authors

Y. Qiao

D. Tao

Baosheng Yu

Figure 2. Examples of non-COCO classes (flower, American football, durian, baozi) and their affordances (eatable, carriable, ...).

Comparison of object affordance recognition with HOI network among different datasets. Val2017 is the validation
