The goal of video instance segmentation (VIS) is the simultaneous detection, segmentation and tracking of object instances in videos. In other words, it extends the image instance segmentation problem to the video domain for the first time. To facilitate research on this task, a large-scale benchmark called YouTube-VIS was built, consisting of 2,883 high-resolution YouTube videos, a 40-category label set and 131k high-quality instance masks.
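Because a predicted instance is a whole track (a mask in every frame), the YouTube-VIS evaluation computes IoU over the full video rather than per frame. A minimal sketch of that spatio-temporal IoU, assuming tracks are represented as frame-aligned lists of binary masks (this representation is illustrative, not the benchmark's actual file format):

```python
def video_iou(pred_masks, gt_masks):
    """Spatio-temporal IoU between two instance tracks.

    Each track is a list of per-frame binary masks (2-D lists of 0/1),
    aligned frame by frame; an all-zero mask means the instance is
    absent in that frame. Intersection and union are accumulated over
    every frame before dividing, so a track that loses the object in
    some frames is penalized accordingly.
    """
    inter = union = 0
    for pm, gm in zip(pred_masks, gt_masks):
        for prow, grow in zip(pm, gm):
            for p, g in zip(prow, grow):
                inter += p and g
                union += p or g
    return inter / union if union else 0.0
```

With this video-level IoU in place of image IoU, average precision can be computed exactly as in image instance segmentation.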
These leaderboards are used to track progress in Video Instance Segmentation
Use these libraries to find Video Instance Segmentation models and implementations
This paper integrates appearance information to improve the performance of SORT and reduces the number of identity switches, achieving overall competitive performance at high frame rates.
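The core idea of adding appearance information to SORT-style tracking is to score each track-detection pair by a weighted mix of spatial overlap and feature similarity, then associate greedily. A minimal sketch of that association step; the weight `w` and threshold `thresh` are illustrative hyper-parameters, not values from the paper, and the greedy loop stands in for the Hungarian matching a full tracker would use:

```python
import math

def cosine(a, b):
    """Cosine similarity between two appearance feature vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = ((a[2] - a[0]) * (a[3] - a[1])
            + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / area if area else 0.0

def associate(tracks, detections, w=0.5, thresh=0.3):
    """Greedily match tracks to detections.

    tracks / detections: lists of (box, feature) pairs.
    Score = w * IoU + (1 - w) * appearance similarity; pairs below
    `thresh` are left unmatched. Returns (track_idx, det_idx) pairs.
    """
    scores = sorted(((w * box_iou(tb, db) + (1 - w) * cosine(tf, df), ti, di)
                     for ti, (tb, tf) in enumerate(tracks)
                     for di, (db, df) in enumerate(detections)),
                    reverse=True)
    used_t, used_d, pairs = set(), set(), []
    for s, ti, di in scores:
        if s < thresh:
            break  # scores are sorted, so all remaining pairs fail too
        if ti in used_t or di in used_d:
            continue
        used_t.add(ti)
        used_d.add(di)
        pairs.append((ti, di))
    return pairs
```

The appearance term is what reduces identity switches: when two objects cross and their boxes overlap, the feature similarity still distinguishes them.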
The image instance segmentation problem is extended to the video domain for the first time, and a novel algorithm called MaskTrack R-CNN is proposed for this task of simultaneous detection, segmentation and tracking of instances in videos.
For the first time, it is demonstrated that a simple end-to-end query based framework can achieve the state-of-the-art performance in various instance-level recognition tasks.
It is found Mask2Former achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline, and is also capable of handling video semantic and panoptic segmentation.
The proposed Temporally Efficient Vision Transformer (TeViT) is nearly convolution-free: it contains a transformer backbone and a query-based video instance segmentation head, fully utilizes both frame-level and instance-level temporal context information, and obtains strong temporal modeling capacity with negligible extra computational cost.
A new video instance segmentation framework built upon Transformers, termed VisTR, views the VIS task as a direct end-to-end parallel sequence decoding/prediction problem; it achieves the highest speed among all existing VIS models and the best result among single-model methods on the YouTube-VIS dataset.
A simple plug-and-play module that performs temporal feature calibration to complement missing object cues caused by occlusion is presented, and a remarkable AP improvement on the OVIS dataset is obtained.
It is shown that learning additional invariances -- through the use of multi-scale cropping, stronger augmentations and nearest neighbors -- improves the representations and it is observed that MoCo learns spatially structured representations when trained with a multi-crop strategy.
This report introduces a two-step "detect-then-match" video instance segmentation method that achieves the first place in the UVO 2021 Video-based Open-World Segmentation Challenge.
D2 Conv3D is proposed: a novel type of convolution which draws inspiration from dilated and deformable convolutions, extends them to the 3D (spatio-temporal) domain, and can be used to improve the performance of multiple 3D CNN architectures across several video segmentation benchmarks.