A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation (2021-01-01T00:00:00.000000Z)

TL;DR

The paper proposes UViT, a simple and compact ViT architecture that achieves strong performance on COCO object detection and instance segmentation, together with a scaling rule that optimizes the model's trade-off between accuracy and computation cost / model size.

Authors

Xiaohua Zhai

Wuyang Chen

Lucas Beyer

Xianzhi Du

Fan Yang

Tsung-Yi Lin

Huizhong Chen

Jing Li

Xiaodan Song

Zhangyang Wang

Denny Zhou
