A Simple Single-Scale Vision Transformer for Object Localization and Instance Segmentation (2021-01-01T00:00:00.000000Z)