VinVL: Revisiting Visual Representations in Vision-Language Models (2021-06-01T00:00:00.000000Z)