Simple Open-Vocabulary Object Detection with Vision Transformers - Citation Graph | Papersgraph