Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models (2023-12-11T00:00:00.000000Z)