Video-LLaVA: Learning United Visual Representation by Alignment Before Projection (2023-11-16)