Videos, Titles and Comments
Introduced in VTC: Improving Video-Text Retrieval with User Comments (2022)
VTC is a large-scale multimodal dataset of roughly 300k video-caption pairs, together with user comments, that can be used for multimodal representation learning.