This work proposes BriVL, a two-tower pre-training model within the cross-modal contrastive learning framework. By building a large queue-based dictionary, BriVL incorporates more negative samples with limited GPU resources and outperforms both UNITER and OpenAI CLIP on various downstream tasks.
Multi-modal pre-training models have been intensively explored to bridge vision and language in recent years. However, most of them explicitly model the cross-modal interaction between image-text pairs, assuming that a strong semantic correlation exists between the text and image modalities. Since this strong assumption is often invalid in real-world scenarios, we choose to implicitly model the cross-modal correlation for large-scale multi-modal pre-training, which is the focus of the Chinese project `WenLan' led by our team. Specifically, under the weak correlation assumption over image-text pairs, we propose a two-tower pre-training model called BriVL within the cross-modal contrastive learning framework. Unlike OpenAI CLIP, which adopts a simple contrastive learning method, we devise a more advanced algorithm by adapting the latest method, MoCo, to the cross-modal scenario. By building a large queue-based dictionary, our BriVL can incorporate more negative samples with limited GPU resources. We further construct a large Chinese multi-source image-text dataset called RUC-CAS-WenLan for pre-training our BriVL model. Extensive experiments demonstrate that the pre-trained BriVL model outperforms both UNITER and OpenAI CLIP on various downstream tasks.
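The queue-based negative sampling described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the function names, embedding dimensions, and temperature value are illustrative assumptions. It shows the core idea of a MoCo-style cross-modal InfoNCE loss, where each image embedding is contrasted against its paired text embedding (positive) and a FIFO queue of text embeddings from earlier batches (negatives), so the effective number of negatives is decoupled from the GPU batch size.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def cross_modal_info_nce(img_emb, txt_emb, queue, temperature=0.07):
    """Image-to-text InfoNCE loss with queued negatives (illustrative sketch).

    img_emb: (B, D) image-tower outputs
    txt_emb: (B, D) paired text-tower outputs (positives)
    queue:   (K, D) text embeddings from previous batches (negatives)
    """
    img = l2_normalize(img_emb)
    txt = l2_normalize(txt_emb)
    neg = l2_normalize(queue)
    # Positive similarity: one per pair, shape (B, 1).
    pos_logits = np.sum(img * txt, axis=1, keepdims=True)
    # Negative similarities against the whole queue, shape (B, K).
    neg_logits = img @ neg.T
    logits = np.concatenate([pos_logits, neg_logits], axis=1) / temperature
    # Numerically stable log-softmax; the positive sits at column 0.
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[:, 0].mean()

def update_queue(queue, new_keys, ptr):
    """Overwrite the oldest entries of the FIFO queue with new text embeddings."""
    k, cap = new_keys.shape[0], queue.shape[0]
    idx = (ptr + np.arange(k)) % cap
    queue[idx] = new_keys
    return queue, (ptr + k) % cap
```

After each training step, the current batch's text embeddings are enqueued (and the oldest dequeued), which is how a large dictionary of negatives is maintained without holding all of them in a single batch.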
Jinming Zhao, Zhicheng Dou, Wayne Xin Zhao, Ji-rong Wen, Yanyan Lan, Xin Hong, Yuqing Song, Shizhe Chen, Qin Jin, Yuchong Sun, Ruihua Song, Anwen Hu, Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Yida Zhao, Junyi Li, Manli Zhang, Guangzhen Liu, Yizhao Gao, Jing Wen, Baogui Xu, Weihao Zheng, Zongzheng Xi, Liang Zhang, Wanqing Cui, Danyang Hou, Yingyan Li, Peiyu Liu, Zheng Gong, Chu Jin