This paper introduces ELECTRA-style tasks to cross-lingual language model pre-training and pre-trains a model, named XLM-E, on both multilingual and parallel corpora; XLM-E outperforms baseline models on various cross-lingual understanding tasks with much lower computation cost.
In this paper, we introduce ELECTRA-style tasks to cross-lingual language model pre-training. Specifically, we present two pre-training tasks, namely multilingual replaced token detection and translation replaced token detection. In addition, we pre-train the model, named XLM-E, on both multilingual and parallel corpora. Our model outperforms the baseline models on various cross-lingual understanding tasks with much lower computation cost. Moreover, our analysis shows that XLM-E tends to obtain better cross-lingual transferability.
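To illustrate the ELECTRA-style replaced token detection objective underlying both pre-training tasks, the following is a minimal, simplified sketch in PyTorch. The module and function names (`TinyEncoder`, `replaced_token_detection_loss`) and all hyperparameters are hypothetical and are not the actual XLM-E implementation; in particular, the generator's own masked-LM loss and the multilingual/translation data pipelines are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical, simplified sketch of ELECTRA-style replaced token detection (RTD).
# Shapes, sizes, and module names are assumptions, not the XLM-E implementation.

VOCAB_SIZE, HIDDEN, MAX_LEN = 1000, 64, 32


class TinyEncoder(nn.Module):
    """Toy Transformer encoder standing in for the generator/discriminator."""

    def __init__(self, out_dim):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, HIDDEN)
        layer = nn.TransformerEncoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(HIDDEN, out_dim)

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids)))


def replaced_token_detection_loss(generator, discriminator, ids, mask_prob=0.15):
    """Corrupt `ids` with generator samples, then train the discriminator to
    predict, for every position, whether the token was replaced."""
    # 1. Choose positions to corrupt.
    mask = torch.rand(ids.shape) < mask_prob
    # 2. Sample replacement tokens from the generator's output distribution.
    with torch.no_grad():
        logits = generator(ids)                      # (batch, len, vocab)
        sampled = torch.distributions.Categorical(logits=logits).sample()
    corrupted = torch.where(mask, sampled, ids)
    # 3. A position counts as "replaced" only if the sampled token differs.
    labels = (corrupted != ids).float()
    # 4. Token-level binary classification: replaced vs. original.
    scores = discriminator(corrupted).squeeze(-1)    # (batch, len)
    return F.binary_cross_entropy_with_logits(scores, labels)


if __name__ == "__main__":
    gen = TinyEncoder(out_dim=VOCAB_SIZE)   # small masked-LM generator
    disc = TinyEncoder(out_dim=1)           # token-level binary discriminator
    # For translation RTD, `batch` would hold a concatenated translation pair
    # instead of a monolingual sentence.
    batch = torch.randint(0, VOCAB_SIZE, (2, MAX_LEN))
    loss = replaced_token_detection_loss(gen, disc, batch)
    loss.backward()
    print(f"RTD loss: {loss.item():.4f}")
```

Because the discriminator only solves a per-token binary task rather than predicting over the full vocabulary, this style of objective is what allows pre-training with much lower computation cost than masked language modeling.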
Zewen Chi, Furu Wei, Shuming Ma, Saksham Singhal, Payal Bajaj