This paper proposes a new multilingual language model benchmark covering 40+ languages across several scripts and linguistic families, totaling around 40 billion characters, and introduces the task of multilingual causal language modeling.
Authors
Rami Al-Rfou
7 papers
Zihang Dai
5 papers
Mandy Guo
3 papers
Denny Vrandečić
1 paper
Field of Study
Computer Science
Venue Information
Name
International Conference on Language Resources and Evaluation