Generation, Evaluation, and Metrics
Introduced in "The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics" (2021)
Generation, Evaluation, and Metrics (GEM) is a benchmark environment for Natural Language Generation with a focus on its Evaluation, both through human annotations and automated Metrics.
GEM aims to measure progress in NLG across many tasks and languages, to audit the data and models it includes, and to develop standards for evaluating generated text with both automated and human metrics. It is our goal to regularly update GEM and to encourage more inclusive practices in dataset development, by extending existing data or developing datasets for additional languages.
Source: https://gem-benchmark.com/
Image Source: Gehrmann et al.