Introduced in TLDR9+: A Large Scale Resource for Extreme Summarization of Social Media Posts (2021)
TLDR9+ is a large-scale summarization dataset containing over 9 million training instances extracted from the Reddit discussion forum. The dataset was gathered specifically for extreme summarization (i.e., generating a one-sentence summary with high compression and abstraction) and is more than twice as large as the previously proposed dataset. With the help of human annotations, a more fine-grained dataset, called TLDRHQ, is distilled by sampling high-quality instances from TLDR9+.
Image source: https://arxiv.org/pdf/2110.01159v1.pdf