The Semantic Scholar corpus (S2) consists of the titles of scientific papers published in machine learning conferences and journals from 1985 to 2017. The titles are split by publication year, giving 33 timesteps.
Source: Learning Dynamic Author Representations with Temporal Language Models

Image Source: [http://s2-public-api-prod.us-west-2.elasticbeanstalk.com/corpus/](http://s2-public-api-prod.us-west-2.elasticbeanstalk.com/corpus/)
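
The yearly split can be illustrated with a minimal sketch. It assumes each record is a `(title, year)` pair; the helper names and the sample records are hypothetical and are not part of the corpus or its tooling. Only the 1985 to 2017 range comes from the description above.

```python
from collections import defaultdict

# Year range stated in the corpus description.
FIRST_YEAR, LAST_YEAR = 1985, 2017
NUM_TIMESTEPS = LAST_YEAR - FIRST_YEAR + 1  # one timestep per year, inclusive


def year_to_timestep(year):
    """Map a publication year to a zero-based timestep index."""
    if not FIRST_YEAR <= year <= LAST_YEAR:
        raise ValueError(f"year {year} is outside {FIRST_YEAR}-{LAST_YEAR}")
    return year - FIRST_YEAR


def split_by_year(records):
    """Group (title, year) pairs into a dict keyed by timestep index."""
    buckets = defaultdict(list)
    for title, year in records:
        buckets[year_to_timestep(year)].append(title)
    return dict(buckets)


# Hypothetical records, for illustration only (not real corpus entries).
sample = [
    ("Placeholder title A", 1985),
    ("Placeholder title B", 2017),
    ("Placeholder title C", 2017),
]
by_timestep = split_by_year(sample)
```

Zero-based indices are a choice made for this sketch; the source does not say how timesteps are numbered.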