LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting
Introduced in LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting2023
In this work, we propose LargeST as a new benchmark dataset (see Figure 1), with the goal of facilitating the development of accurate and efficient methods in the context of large-scale traffic forecasting. The distinguishing characteristic of LargeST lies not only in its extensive graph size, encompassing a total of 8,600 sensors in California, but also in its substantial temporal coverage and rich node information – each sensor contains 5 years of data and comprehensive metadata.
LargeST comprises four sub-datasets, each characterized by a different number of sensors. The biggest one is California (CA), including a total number of 8,600 sensors. We also construct three subsets of CA by selecting three representative areas within CA and forming the sub-datasets of Greater Los Angeles (GLA), Greater Bay Area (GBA), and San Diego (SD).