Longest6 is an evaluation benchmark for sensorimotor autonomous driving methods using the CARLA 0.9.10.1 simulator. It consists of 36 long routes in the publicly available Towns 01-06, which are populated with the maximum traffic density. The benchmark tests Level 4 driving capabilities; methods are therefore allowed to train on data from the evaluation towns. Evaluation metrics follow the standard metrics of the CARLA leaderboard 1.0, except that stop sign infractions are not considered.
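The leaderboard 1.0 metric aggregates route completion with a multiplicative infraction penalty. The sketch below illustrates this scoring scheme; the coefficient values follow the leaderboard 1.0 convention, with the stop sign coefficient omitted since Longest6 does not count that infraction, and should be treated as illustrative rather than a reimplementation of the official evaluator.

```python
# Illustrative sketch of the CARLA leaderboard 1.0 driving score:
#   driving_score = route_completion * infraction_penalty
# where the penalty multiplies in one coefficient per infraction.

# Penalty coefficient applied once per infraction of each type
# (leaderboard 1.0 convention; stop signs excluded on Longest6).
PENALTIES = {
    "collision_pedestrian": 0.50,
    "collision_vehicle": 0.60,
    "collision_static": 0.65,
    "red_light": 0.70,
}

def driving_score(route_completion: float, infractions: dict) -> float:
    """route_completion in [0, 1]; infractions maps type -> count."""
    penalty = 1.0
    for kind, count in infractions.items():
        penalty *= PENALTIES[kind] ** count
    return route_completion * penalty
```

For example, completing 80% of a route with one vehicle collision yields a score of 0.8 × 0.6 = 0.48.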
This work learns an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach, factorizing the dynamics into a non-reactive world model and a low-dimensional and compact forward model of the ego-vehicle.
This work proposes TransFuser, a mechanism to integrate image and LiDAR representations using self-attention, which outperforms all prior work on the CARLA leaderboard in terms of driving score by a large margin.
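The idea of fusing modalities with self-attention can be illustrated by running scaled dot-product attention over concatenated image and LiDAR feature tokens. This is a simplified NumPy sketch, not the paper's architecture (TransFuser uses learned multi-head transformer blocks at several feature resolutions); the token shapes are arbitrary toy values.

```python
import numpy as np

def self_attention(tokens: np.ndarray) -> np.ndarray:
    """Single-head scaled dot-product self-attention, without learned
    Q/K/V projections (illustration only; real models learn them)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)           # (N, N) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ tokens                           # each token attends to all

# Toy fusion: 4 image tokens and 4 LiDAR tokens with 8-dim features.
rng = np.random.default_rng(0)
image_tokens = rng.normal(size=(4, 8))
lidar_tokens = rng.normal(size=(4, 8))
fused = self_attention(np.concatenate([image_tokens, lidar_tokens]))  # (8, 8)
```

Because the two modalities share one token sequence, every image token can attend to every LiDAR token and vice versa, which is the core of the fusion mechanism.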
NEAT is a novel representation that enables end-to-end imitation learning models to selectively attend to relevant regions in the input while ignoring information irrelevant to the driving task, effectively associating the images with the BEV representation.
A system to train driving policies from experiences collected not just from the ego-vehicle, but all vehicles that it observes, which outperforms all prior methods on the public CARLA Leaderboard by a wide margin.
An integrated approach that has two branches, for trajectory planning and direct control respectively, and involves a novel multi-step prediction scheme so that the relationship between current actions and future states can be reasoned about.
A safety-enhanced autonomous driving framework, named Interpretable Sensor Fusion Transformer (InterFuser), that fully processes and fuses information from multi-modal, multi-view sensors to achieve comprehensive scene understanding and adversarial event detection.
A novel knowledge distillation framework for effectively teaching a sensorimotor student agent to drive under the supervision of a privileged teacher agent, with a student designed to align its input features with the teacher's privileged Bird's Eye View (BEV) space.
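A distillation objective of this kind can be sketched as an imitation term plus a feature-alignment term. The function below is a hypothetical minimal version with made-up names and an MSE alignment loss; the actual paper's losses and feature spaces may differ.

```python
import numpy as np

def distillation_loss(student_bev, teacher_bev,
                      student_action, expert_action,
                      align_weight: float = 1.0) -> float:
    """Hypothetical combined objective: imitate the expert's action while
    aligning the student's projected features with the teacher's BEV features."""
    align = np.mean((student_bev - teacher_bev) ** 2)            # feature alignment
    imitation = np.mean((student_action - expert_action) ** 2)   # action imitation
    return float(imitation + align_weight * align)
```

When the student's features and actions match the teacher's exactly, the loss is zero; the weight trades off the two terms.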
The proposed refinement module can be stacked in a cascaded fashion, which extends the capacity of the decoder with spatio-temporal prior knowledge about the conditioned future and achieves state-of-the-art performance on closed-loop benchmarks.
Results indicate that PlanT can focus on the most relevant object in the scene, even when this object is geometrically distant, and an evaluation protocol is proposed to quantify the ability of planners to identify relevant objects, providing insights regarding their decision-making.