The goal of 3D Lane Detection is to perceive lanes in 3D space that provide guidance for autonomous vehicles. A lane can be represented as a visible lane line or as a conceptual centerline, and it may carry additional attributes derived from an understanding of the surrounding environment. (Image credit: OpenLane-V2)
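As a concrete illustration, a lane in this setting is often just an ordered polyline of 3D points plus attributes. The sketch below is a minimal, hypothetical representation; the `Lane3D` class and its field names are illustrative and not taken from any dataset specification.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Lane3D:
    """A lane as an ordered polyline of 3D points in the ego frame.

    Field names here are illustrative; real datasets (e.g. OpenLane-V2)
    define their own schemas.
    """
    points: np.ndarray              # (N, 3) array of (x, y, z) in meters
    is_centerline: bool = False     # visible lane line vs. conceptual centerline
    attributes: dict = field(default_factory=dict)  # e.g. {"color": "white"}

# A straight lane line 20 m long, sampled every 5 m, on flat ground (z = 0).
lane = Lane3D(
    points=np.array([[x, 1.75, 0.0] for x in range(0, 25, 5)], dtype=np.float64),
    attributes={"color": "white", "style": "dashed"},
)
print(lane.points.shape)  # (5, 3)
```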
It is shown that PersFormer significantly outperforms competitive baselines on the 3D lane detection task, both on the authors' new OpenLane dataset and on the Apollo 3D Lane Synthetic dataset, and is on par with state-of-the-art algorithms on the 2D task on OpenLane.
This work presents ONCE-3DLanes, a real-world autonomous driving dataset with lane layout annotations in 3D space, together with an extrinsic-free, anchor-free method, called SALAD, that regresses the 3D coordinates of lanes in image view without converting the feature map into the bird's-eye view (BEV).
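Recovering 3D lane points from image-view predictions amounts to standard pinhole unprojection once a per-point depth is available. The sketch below is a minimal illustration under that assumption; `unproject_lane` and the intrinsic values are hypothetical, not SALAD's actual implementation.

```python
import numpy as np

def unproject_lane(uv: np.ndarray, depth: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Lift image-plane lane points to 3D camera coordinates.

    uv:    (N, 2) pixel coordinates of lane points
    depth: (N,) predicted depth per point, in meters
    K:     (3, 3) camera intrinsic matrix
    """
    ones = np.ones((uv.shape[0], 1))
    rays = np.linalg.inv(K) @ np.concatenate([uv, ones], axis=1).T  # (3, N)
    return (rays * depth).T  # (N, 3): scale each normalized ray by its depth

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
uv = np.array([[960.0, 700.0], [960.0, 650.0]])
depth = np.array([8.0, 15.0])
print(unproject_lane(uv, depth, K))
```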
This work introduces an end-to-end vectorized HD map learning pipeline, termed VectorMapNet, which takes onboard sensor observations and predicts a sparse set of polylines in the bird's-eye view; it explicitly models the spatial relations between map elements and generates vectorized maps that are friendly to downstream autonomous driving tasks.
This work marks a first attempt to address this task with on-board sensing, without assuming a known constant lane width or relying on pre-mapped environments, and applies two new concepts: intra-network inverse-perspective mapping (IPM) and an anchor-based lane representation.
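IPM itself is a ground-plane homography determined by the camera intrinsics, height, and pitch. The sketch below shows the standalone numerical version, assuming a flat ground plane; 3D-LaneNet applies the same warp to intermediate feature maps inside the network. The function name and camera parameters are illustrative.

```python
import numpy as np

def ground_to_image_homography(K, cam_height, pitch):
    """Homography mapping ground-plane coords (X right, Y forward, meters)
    to pixels, for a camera `cam_height` above a flat ground plane and
    pitched down by `pitch` radians. IPM warps with its inverse."""
    s, c = np.sin(pitch), np.cos(pitch)
    # Columns: ground X axis, ground Y axis, translation, in camera coords.
    G = np.array([[1.0, 0.0, 0.0],
                  [0.0, -s,  cam_height * c],
                  [0.0,  c,  cam_height * s]])
    return K @ G

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
H = ground_to_image_homography(K, cam_height=1.5, pitch=np.deg2rad(3.0))
H_inv = np.linalg.inv(H)  # image -> ground plane (the IPM warp)

# Map a pixel below the horizon to a ground-plane point.
pix = np.array([960.0, 700.0, 1.0])
g = H_inv @ pix
print(g[:2] / g[2])  # (X, Y) on the ground, in meters
```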
This work introduces a geometry-guided lane anchor representation in a new coordinate frame, applies a specific geometric transformation to calculate real 3D lane points directly from the network output, and presents a scalable two-stage framework that decouples the learning of the image segmentation subnetwork from the geometry encoding subnetwork.
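The geometric idea can be sketched as follows: a point at height z above the ground projects onto the flat-ground (virtual top-view) plane scaled by h / (h - z) for camera height h, so real 3D points are recovered by undoing that scaling. The code below is a simplified sketch of this transformation, not the paper's exact implementation; names and values are illustrative.

```python
import numpy as np

def flat_ground_to_3d(x_bar, y_bar, z, cam_height):
    """Recover real 3D lane points from flat-ground (virtual top-view)
    projections and predicted per-point heights.

    A point at height z projects onto the flat-ground plane scaled by
    cam_height / (cam_height - z), so the real position shrinks that
    projection back: x = x_bar * (cam_height - z) / cam_height.
    """
    scale = (cam_height - z) / cam_height
    return np.stack([x_bar * scale, y_bar * scale, z], axis=-1)

# A lane predicted in the virtual top view, with per-point heights.
x_bar = np.array([1.8, 1.9, 2.1])
y_bar = np.array([10.0, 20.0, 30.0])
z = np.array([0.0, 0.1, 0.3])          # uphill lane
print(flat_ground_to_3d(x_bar, y_bar, z, cam_height=1.5))
```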
This work proposes to predict 3D lanes by estimating the camera pose from a single image with a two-stage framework, and demonstrates that, without ground-truth camera pose, the method outperforms state-of-the-art methods that assume a perfect camera pose, while using the fewest parameters and computations.
PETRv2, a unified framework for 3D perception from multi-view images, is proposed; it extends the 3D position embedding of PETR for temporal modeling, achieving temporal alignment of object positions across frames.
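The core of a PETR-style 3D position embedding is to lift each image pixel to a set of 3D points along its viewing ray and encode them with a small MLP; temporal modeling then amounts to expressing points from different frames in a common ego frame via the ego poses. The sketch below is a simplified, hypothetical version of that idea, not the PETRv2 code; all shapes and the `cam_to_ego` transform are illustrative.

```python
import torch

def position_embedding_3d(H, W, K_inv, cam_to_ego, depths, mlp):
    """Simplified PETR-style 3D position embedding: lift each pixel to a
    set of 3D points along its viewing ray, then encode them with an MLP."""
    v, u = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                          torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=-1)        # (H, W, 3)
    rays = pix @ K_inv.T                                         # (H, W, 3)
    pts = rays[..., None, :] * depths.view(1, 1, -1, 1)          # (H, W, D, 3)
    pts_h = torch.cat([pts, torch.ones_like(pts[..., :1])], -1)  # homogeneous
    pts_ego = (pts_h @ cam_to_ego.T)[..., :3]                    # camera -> ego
    return mlp(pts_ego.flatten(-2))                              # (H, W, C)

H, W, D, C = 16, 44, 8, 256
K_inv = torch.linalg.inv(torch.tensor([[500.0, 0.0, 352.0],
                                       [0.0, 500.0, 128.0],
                                       [0.0, 0.0, 1.0]]))
cam_to_ego = torch.eye(4)  # for a past frame, compose in its ego pose here
depths = torch.linspace(1.0, 50.0, D)
mlp = torch.nn.Sequential(torch.nn.Linear(D * 3, C), torch.nn.ReLU(),
                          torch.nn.Linear(C, C))
pe = position_embedding_3d(H, W, K_inv, cam_to_ego, depths, mlp)
print(pe.shape)  # torch.Size([16, 44, 256])
```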
A unified permutation-equivalent modeling approach is proposed, i.e., modeling a map element as a point set with a group of equivalent permutations, which accurately describes the shape of the map element and stabilizes the learning process.
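For an open polyline, the equivalent permutations are simply its two traversal orders (closed polygons additionally admit circular shifts), and the training loss takes the minimum over them. A minimal sketch, assuming an L1 point-to-point cost; the function name is illustrative and this is not MapTR's exact matching code.

```python
import torch

def permutation_equivalent_loss(pred, gt):
    """Point-set loss for an open polyline map element, taking the minimum
    over its equivalent permutations (here: forward and reversed point
    order, the two orderings that describe the same open polyline)."""
    perms = torch.stack([gt, gt.flip(0)])                       # (2, N, 2)
    costs = (pred.unsqueeze(0) - perms).abs().sum(dim=(1, 2))   # L1 per perm
    return costs.min()

pred = torch.tensor([[0.0, 0.0], [1.0, 0.1], [2.0, 0.0]])
gt_reversed = torch.tensor([[2.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
print(permutation_equivalent_loss(pred, gt_reversed))  # small: order-invariant
```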
For the anchor representation, a double-layer anchor with a non-maximum suppression (NMS) method is proposed, enabling the anchor-based method to predict two lane lines that lie close together; this is also a first attempt at 3D lane detection in a weakly supervised setting.
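Lane-level NMS can be sketched as greedy suppression by lateral distance between sampled lane points; a double-layer anchor design then loosens this so that two genuinely close lane lines both survive. The code below is a minimal, hypothetical single-layer version, not the paper's method; the threshold and representation are illustrative.

```python
import numpy as np

def lane_nms(lanes, scores, min_lateral_gap=0.5):
    """Greedy NMS over lane predictions: keep the highest-scoring lane and
    drop any remaining lane whose mean lateral offset to a kept lane falls
    below `min_lateral_gap` (meters). Lanes are rows of lateral x-offsets
    sampled at fixed longitudinal positions."""
    order = np.argsort(scores)[::-1]
    keep = []
    for i in order:
        if all(np.abs(lanes[i] - lanes[j]).mean() >= min_lateral_gap
               for j in keep):
            keep.append(i)
    return keep

lanes = np.array([[1.70, 1.72, 1.75],    # three lanes, 3 sample points each
                  [1.72, 1.74, 1.78],    # near-duplicate of the first
                  [-1.75, -1.74, -1.72]])
scores = np.array([0.9, 0.6, 0.8])
print(lane_nms(lanes, scores))  # [0, 2]: the near-duplicate is suppressed
```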