3260 papers • 126 benchmarks • 313 datasets
Sensor fusion is the process of combining sensor data or data derived from disparate sources such that the resulting information has less uncertainty than would be possible when these sources were used individually. [Wikipedia]
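The "less uncertainty" in the definition above can be made concrete with the simplest fusion rule: inverse-variance weighting of two independent estimates. This is a minimal sketch (the sensor values and variances are invented for illustration), not any particular paper's method:

```python
import numpy as np

def fuse(mu_a, var_a, mu_b, var_b):
    """Inverse-variance weighted fusion of two independent estimates.

    The fused variance is always smaller than either input variance,
    which is the 'less uncertainty' property in the definition above.
    """
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    var_fused = 1.0 / (w_a + w_b)
    mu_fused = var_fused * (w_a * mu_a + w_b * mu_b)
    return mu_fused, var_fused

# Example: a lidar range (10.2 m, var 0.04) fused with a camera
# depth estimate (9.8 m, var 0.09) of the same target.
mu, var = fuse(10.2, 0.04, 9.8, 0.09)
```

The fused mean lands between the two measurements, pulled toward the more certain sensor, and the fused variance is below both inputs.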
The use of targets of known dimension and geometry to improve target pose estimation in the face of the quantization and systematic errors inherent in a LiDAR image of a target, and a fitting method for the LiDAR-to-monocular-camera transformation that avoids the tedious task of extracting target edges from the point cloud.
Image-based fiducial markers are useful in problems such as object tracking in cluttered or textureless environments, camera (and multi-sensor) calibration tasks, and vision-based simultaneous localization and mapping (SLAM). The state-of-the-art fiducial marker detection algorithms rely on the consistency of the ambient lighting. This article introduces LiDARTag, a novel fiducial tag design and detection algorithm suitable for light detection and ranging (LiDAR) point clouds. The proposed method runs in real-time and can process data at 100 Hz, which is faster than the currently available LiDAR sensor frequencies. Because of the LiDAR sensors’ nature, rapidly changing ambient lighting will not affect the detection of a LiDARTag; hence, the proposed fiducial marker can operate in a completely dark environment. In addition, the LiDARTag nicely complements and is compatible with existing visual fiducial markers, such as AprilTags, allowing for efficient multi-sensor fusion and calibration tasks. We further propose a concept of minimizing a fitting error between a point cloud and the marker's template to estimate the marker's pose. The proposed method achieves millimeter error in translation and a few degrees in rotation. Due to LiDAR returns’ sparsity, the point cloud is lifted to a continuous function in a reproducing kernel Hilbert space where the inner product can be used to determine a marker's ID. The experimental results, verified by a motion capture system, confirm that the proposed method can reliably provide a tag's pose and unique ID code. The rejection of false positives is validated on the Google Cartographer indoor dataset and the Honda H3D outdoor dataset. All implementations are coded in C++ and are available at https://github.com/UMich-BipedLab/LiDARTag.
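The abstract above describes lifting the sparse point cloud into a reproducing kernel Hilbert space and using inner products to decode a marker's ID. A hedged sketch of that idea, using a Gaussian kernel (the function names, kernel bandwidth, and template format are our illustrative choices, not the LiDARTag implementation):

```python
import numpy as np

def rkhs_inner(points_a, feats_a, points_b, feats_b, ell=0.05):
    """Inner product of two point clouds lifted to an RKHS via a Gaussian
    kernel: <f, g> = sum_ij a_i * b_j * k(x_i, y_j).
    points_*: (N, 3) coordinates; feats_*: (N,) scalar features
    (e.g. lidar intensity encoding the tag's black/white pattern)."""
    d2 = ((points_a[:, None, :] - points_b[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * ell ** 2))
    return feats_a @ K @ feats_b

def decode_id(cloud, intensity, templates):
    """Return the index of the template whose lifted function best
    matches the observed, pose-aligned tag point cloud."""
    scores = [rkhs_inner(cloud, intensity, tp, tf) for tp, tf in templates]
    return int(np.argmax(scores))
```

Because the kernel smooths each point into a continuous bump, the inner product stays informative even when the lidar returns are sparse and do not land exactly on template points.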
This paper proposes a sensor fusion framework that fuses local states with global sensors, achieving locally accurate and globally drift-free pose estimation, and highlights that the system is a general framework that can easily fuse various global sensors in a unified pose graph optimization.
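The local/global trade-off described above can be illustrated with a toy 1-D "pose graph": odometry deltas are locally accurate (high weight) but drift, while global fixes are drift-free but noisy (low weight). This is a minimal least-squares sketch with invented weights and names, not the paper's optimizer:

```python
import numpy as np

def fuse_poses(odom_deltas, global_meas, w_odom=100.0, w_global=1.0):
    """Weighted least squares over a 1-D trajectory x_0..x_n.

    odom_deltas: list of relative constraints x_{i+1} - x_i = d
    global_meas: list of (index, value) absolute constraints x_i = g
    Returns the fused trajectory as a numpy array.
    """
    n = len(odom_deltas) + 1
    rows, rhs, weights = [], [], []
    for i, d in enumerate(odom_deltas):       # relative (odometry) edges
        r = np.zeros(n); r[i] = -1.0; r[i + 1] = 1.0
        rows.append(r); rhs.append(d); weights.append(w_odom)
    for i, g in global_meas:                  # absolute (global) edges
        r = np.zeros(n); r[i] = 1.0
        rows.append(r); rhs.append(g); weights.append(w_global)
    sw = np.sqrt(np.array(weights))
    A = np.array(rows) * sw[:, None]
    b = np.array(rhs) * sw
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x
```

With consistent odometry and two global fixes that disagree slightly, the solver keeps the locally accurate step lengths while distributing the global residual across the trajectory, i.e. the drift is absorbed without distorting local motion.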
PointPainting is proposed, a sequential fusion method that projects lidar points into the output of an image-only semantic segmentation network, appends the class scores to each point, and shows how latency can be minimized through pipelining.
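The sequential "painting" step can be sketched as follows: transform each lidar point into the camera frame, project it with the intrinsics, and append the segmentation scores at that pixel. A hedged illustration of the idea (matrix names and conventions are ours, not the paper's code):

```python
import numpy as np

def paint_points(points_lidar, T_cam_lidar, K, seg_scores):
    """Append per-pixel class scores to lidar points (PointPainting-style
    sequential fusion).

    points_lidar: (N, 3) points in the lidar frame
    T_cam_lidar:  (4, 4) lidar-to-camera extrinsic transform
    K:            (3, 3) camera intrinsic matrix
    seg_scores:   (H, W, C) softmax output of an image segmentation net
    Returns (M, 3 + C) painted points for those visible in the image.
    """
    H, W, C = seg_scores.shape
    # Transform into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0                  # keep points ahead of camera
    cam = cam[in_front]
    # Pinhole projection to pixel coordinates.
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    valid = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    return np.hstack([points_lidar[in_front][valid],
                      seg_scores[v[valid], u[valid]]])
```

The painted points can then be fed to any lidar detector unchanged, which is what makes this fusion "sequential": the two networks stay independent and can be pipelined.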
A new method based on binary fuzzy measures is proposed that reduces the search space and significantly improves the efficiency of the MIMRF framework, enabling effective and efficient multi-resolution fusion of remote sensing data with uncertainty.
MonoLayout is presented, a deep neural network for real-time amodal scene layout estimation from a single image that represents scene layout as a multi-channel semantic occupancy grid and leverages adversarial feature learning to hallucinate plausible completions for occluded image parts.
This work evaluates PointFusion on two distinctive datasets: the KITTI dataset that features driving scenes captured with a lidar-camera setup, and the SUN-RGBD dataset that captures indoor environments with RGB-D cameras.
The proposed R3LIVE system is able to reconstruct the precise, dense, 3D, RGB-colored maps of the surrounding environment in real-time and achieves higher robustness and accuracy in state estimation than its current counterparts.
This paper proposes EagerMOT, a simple tracking formulation that eagerly integrates all available object observations from both sensor modalities to obtain a well-informed interpretation of the scene dynamics and achieves state-of-the-art results across several MOT tasks on the KITTI and NuScenes datasets.
This paper proposes a middle-fusion approach to exploit both radar and camera data for 3D object detection and solves the key data association problem using a novel frustum-based method.
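The frustum idea above can be illustrated with a simple membership test: back-project a camera 2D detection box into a 3D viewing frustum and keep the radar returns that fall inside it. A minimal sketch under our own conventions (function name, box format, and depth gating are illustrative, not the paper's exact method):

```python
import numpy as np

def in_frustum(radar_xyz, box2d, K, depth_range=(0.0, 100.0)):
    """Check whether a radar return (x, y, z in the camera frame) lies in
    the frustum obtained by back-projecting a 2D detection box.

    box2d: (u_min, v_min, u_max, v_max) in pixels
    K:     (3, 3) camera intrinsic matrix
    """
    x, y, z = radar_xyz
    if not (depth_range[0] < z < depth_range[1]):
        return False
    # Project the radar point into the image with the pinhole model.
    u = K[0, 0] * x / z + K[0, 2]
    v = K[1, 1] * y / z + K[1, 2]
    u_min, v_min, u_max, v_max = box2d
    return (u_min <= u <= u_max) and (v_min <= v <= v_max)
```

Restricting the association search to each detection's frustum avoids matching radar returns against every camera detection in the scene, which is the data-association problem the summary refers to.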