Reinforcement learning / robot learning from 3D point clouds
We study how choices of input point cloud coordinate frames impact learning of manipulation skills from 3D point clouds. There exist a variety of coordinate frame choices to normalize captured robot-object-interaction point clouds. We find that different frames have a profound effect on agent learning performance, and the trend is similar across 3D backbone networks. In particular, the end-effector frame and the target-part frame achieve higher training efficiency than the commonly used world frame and robot-base frame in many tasks, intuitively because they provide helpful alignments among point clouds across time steps and thus can simplify visual module learning. Moreover, the well-performing frames vary across tasks, and some tasks may benefit from multiple frame candidates. We thus propose FrameMiners to adaptively select candidate frames and fuse their merits in a task-agnostic manner. Experimentally, FrameMiners achieves on-par or significantly higher performance than the best single-frame version on five fully physical manipulation tasks adapted from ManiSkill and OCRTOC. Without changing existing camera placements or adding extra cameras, point cloud frame mining can serve as a free lunch to improve 3D manipulation learning.
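The frame choices discussed above all amount to re-expressing the same captured cloud under a rigid transform before it is fed to the 3D backbone. A minimal sketch of that normalization step, assuming the frame's pose is known (the function name and example poses are illustrative, not from the paper):

```python
import numpy as np

def world_to_frame(points, frame_pos, frame_rot):
    """Re-express world-frame points (N, 3) in a local frame.

    frame_pos: (3,) frame origin in world coordinates.
    frame_rot: (3, 3) rotation whose columns are the frame's axes in world
    coordinates. The inverse rigid transform is p_local = R^T (p_world - t).
    """
    # Row-vector form: (p - t) @ R is the same as R^T (p - t) per point.
    return (points - frame_pos) @ frame_rot

# Hypothetical example: an end-effector at (0.5, 0, 0.2), rotated 90 deg about z.
ee_pos = np.array([0.5, 0.0, 0.2])
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
ee_rot = np.array([[c, -s, 0.0],
                   [s,  c, 0.0],
                   [0.0, 0.0, 1.0]])

cloud_world = np.array([[0.5, 0.0, 0.2],   # a point at the gripper origin
                        [0.6, 0.0, 0.2]])  # 10 cm away along world x
cloud_ee = world_to_frame(cloud_world, ee_pos, ee_rot)
```

Applying this with the end-effector pose at each time step keeps the gripper at the origin of every observation, which is the alignment across time steps that the abstract credits for simpler visual learning.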
3D point cloud RL can significantly outperform its 2D counterpart when encoding agent-object or object-object spatial relationships is a key factor, enabling robust algorithms for a variety of robotic manipulation and control tasks.