3260 papers • 126 benchmarks • 313 datasets
What is Human Pose Estimation?
Human pose estimation is the process of estimating the configuration of the body (pose) from a single, typically monocular, image.

Background. Human pose estimation is one of the key problems in computer vision and has been studied for well over 15 years. It is important because many applications can benefit from the technology. For example, human pose estimation allows for higher-level reasoning in the context of human-computer interaction and activity recognition; it is also one of the basic building blocks for marker-less motion capture (MoCap) technology. MoCap technology is useful for applications ranging from character animation to clinical analysis of gait pathologies.
This work presents an approach to efficiently detect the 2D pose of multiple people in an image using a nonparametric representation, which it refers to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.
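As a rough illustration of how a part affinity field can associate two candidate joints, the sketch below averages the alignment between the field and the direction of the candidate limb at points sampled along the segment joining the joints. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `paf_score`, the split of the field into `paf_x` and `paf_y` grids, the nearest-pixel sampling, and the `num_samples` parameter are all hypothetical choices made for this example.

```python
import math


def paf_score(paf_x, paf_y, p1, p2, num_samples=10):
    """Average alignment between a 2D vector field and the segment p1 -> p2.

    paf_x, paf_y: 2D lists (rows x cols) holding the x and y components of
                  the field at every pixel (hypothetical layout).
    p1, p2:       (x, y) coordinates of the two candidate joints.
    """
    x1, y1 = p1
    x2, y2 = p2
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb

    height, width = len(paf_x), len(paf_x[0])
    total = 0.0
    for i in range(num_samples):
        # Evenly spaced sample positions from p1 (t = 0) to p2 (t = 1).
        t = i / (num_samples - 1) if num_samples > 1 else 0.0
        col = min(max(int(round(x1 + t * dx)), 0), width - 1)
        row = min(max(int(round(y1 + t * dy)), 0), height - 1)
        # Dot product of the field vector with the limb direction.
        total += paf_x[row][col] * ux + paf_y[row][col] * uy
    return total / num_samples


# Toy field that points to the right everywhere.
paf_x = [[1.0] * 12 for _ in range(12)]
paf_y = [[0.0] * 12 for _ in range(12)]
horizontal = paf_score(paf_x, paf_y, (2, 5), (8, 5))  # aligned with the field
vertical = paf_score(paf_x, paf_y, (2, 5), (2, 9))    # perpendicular to it
```

A pair of joints whose connecting segment follows the field scores close to 1, while a perpendicular or opposing pair scores near 0 or below, so higher scores suggest that the two parts belong to the same person.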
This paper proposes a network that maintains high-resolution representations through the whole process of human pose estimation and empirically demonstrates the effectiveness of the network through the superior pose estimation results over two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.
HigherHRNet is a novel bottom-up human pose estimation method that learns scale-aware representations using high-resolution feature pyramids. It achieves a new state-of-the-art result on COCO test-dev and surpasses all top-down methods on CrowdPose test, suggesting its robustness in crowded scenes.
This work provides simple and effective baseline methods for pose estimation, intended to help inspire and evaluate new ideas in the field, and achieves state-of-the-art results on challenging benchmarks.
This paper proposes a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes; it achieves 76.7 mAP on the MPII (multi-person) dataset.
Results on a number of difficult continuous-control tasks show that the developed notion of sub-optimality of a representation, defined in terms of expected reward of the optimal hierarchical policy using this representation, yields qualitatively better representations as well as quantitatively better hierarchical policies compared to existing methods.
A well-known bottom-up approach for multi-person pose estimation is rethought and an improved version is proposed, which outperforms the baseline by about 15% in average precision and is comparable to the state of the art on the MS-COCO test-dev dataset.
BlazePose is a lightweight convolutional neural network architecture for human pose estimation, tailored for real-time inference on mobile devices, that uses both heatmaps and regression to keypoint coordinates.
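To make the heatmap representation concrete: a heatmap stores a per-pixel score for one keypoint, whereas regression outputs the coordinates directly. The snippet below is a generic, minimal sketch of reading a keypoint location from a heatmap by taking the highest-scoring pixel. It is not BlazePose's code; the function name `heatmap_to_keypoint` and the nested-list heatmap layout are assumptions made for this example.

```python
def heatmap_to_keypoint(heatmap):
    """Return (x, y, score) of the highest-scoring pixel in a 2D heatmap.

    heatmap: non-empty 2D list (rows x cols) of scores for a single keypoint
             (hypothetical layout chosen for this sketch).
    """
    best_row, best_col, best_score = 0, 0, heatmap[0][0]
    for row, values in enumerate(heatmap):
        for col, score in enumerate(values):
            if score > best_score:
                best_row, best_col, best_score = row, col, score
    # x is the column index, y is the row index.
    return best_col, best_row, best_score


toy_heatmap = [
    [0.0, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.0, 0.0, 0.3],
]
keypoint = heatmap_to_keypoint(toy_heatmap)
```

A regression head would instead emit the `(x, y)` pair as network outputs, with no per-pixel search at inference time.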
This paper introduces a new benchmark, "Occluded Human (OCHuman)", which focuses on occluded humans with comprehensive annotations including bounding boxes, human poses and instance masks. It demonstrates that the proposed pose-based framework can achieve better accuracy than the state-of-the-art detection-based approach on the human instance segmentation problem and can better handle occlusion.
Associative embedding is a novel method for supervising convolutional neural networks for the task of detection and grouping. Applied to multi-person pose estimation, it reports state-of-the-art performance on the MPII and MS-COCO datasets.
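The grouping idea can be sketched as follows, assuming each detected joint carries a scalar embedding ("tag") and that joints of the same person receive similar tags. This is a simplified greedy illustration, not the paper's procedure: the function name `group_by_tag`, the `(joint_name, x, y, tag)` tuple layout, and the `threshold` value are hypothetical.

```python
def group_by_tag(detections, threshold=0.5):
    """Greedily group joint detections into people by tag similarity.

    detections: list of (joint_name, x, y, tag) tuples (hypothetical layout).
    threshold:  maximum distance between a detection's tag and a person's
                mean tag for the detection to join that person.
    """
    people = []  # each person: {"joints": {name: (x, y)}, "tags": [tag, ...]}
    for name, x, y, tag in detections:
        best, best_dist = None, threshold
        for person in people:
            if name in person["joints"]:
                continue  # this person already has a joint of this type
            mean_tag = sum(person["tags"]) / len(person["tags"])
            dist = abs(tag - mean_tag)
            if dist < best_dist:
                best, best_dist = person, dist
        if best is None:
            # No existing person is close enough in tag space: start a new one.
            best = {"joints": {}, "tags": []}
            people.append(best)
        best["joints"][name] = (x, y)
        best["tags"].append(tag)
    return people


detections = [
    ("head", 10, 10, 0.10),
    ("head", 50, 12, 2.00),
    ("neck", 11, 20, 0.15),
    ("neck", 51, 22, 1.90),
]
people = group_by_tag(detections)
```

In this toy input the two heads start two people, and each neck joins the person whose mean tag is closest, so detection and grouping are handled from the same set of network outputs.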