3260 papers • 126 benchmarks • 313 datasets
Keypoint Detection is essential for analyzing and interpreting images in computer vision. It involves simultaneously detecting and localizing interesting points in an image. Keypoints, also known as interest points, are spatial locations in the image that define what is interesting or what stands out, and they are ideally invariant to image rotation, scaling, translation, and distortion. Examples of keypoints include body joints, facial landmarks, and other salient points on objects. Keypoints are used in problems such as pose estimation, object detection and tracking, facial analysis, and augmented reality. (Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0)
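To make the notion concrete, here is a minimal sketch that detects and localizes generic interest points with OpenCV's ORB detector; ORB is just one illustrative choice among many keypoint detectors, and the image path is a placeholder:

```python
import cv2

# Load an image in grayscale ("image.jpg" is a placeholder path).
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# ORB is one of several classical interest-point detectors in OpenCV.
orb = cv2.ORB_create(nfeatures=500)
keypoints, descriptors = orb.detectAndCompute(img, None)

# Each keypoint carries a location, a scale, and an orientation.
for kp in keypoints[:5]:
    print(f"x={kp.pt[0]:.1f}, y={kp.pt[1]:.1f}, size={kp.size:.1f}, angle={kp.angle:.1f}")
```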
These leaderboards are used to track progress in Keypoint Detection.
Use these libraries to find Keypoint Detection models and implementations.
This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
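As a minimal sketch of this branch-in-parallel design, assuming a recent torchvision: the pre-trained Keypoint R-CNN variant follows the same framework, with a keypoint branch added alongside box recognition (the random input tensor is a stand-in for a real image):

```python
import torch
from torchvision.models.detection import keypointrcnn_resnet50_fpn

# COCO-pretrained Keypoint R-CNN; "DEFAULT" downloads weights on first use.
model = keypointrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)  # stand-in for an RGB image scaled to [0, 1]
with torch.no_grad():
    pred = model([image])[0]

# Per detected person: a box, a score, and 17 COCO keypoints (x, y, visibility).
print(pred["boxes"].shape, pred["scores"].shape, pred["keypoints"].shape)
```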
The center point based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding box based detectors; it performs competitively with sophisticated multi-stage methods and runs in real time.
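A minimal sketch of the decoding step, assuming a trained network has already produced per-class center heatmaps (names and shapes here are illustrative): local maxima are kept with a 3x3 max-pool acting as cheap non-maximum suppression, and the top-k peaks become detections:

```python
import torch
import torch.nn.functional as F

def decode_centers(heatmap, k=100):
    # heatmap: (B, C, H, W) tensor of per-class center scores in [0, 1].
    # A 3x3 max-pool keeps only local maxima (cheap NMS over the heatmap).
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    heatmap = heatmap * (pooled == heatmap)
    b, c, h, w = heatmap.shape
    scores, idx = heatmap.view(b, -1).topk(k)
    classes = torch.div(idx, h * w, rounding_mode="floor")
    ys = torch.div(idx % (h * w), w, rounding_mode="floor")
    xs = idx % w
    return scores, classes, xs, ys

# Toy usage with a random "heatmap".
scores, classes, xs, ys = decode_centers(torch.rand(1, 80, 128, 128))
```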
This work presents an approach to efficiently detect the 2D pose of multiple people in an image using a nonparametric representation, which it refers to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image.
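A minimal sketch of the association step, assuming a network has already predicted the PAF for one limb type (array names are illustrative): a candidate limb between two detected joints is scored by integrating the field's alignment with the limb direction along the connecting segment:

```python
import numpy as np

def paf_limb_score(paf_x, paf_y, p1, p2, n_samples=10):
    # paf_x, paf_y: (H, W) arrays holding the field's unit-vector components.
    # p1, p2: (x, y) coordinates of two candidate joints.
    v = np.asarray(p2, float) - np.asarray(p1, float)
    u = v / (np.linalg.norm(v) + 1e-8)              # limb direction
    ts = np.linspace(0.0, 1.0, n_samples)
    pts = np.asarray(p1, float) + ts[:, None] * v   # samples along the segment
    xs = pts[:, 0].round().astype(int)
    ys = pts[:, 1].round().astype(int)
    # Average alignment between the field and the limb direction.
    return float(np.mean(paf_x[ys, xs] * u[0] + paf_y[ys, xs] * u[1]))

# Toy usage: a field aligned with a horizontal limb scores 1.0.
score = paf_limb_score(np.ones((32, 32)), np.zeros((32, 32)), (2, 10), (20, 10))
```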
OpenPose is released: the first open-source real-time system for multi-person 2D pose detection, including body, foot, hand, and facial keypoints, and the first combined body and foot keypoint detector, based on an internal annotated foot dataset.
An approach that uses a multi-camera system to train fine-grained detectors for keypoints that are prone to occlusion, such as the joints of a hand, and that analytically derives a result relating the minimum number of views needed to achieve target true and false positive rates for a given detector.
This paper proposes a network that maintains high-resolution representations through the whole process of human pose estimation and empirically demonstrates the effectiveness of the network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.
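A toy sketch of the underlying idea (module structure and channel sizes are invented for illustration, not the paper's exact exchange unit): streams at different resolutions run in parallel and repeatedly exchange information, rather than recovering resolution only at the end:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchFuse(nn.Module):
    # A high-resolution and a low-resolution stream exchange information:
    # the low stream is upsampled into the high one, and vice versa.
    def __init__(self, c_high=32, c_low=64):
        super().__init__()
        self.down = nn.Conv2d(c_high, c_low, 3, stride=2, padding=1)
        self.up = nn.Conv2d(c_low, c_high, 1)

    def forward(self, x_high, x_low):
        up = F.interpolate(self.up(x_low), size=x_high.shape[-2:],
                           mode="bilinear", align_corners=False)
        return x_high + up, x_low + self.down(x_high)

# Toy usage: 64x64 high-res and 32x32 low-res feature maps.
h, l = TwoBranchFuse()(torch.rand(1, 32, 64, 64), torch.rand(1, 64, 32, 32))
```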
This paper presents non-local operations as a generic family of building blocks for capturing long-range dependencies in computer vision and improves object detection/segmentation and pose estimation on the COCO suite of tasks.
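A minimal sketch of an embedded-Gaussian non-local block in this spirit (the channel bottleneck and softmax normalization are one common instantiation): every spatial position aggregates features from every other position, and a residual connection adds the result back:

```python
import torch
import torch.nn as nn

class NonLocalBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.inter = channels // 2                    # bottleneck width
        self.theta = nn.Conv2d(channels, self.inter, 1)
        self.phi = nn.Conv2d(channels, self.inter, 1)
        self.g = nn.Conv2d(channels, self.inter, 1)
        self.out = nn.Conv2d(self.inter, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (B, HW, C')
        k = self.phi(x).flatten(2)                    # (B, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)      # (B, HW, C')
        attn = torch.softmax(q @ k, dim=-1)           # pairwise affinities
        y = (attn @ v).transpose(1, 2).reshape(b, self.inter, h, w)
        return x + self.out(y)                        # residual connection

# Toy usage on a random feature map.
y = NonLocalBlock(64)(torch.rand(1, 64, 16, 16))
```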
The goal of this paper is to advance the state of the art in articulated pose estimation in scenes with multiple people. To that end we contribute on three fronts. We propose (1) improved body part detectors that generate effective bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms that allow assembling the proposals into a variable number of consistent body part configurations; and (3) an incremental optimization strategy that explores the search space more efficiently, leading both to better performance and to significant speed-ups. Evaluation is done on two single-person and two multi-person pose estimation benchmarks. The proposed approach significantly outperforms the best known multi-person pose estimation results while demonstrating competitive performance on the task of single-person pose estimation (models and code available at http://pose.mpi-inf.mpg.de).
This work provides simple and effective baseline methods for pose estimation that are helpful for inspiring and evaluating new ideas for the field, and it achieves state-of-the-art results on challenging benchmarks.
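A minimal sketch in the spirit of such a baseline (layer counts and channel sizes are typical values assumed for illustration): a few transposed convolutions upsample backbone features, and a final 1x1 convolution predicts one heatmap per joint:

```python
import torch
import torch.nn as nn

def deconv_head(in_channels=2048, num_joints=17):
    # Three transposed convolutions upsample the backbone feature map 8x,
    # then a 1x1 convolution outputs one heatmap per joint.
    layers, c = [], in_channels
    for _ in range(3):
        layers += [nn.ConvTranspose2d(c, 256, 4, stride=2, padding=1),
                   nn.BatchNorm2d(256), nn.ReLU(inplace=True)]
        c = 256
    layers.append(nn.Conv2d(256, num_joints, 1))
    return nn.Sequential(*layers)

# Toy usage: ResNet-style 8x8 features -> 17 heatmaps at 64x64.
heatmaps = deconv_head()(torch.rand(1, 2048, 8, 8))
```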
This paper proposes a novel regional multi-person pose estimation (RMPE) framework to facilitate pose estimation in the presence of inaccurate human bounding boxes, and it achieves 76.7 mAP on the MPII (multi-person) dataset.