3260 papers • 126 benchmarks • 313 datasets
Pose Tracking is the task of estimating multi-person human poses in videos and assigning a unique instance ID to each keypoint across frames. Accurate estimation of human keypoint trajectories is useful for human action recognition, human interaction understanding, motion capture, and animation. Source: LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking
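To make the ID-assignment part of the task concrete, here is a minimal sketch of a greedy frame-to-frame matcher: each detected pose in the current frame is linked to the nearest previous track by mean keypoint distance, and unmatched poses start new tracks. All names and the distance threshold are illustrative, not taken from any specific paper.

```python
import numpy as np

def match_poses(prev_tracks, curr_poses, max_dist=30.0):
    """prev_tracks: {track_id: (K, 2) keypoint array}; curr_poses: list of (K, 2).
    Returns {pose_index: track_id}; poses with no nearby track get fresh IDs."""
    assignments = {}
    used = set()
    next_id = max(prev_tracks, default=-1) + 1
    for i, pose in enumerate(curr_poses):
        best_id, best_dist = None, max_dist
        for tid, prev in prev_tracks.items():
            if tid in used:
                continue
            # mean per-keypoint Euclidean distance between the two poses
            dist = np.linalg.norm(pose - prev, axis=1).mean()
            if dist < best_dist:
                best_id, best_dist = tid, dist
        if best_id is None:  # no close track: start a new one
            best_id = next_id
            next_id += 1
        used.add(best_id)
        assignments[i] = best_id
    return assignments
```

Real trackers typically replace the greedy loop with Hungarian matching and the raw distance with a similarity such as OKS or a learned Re-ID score.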
These leaderboards are used to track progress in Pose Tracking.
Use these libraries to find Pose Tracking models and implementations.
This paper proposes a network that maintains high-resolution representations throughout the whole process of human pose estimation and empirically demonstrates its effectiveness through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset.
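The core idea of keeping a high-resolution representation alive can be sketched as parallel branches at different resolutions that repeatedly exchange information. The module below is an illustrative two-branch toy, not the official HRNet code; all layer widths are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchBlock(nn.Module):
    """Toy high/low-resolution block with cross-resolution fusion."""
    def __init__(self, ch_hi=32, ch_lo=64):
        super().__init__()
        self.hi = nn.Conv2d(ch_hi, ch_hi, 3, padding=1)
        self.lo = nn.Conv2d(ch_lo, ch_lo, 3, padding=1)
        self.hi_to_lo = nn.Conv2d(ch_hi, ch_lo, 3, stride=2, padding=1)  # downsample
        self.lo_to_hi = nn.Conv2d(ch_lo, ch_hi, 1)                       # channel match

    def forward(self, x_hi, x_lo):
        h, l = F.relu(self.hi(x_hi)), F.relu(self.lo(x_lo))
        # fusion: each branch receives a resampled copy of the other,
        # so the high-resolution stream is never collapsed away
        x_hi_new = h + F.interpolate(self.lo_to_hi(l), size=h.shape[-2:])
        x_lo_new = l + self.hi_to_lo(h)
        return x_hi_new, x_lo_new
```

The actual network stacks many such exchange units across more branches; the keypoint heatmaps are predicted from the high-resolution stream.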
This work provides simple and effective baseline methods for pose estimation that are helpful for inspiring and evaluating new ideas for the field and that achieve state-of-the-art results on challenging benchmarks.
BlazePose is presented, a lightweight convolutional neural network architecture for human pose estimation tailored for real-time inference on mobile devices, which uses both heatmaps and regression to keypoint coordinates.
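A minimal sketch of the heatmap-plus-regression idea the summary describes: one head predicts per-keypoint heatmaps (useful as a training signal), the other regresses coordinates directly (cheap at inference). Layer names and sizes are illustrative assumptions, not BlazePose's architecture.

```python
import torch
import torch.nn as nn

class DualHeadPoseNet(nn.Module):
    def __init__(self, num_keypoints=17, feat_ch=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # heatmap head: one spatial channel per keypoint
        self.heatmap_head = nn.Conv2d(feat_ch, num_keypoints, 1)
        # regression head: direct (x, y) per keypoint
        self.reg_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(feat_ch, num_keypoints * 2),
        )

    def forward(self, x):
        feats = self.backbone(x)
        return self.heatmap_head(feats), self.reg_head(feats)
```

Training supervises both heads; at deployment the heatmap head can be dropped so only the lightweight regression path runs on the device.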
This work tackles the problem of event-based camera localization in a known environment, without additional sensing, using a probabilistic generative event model in a Bayesian filtering framework and proposes to use the contrast residual as a measure of how well the estimated pose of the event-based camera and the environment explain the observed events.
This work proposes a novel method that jointly models multi-person pose estimation and tracking in a single formulation, introduces a challenging Multi-Person PoseTrack dataset, and proposes a completely unconstrained evaluation protocol that makes no assumptions about the scale, size, location, or number of persons.
PoseTrack is a new large-scale benchmark for video-based human pose estimation and articulated tracking; the work conducts an extensive experimental study of recent approaches to articulated pose tracking and analyzes the strengths and weaknesses of the state of the art.
This work proposes a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera, and combines a generative model with collision detection and discriminatively learned salient points.
It is shown that mgPFF is able to not only estimate long-range flow for frame reconstruction and detect video shot transitions, but is also readily amenable to video object segmentation and pose tracking, where it substantially outperforms the published state of the art without bells and whistles.
A Siamese Graph Convolution Network is proposed for human pose matching as a Re-ID module; it uses a graphical representation of human joints for matching and is robust to sudden camera shifts that cause tracked people to drift.
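The siamese matching idea can be sketched as one shared encoder applied to both poses, with a similarity score used for Re-ID. For brevity the encoder below is a plain MLP over joint coordinates; the paper's module uses graph convolutions over the skeleton, and all names here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseEncoder(nn.Module):
    """Shared embedding network for pose matching (MLP stand-in for a GCN)."""
    def __init__(self, num_joints=17, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_joints * 2, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, joints):  # joints: (B, num_joints, 2)
        return self.net(joints.flatten(1))

def pose_similarity(encoder, pose_a, pose_b):
    # the siamese part: the same weights embed both poses,
    # then cosine similarity serves as the Re-ID matching score
    za, zb = encoder(pose_a), encoder(pose_b)
    return F.cosine_similarity(za, zb, dim=1)
```

Because matching is done in pose space rather than image space, the score stays meaningful even when a camera shift moves the whole frame.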
6-PACK learns to compactly represent an object with a handful of 3D keypoints, from which the inter-frame motion of an object instance can be estimated through keypoint matching; it substantially outperforms existing methods on the NOCS category-level 6D pose estimation benchmark.
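Once 3D keypoints are matched between frames, the inter-frame rigid motion follows from a standard least-squares alignment (Kabsch/Procrustes). The routine below is that generic alignment step, not code from 6-PACK.

```python
import numpy as np

def rigid_transform_from_keypoints(kps_prev, kps_curr):
    """Least-squares rotation R and translation t with kps_curr ~ kps_prev @ R.T + t.
    Both inputs are (N, 3) arrays of matched 3D keypoints."""
    mu_p, mu_c = kps_prev.mean(0), kps_curr.mean(0)
    # 3x3 cross-covariance of the centered keypoint sets
    H = (kps_prev - mu_p).T @ (kps_curr - mu_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_c - R @ mu_p
    return R, t
```

Chaining these per-frame transforms yields the tracked 6D pose of the object instance over the video.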