3260 papers • 126 benchmarks • 313 datasets
This task aims to solve absolute 3D multi-person pose estimation (camera-centric coordinates). No ground-truth human bounding boxes or human root joint coordinates are used during the testing stage. (Image credit: RootNet)
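Because evaluation is in camera-centric coordinates, absolute methods typically back-project a detected 2D joint, together with an estimated absolute depth, through the camera intrinsics. A minimal sketch of that back-projection under a pinhole camera model (function and parameter names are illustrative, not from any specific paper):

```python
import numpy as np

def backproject(uv, z, f, c):
    """Back-project 2D pixel joints with known absolute depth into
    camera-centric 3D coordinates (pinhole model).
    uv: (J, 2) pixel coordinates; z: (J,) depths (e.g. in mm);
    f: (fx, fy) focal lengths; c: (cx, cy) principal point."""
    x = (uv[:, 0] - c[0]) / f[0] * z
    y = (uv[:, 1] - c[1]) / f[1] * z
    return np.stack([x, y, z], axis=1)
```

A joint detected exactly at the principal point maps to X = Y = 0 at its estimated depth, which is a quick sanity check for the intrinsics.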
These leaderboards are used to track progress in 3D Multi-Person Pose Estimation (absolute).
Use these libraries to find 3D Multi-Person Pose Estimation (absolute) models and implementations.
This work proposes a new single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. It uses novel occlusion-robust pose-maps (ORPM), which enable full-body pose inference even under strong partial occlusions by other people and objects in the scene.
A weakly-supervised transfer learning method that uses mixed 2D and 3D labels in a unified deep neural network with a two-stage cascaded structure to regularize the 3D pose prediction, which is effective in the absence of ground-truth depth labels.
A new density-based clustering algorithm, RNN-DBSCAN, is presented which uses reverse nearest neighbor counts as an estimate of observation density. Clustering is performed using a DBSCAN-like approach based on k-nearest-neighbor graph traversals through dense observations. RNN-DBSCAN is preferable to the popular density-based clustering algorithm DBSCAN in two respects: first, problem complexity is reduced to a single parameter (the choice of k nearest neighbors), and second, it is better able to handle large variations in cluster density (heterogeneous density). The superiority of RNN-DBSCAN is demonstrated on several artificial and real-world datasets with respect to prior work on reverse-nearest-neighbor-based clustering approaches (RECORD, IS-DBSCAN, and ISB-DBSCAN) along with DBSCAN and OPTICS. Each of these clustering approaches is described by a common graph-based interpretation wherein clusters of dense observations are defined as connected components, along with a discussion of their computational complexity. Heuristics for RNN-DBSCAN parameter selection are presented, and the effects of k on RNN-DBSCAN clusterings are discussed. Additionally, with respect to scalability, an approximate version of RNN-DBSCAN is presented, leveraging an existing approximate k-nearest-neighbor technique.
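The idea described above can be sketched compactly: a point is "core" when its reverse-k-nearest-neighbor count is at least k, and clusters grow by breadth-first traversal of k-NN edges, expanding only through core points. This is a simplified reading of the algorithm, not the authors' implementation:

```python
import numpy as np
from collections import deque

def rnn_dbscan(X, k):
    """Minimal sketch of RNN-DBSCAN with its single parameter k.
    Core points: reverse-k-NN count >= k. Clusters: BFS over k-NN
    edges, expanding only through core points; -1 marks noise."""
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    nn = np.argsort(d, axis=1)[:, 1:k + 1]               # k NN, excluding self
    rnn_count = np.zeros(n, dtype=int)
    for i in range(n):
        rnn_count[nn[i]] += 1                            # reverse-NN tally
    core = rnn_count >= k
    labels = np.full(n, -1)
    cid = 0
    for s in range(n):
        if labels[s] != -1 or not core[s]:
            continue
        labels[s] = cid
        q = deque([s])
        while q:
            p = q.popleft()
            for nb in nn[p]:
                if labels[nb] == -1:
                    labels[nb] = cid
                    if core[nb]:      # border points are assigned but not expanded
                        q.append(nb)
        cid += 1
    return labels
```

On two well-separated blobs this yields two clusters; unreachable low-density points remain labeled -1 (noise).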
This work proposes the first fully learning-based, camera-distance-aware top-down approach for 3D multi-person pose estimation from a single RGB image. It achieves results comparable to state-of-the-art 3D single-person pose estimation models without any ground-truth information, and significantly outperforms previous 3D multi-person pose estimation methods on publicly available datasets.
The Human Depth Estimation Network (HDNet), an end-to-end framework for absolute root joint localization in the camera coordinate space, is proposed and shown to outperform the previous state-of-the-art consistently under multiple evaluation metrics.
A novel system that first regresses a set of 2.5D representations of body parts and then reconstructs the 3D absolute poses based on these 2.5D representations.
This work introduces a network that can be trained with additional RGB-D images in a weakly supervised fashion, and achieves state-of-the-art results on the MuPoTS-3D dataset by a considerable margin.
This work proposes a novel framework integrating graph convolutional networks (GCNs), which unlike the existing GCN, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results.
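The directed-graph idea in the summary above can be illustrated with a single layer in which each incoming message is scaled by the source joint's 2D-detector confidence. This is a hedged sketch of the general mechanism, not the paper's architecture; all names are illustrative:

```python
import numpy as np

def directed_gcn_layer(x, edges, conf, w_self, w_in):
    """One confidence-weighted directed graph convolution (sketch).
    x: (J, F) per-joint features; edges: (parent, child) pairs along a
    directed kinematic tree; conf: (J,) 2D-detector confidence scores;
    w_self, w_in: (F, F') weight matrices."""
    out = x @ w_self                                   # self-connection
    for src, dst in edges:
        out[dst] += conf[src] * (x[src] @ w_in)        # message scaled by confidence
    return np.maximum(out, 0.0)                        # ReLU
```

Down-weighting messages from low-confidence joints limits how far an unreliable 2D detection can corrupt its neighbors' 3D estimates.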
The proposed Distribution-Aware Single-stage (DAS) model simultaneously localizes person positions and their corresponding body joints in the 3D camera space in a one-pass manner, leading to a simplified pipeline with enhanced efficiency and state-of-the-art accuracy for multi-person 3D pose estimation.
This work proposes a two-person pose discriminator that enforces natural two-person interactions and applies a semi-supervised method to overcome the scarcity of 3D ground-truth data.