3260 papers • 126 benchmarks • 313 datasets
Group Activity Recognition is a subset of the human activity recognition problem that focuses on the collective behavior of a group of people, which results from the individual actions of the persons and their interactions. Collective activity recognition is a basic task for automatic human behavior analysis in areas such as surveillance and sports videos. Source: A Multi-Stream Convolutional Neural Network Framework for Group Activity Recognition
These leaderboards are used to track progress in Group Activity Recognition.
Use these libraries to find Group Activity Recognition models and implementations.
No subtasks available.
PoseConv3D is more effective at learning spatiotemporal features, more robust against pose estimation noise, and generalizes better in cross-dataset settings, achieving state-of-the-art results on all eight multi-modality action recognition benchmarks.
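As a rough illustration of the heatmap-volume representation this line of work relies on, the sketch below renders 2D keypoints as Gaussian pseudo-heatmaps, stacks them over time, and classifies the resulting volume with a small 3D-CNN. The network and all dimensions are illustrative assumptions, not the PoseConv3D architecture itself.

```python
# Minimal sketch (PyTorch) of the heatmap-volume idea: 2D keypoints are
# rendered as Gaussian pseudo-heatmaps per frame, stacked into a K x T x H x W
# volume, and classified with a small 3D-CNN. Illustrative, not the paper's model.
import torch
import torch.nn as nn

def keypoints_to_heatmaps(kpts, H=56, W=56, sigma=2.0):
    """kpts: (T, K, 2) pixel coordinates -> (K, T, H, W) Gaussian heatmaps."""
    T, K, _ = kpts.shape
    ys = torch.arange(H).view(1, 1, H, 1).float()
    xs = torch.arange(W).view(1, 1, 1, W).float()
    cx = kpts[..., 0].permute(1, 0).view(K, T, 1, 1)
    cy = kpts[..., 1].permute(1, 0).view(K, T, 1, 1)
    return torch.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

class TinyPose3DCNN(nn.Module):
    def __init__(self, num_joints=17, num_classes=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(num_joints, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Linear(64, num_classes)

    def forward(self, heatmap_volume):                 # (B, K, T, H, W)
        feat = self.backbone(heatmap_volume).flatten(1)
        return self.fc(feat)

# Usage: 16 frames of 17 joints for one person, batched.
kpts = torch.rand(16, 17, 2) * 56
volume = keypoints_to_heatmaps(kpts).unsqueeze(0)      # (1, 17, 16, 56, 56)
logits = TinyPose3DCNN()(volume)
```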
This paper proposes to build a flexible and efficient Actor Relation Graph (ARG) that simultaneously captures the appearance and position relations between actors, and reports extensive experiments on two standard group activity recognition datasets.
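A minimal sketch of the relation-graph idea, assuming a simple dot-product appearance similarity and a distance threshold for the position relation; both are illustrative choices, not the exact ARG formulation.

```python
# Minimal sketch (PyTorch) of an appearance-and-position relation graph over
# actors; similarity measure, threshold, and pooling are illustrative.
import torch
import torch.nn.functional as F

def actor_relation_graph(feats, centers, dist_thresh=0.3):
    """feats: (N, D) actor appearance features; centers: (N, 2) box centers
    normalized to [0, 1]. Returns relation-refined features (N, D)."""
    # Appearance relation: scaled dot-product similarity between actors.
    sim = feats @ feats.t() / feats.shape[1] ** 0.5        # (N, N)
    # Position relation: only connect actors that are spatially close.
    dist = torch.cdist(centers, centers)                   # (N, N)
    mask = (dist < dist_thresh).float()
    adj = F.softmax(sim.masked_fill(mask == 0, float('-inf')), dim=1)
    # One round of graph convolution: aggregate neighbours, then residual.
    return feats + adj @ feats

# Usage: 12 players on a court, 256-d appearance features each.
feats = torch.randn(12, 256)
centers = torch.rand(12, 2)
group_feat = actor_relation_graph(feats, centers).max(dim=0).values  # pooled
```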
The proposed Dynamic Inference Network (DIN), which is composed of a Dynamic Relation module and a Dynamic Walk module, achieves significant improvements over previous state-of-the-art methods on two popular datasets under the same setting, while incurring much lower computational overhead in the reasoning module.
An approach for classifying the activity performed by a group of people in a video sequence, based on LSTM (long short-term memory) models, is presented together with a two-stage deep temporal model for the group activity recognition problem.
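A minimal sketch of such a two-stage temporal model, assuming a person-level LSTM whose states are max-pooled into a group-level LSTM; the pooling choice and dimensions are illustrative, not the paper's exact design.

```python
# Minimal sketch (PyTorch) of a two-stage temporal model: a person-level LSTM
# models each individual's dynamics, pooled person states then feed a
# group-level LSTM that predicts the group activity.
import torch
import torch.nn as nn

class TwoStageTemporalModel(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, num_classes=8):
        super().__init__()
        self.person_lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.group_lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, person_feats):                         # (N_persons, T, feat_dim)
        person_states, _ = self.person_lstm(person_feats)    # (N, T, hidden)
        # Max-pool over persons at every time step to form the group sequence.
        group_seq = person_states.max(dim=0).values.unsqueeze(0)   # (1, T, hidden)
        group_states, _ = self.group_lstm(group_seq)
        return self.fc(group_states[:, -1])                  # classify from last step

# Usage: 12 tracked persons, 10 frames, 256-d per-frame person features.
logits = TwoStageTemporalModel()(torch.randn(12, 10, 256))
```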
A Hierarchical Relational Network is presented that computes relational representations of people, given graph structures describing potential interactions, and learns relational feature representations that effectively discriminate person and group activity classes.
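A minimal sketch of one relational layer over people given an interaction graph; the pairwise MLP and mean aggregation are illustrative assumptions rather than the paper's exact layer.

```python
# Minimal sketch (PyTorch) of a relational layer: each person's representation
# is refined from potential interaction partners given by a graph; stacking
# such layers and pooling gives a group-level representation.
import torch
import torch.nn as nn

class RelationalLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.pair_mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())

    def forward(self, feats, adj):
        """feats: (N, D); adj: (N, N) 0/1 graph of potential interactions."""
        N = feats.shape[0]
        pairs = torch.cat([feats.unsqueeze(1).expand(N, N, -1),
                           feats.unsqueeze(0).expand(N, N, -1)], dim=-1)
        messages = self.pair_mlp(pairs) * adj.unsqueeze(-1)   # mask non-edges
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        return messages.sum(dim=1) / deg                      # mean over neighbours

# Usage: two stacked relational layers, max-pooled into a group feature.
feats, adj = torch.randn(12, 256), (torch.rand(12, 12) > 0.5).float()
h = RelationalLayer(256, 128)(feats, adj)
group_feat = RelationalLayer(128, 64)(h, adj).max(dim=0).values
```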
This paper shows that, using only skeletal data, a state-of-the-art end-to-end system can be trained with only group activity labels at the sequence level, and that pseudo-labels can be computed from any pre-trained feature extractor with comparable final performance.
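One simple way to realize the pseudo-labeling step is to cluster sequence-level features from a pre-trained extractor; the k-means choice below is an assumption for illustration, not necessarily the paper's procedure.

```python
# Minimal sketch: cluster pre-extracted sequence features to obtain
# pseudo-labels for training. The clustering algorithm is an assumption.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_labels(sequence_feats, num_clusters=8, seed=0):
    """sequence_feats: (num_sequences, D) features from any pre-trained
    extractor, one vector per video sequence. Returns integer pseudo-labels."""
    km = KMeans(n_clusters=num_clusters, random_state=seed, n_init=10)
    return km.fit_predict(sequence_feats)

# Usage: 1000 sequences with 512-d pre-extracted features.
labels = pseudo_labels(np.random.randn(1000, 512))
```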
This paper proposes joint learning of individual action recognition (IAR) and people grouping to improve group activity recognition. By sharing information between these two similar tasks through joint learning, errors in the two tasks are mutually corrected, which also improves the accuracy of group activity recognition. The proposed method is designed so that any individual action recognition method can be used as a component, and its effectiveness is validated with various IAR methods. By ensembling the proposed method with existing group activity recognition methods, the best performance is achieved compared to similar state-of-the-art group activity recognition methods.
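A minimal sketch of the joint-learning idea, assuming a shared per-person representation feeding an individual-action head and a pairwise same-group head trained with a summed loss; all heads and dimensions are illustrative, not the paper's architecture.

```python
# Minimal sketch (PyTorch) of joint learning: a shared per-person representation
# feeds an individual-action head and a pairwise people-grouping head, and the
# two losses are optimized together so errors can correct each other.
import torch
import torch.nn as nn

class JointIARGrouping(nn.Module):
    def __init__(self, feat_dim=256, num_actions=9):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
        self.action_head = nn.Linear(128, num_actions)
        self.pair_head = nn.Linear(256, 1)   # same-group score for a person pair

    def forward(self, person_feats):                       # (N, feat_dim)
        h = self.shared(person_feats)                      # (N, 128)
        action_logits = self.action_head(h)                # (N, num_actions)
        N = h.shape[0]
        pairs = torch.cat([h.unsqueeze(1).expand(N, N, -1),
                           h.unsqueeze(0).expand(N, N, -1)], dim=-1)
        same_group_logits = self.pair_head(pairs).squeeze(-1)   # (N, N)
        return action_logits, same_group_logits

# Joint loss over both tasks.
model = JointIARGrouping()
feats = torch.randn(12, 256)
actions = torch.randint(0, 9, (12,))
same_group = (torch.rand(12, 12) > 0.5).float()
a_logits, g_logits = model(feats)
loss = nn.CrossEntropyLoss()(a_logits, actions) + \
       nn.BCEWithLogitsLoss()(g_logits, same_group)
```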