3260 papers • 126 benchmarks • 313 datasets
Hand gesture recognition (HGR) is a subarea of computer vision focused on classifying hand gestures: dynamic gestures from video and static gestures from single images. Static gestures are also commonly called hand poses. HGR can also be performed on point-cloud or hand-joint data.
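As a toy illustration of the static case, a hand pose can be represented as a vector of joint coordinates and classified with a simple nearest-centroid rule. The joint layout, class names, and coordinates below are all hypothetical, not taken from any benchmark in this list.

```python
import math

def flatten(joints):
    """Flatten a list of (x, y) hand-joint coordinates into one feature vector."""
    return [c for joint in joints for c in joint]

def nearest_centroid(pose, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    feats = flatten(pose)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))

# Made-up two-joint "centroids" for two pose classes
centroids = {
    "fist": flatten([(0.0, 0.0), (0.1, 0.1)]),
    "open_palm": flatten([(0.0, 0.0), (0.9, 0.9)]),
}
print(nearest_centroid([(0.0, 0.0), (0.8, 0.85)], centroids))  # -> open_palm
```

Real systems replace the centroid rule with a learned classifier, but the representation (a fixed-length vector per pose) is the same idea.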
These leaderboards are used to track progress in Hand Gesture Recognition.
Use these libraries to find Hand Gesture Recognition models and implementations.
No subtasks available.
A hierarchical structure is proposed that enables offline-working convolutional neural network (CNN) architectures to operate efficiently online via a sliding-window approach: a lightweight CNN detects gestures, and a deep CNN classifies the detected gestures.
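The two-level idea can be sketched as a cheap detector gating a heavier classifier over a sliding window of frames. The detector and classifier below are trivial stand-ins (a motion-energy threshold and a direction test), not the paper's CNNs; the frame values are made up.

```python
from collections import deque

def detect_gesture(window):
    """Lightweight stand-in detector: fire when average motion energy is high."""
    return sum(window) / len(window) > 0.5

def classify_gesture(window):
    """Heavier stand-in classifier, run only on windows the detector flags."""
    return "swipe" if window[-1] > window[0] else "tap"

def online_recognize(stream, window_size=4):
    """Process frames one by one, keeping only the last window_size of them."""
    window = deque(maxlen=window_size)
    results = []
    for frame in stream:
        window.append(frame)
        if len(window) == window_size and detect_gesture(window):
            results.append(classify_gesture(list(window)))
    return results

print(online_recognize([0.1, 0.1, 0.9, 0.9, 0.9]))  # -> ['swipe']
```

The point of the hierarchy is that the expensive classifier runs only on the small fraction of windows the detector activates on, which is what makes online operation affordable.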
A Double-feature Double-motion Network (DD-Net) for skeleton-based action recognition that runs at very high speed and achieves state-of-the-art performance on the SHREC and JHMDB datasets.
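One way to read the "double motion" part is temporal differencing of joint coordinates at two scales, so the network sees both fine and coarse motion alongside raw positions. This is a hedged sketch of that idea, not DD-Net's actual feature pipeline; the sequence values are illustrative.

```python
def motion_features(seq, step):
    """Per-frame differences of joint coordinates at a given temporal step."""
    return [
        [b - a for a, b in zip(seq[t], seq[t + step])]
        for t in range(len(seq) - step)
    ]

# Hypothetical (x, y) trajectory of one joint over four frames
seq = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5]]
slow = motion_features(seq, step=1)  # adjacent-frame motion
fast = motion_features(seq, step=2)  # motion across every other frame
```

Feeding `seq`, `slow`, and `fast` through separate small branches and concatenating them is one common way such multi-scale motion cues are combined.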
This paper introduces a new large-scale Word-Level American Sign Language (WLASL) video dataset, containing more than 2,000 words performed by over 100 signers, and proposes a novel pose-based temporal graph convolutional network (Pose-TGCN) that models spatial and temporal dependencies in human pose trajectories simultaneously, further boosting the performance of the pose-based method.
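The spatial half of a pose-based graph convolution amounts to aggregating each joint's features over its neighbours in the skeleton graph. Below is a minimal illustrative aggregation step (simple neighbourhood averaging with self-loops), not the paper's exact layer; the three-joint chain is a made-up layout.

```python
def graph_conv(features, edges):
    """Average each node's feature vector with its neighbours' (self-loop included)."""
    n = len(features)
    neigh = {i: {i} for i in range(n)}
    for a, b in edges:
        neigh[a].add(b)
        neigh[b].add(a)
    out = []
    for i in range(n):
        dim = len(features[i])
        agg = [sum(features[j][d] for j in neigh[i]) / len(neigh[i])
               for d in range(dim)]
        out.append(agg)
    return out

# Hypothetical 3-joint chain: wrist - knuckle - fingertip
feats = [[1.0], [0.0], [1.0]]
print(graph_conv(feats, edges=[(0, 1), (1, 2)]))  # node 1 averages all three joints
```

A temporal graph convolution applies the same kind of aggregation along the time axis, which is how spatial and temporal dependencies get modelled jointly.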
This work proposes a two-stage convolutional neural network architecture for robust recognition of hand gestures, called HGR-Net, where the first stage performs accurate semantic segmentation to determine hand regions, and the second stage identifies the gesture.
This paper proposes a new method of interacting with computing devices equipped with a consumer-grade camera: two colored markers worn on the fingertips generate the desired hand gestures, and the markers are detected and tracked using template matching with a Kalman filter.
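The tracking side of such a pipeline typically smooths noisy per-frame marker detections with a constant-velocity Kalman filter. The following is a minimal 1-D version with made-up noise parameters, a sketch of the technique rather than the paper's implementation (which tracks 2-D marker positions).

```python
def kalman_track(measurements, q=1e-3, r=0.25):
    """Constant-velocity Kalman filter over 1-D position measurements (dt = 1)."""
    x, v = measurements[0], 0.0            # state: position and velocity
    P = [[1.0, 0.0], [0.0, 1.0]]           # 2x2 state covariance
    track = []
    for z in measurements:
        # Predict: x <- x + v; P <- F P F^T + Q with F = [[1, 1], [0, 1]]
        x += v
        P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + q, P[0][1] + P[1][1]],
             [P[1][0] + P[1][1], P[1][1] + q]]
        # Update with position measurement z (H = [1, 0])
        s = P[0][0] + r                     # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s   # Kalman gain
        innov = z - x
        x += k0 * innov
        v += k1 * innov
        P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
             [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]
        track.append(x)
    return track
```

When template matching momentarily loses a marker, the filter's predict step still provides a position estimate, which is the practical reason for pairing the two.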
This work collects RGB-D video sequences comprising more than 100K frames of 45 daily hand-action categories, involving 26 different objects in several hand configurations, and sees clear benefits of using hand pose as a cue for action recognition compared to other data modalities.
Two new deep models, termed F-BLSTM and F-BGRU, are proposed, which effectively classify gestures by analyzing the acceleration and angular-velocity data of human gestures.
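Before such inertial sequences reach a recurrent model, they are usually segmented into fixed-length overlapping windows. The helper below shows only that windowing step; the 6-D sample layout (three accelerometer plus three gyroscope axes) and the values are illustrative assumptions, not the paper's preprocessing.

```python
def sliding_windows(samples, size, stride):
    """Cut a sequence of sensor samples into overlapping fixed-length windows."""
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, stride)]

# Hypothetical 6-D samples: (ax, ay, az, gx, gy, gz) per timestep
samples = [[0.1 * t] * 6 for t in range(10)]
windows = sliding_windows(samples, size=4, stride=2)  # 4 windows of 4 samples
```

Each window would then be fed to the bidirectional recurrent model as one classification instance.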
This paper proposes a data level fusion strategy, Motion Fused Frames (MFFs), designed to fuse motion information into static images as better representatives of spatio-temporal states of an action.
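Data-level fusion here means packing motion information into the input tensor itself, e.g. appending optical-flow channels to a static RGB frame. This sketch shows that channel concatenation on nested lists; the pixel and flow values are made up, and real implementations would use arrays rather than lists.

```python
def fuse_motion(rgb_frame, flow_frames):
    """Concatenate (dx, dy) channels from each flow frame onto every RGB pixel."""
    fused = []
    for y, row in enumerate(rgb_frame):
        fused_row = []
        for x, pixel in enumerate(row):
            channels = list(pixel)            # start with (r, g, b)
            for flow in flow_frames:          # each flow frame adds (dx, dy)
                channels.extend(flow[y][x])
            fused_row.append(channels)
        fused.append(fused_row)
    return fused

rgb = [[(10, 20, 30)]]                   # 1x1 RGB image
flows = [[[(1, -1)]], [[(2, 0)]]]        # two 1x1 flow fields
print(fuse_motion(rgb, flows))           # [[[10, 20, 30, 1, -1, 2, 0]]]
```

A standard 2-D CNN can then consume the fused frame directly, with its first convolution sized for the enlarged channel count.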