3260 papers • 126 benchmarks • 313 datasets
Action unit detection is the task of detecting action units from a video — for example, types of facial action units (lip tightening, cheek raising) from a video of a face. (Image credit: AU R-CNN)
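AU detection is conventionally framed as multi-label classification: each frame receives an independent probability for every action unit, since multiple AUs can be active at once. The sketch below illustrates that framing with placeholder logits and a small set of FACS-numbered AUs; it is not any particular model's output head.

```python
import numpy as np

# Illustrative subset of FACS action units; a real model predicts
# a logit per AU, and AUs are detected independently of one another.
AU_NAMES = ["AU1 (inner brow raiser)", "AU6 (cheek raiser)", "AU23 (lip tightener)"]

def detect_aus(logits, threshold=0.5):
    """Turn per-AU logits into the list of active AU names.

    Sigmoid (not softmax) is used because AU detection is
    multi-label: several AUs can co-occur in the same frame.
    """
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [name for name, p in zip(AU_NAMES, probs) if p >= threshold]

# Placeholder logits: AU1 and AU23 clear the 0.5 threshold, AU6 does not.
active = detect_aus([2.0, -1.5, 0.3])
```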
These leaderboards are used to track progress in Action Unit Detection.
Use these libraries to find Action Unit Detection models and implementations.
No subtasks available.
This work trains a unified model to perform three tasks: facial action unit detection, expression classification, and valence-arousal estimation, and proposes an algorithm for the multitask model to learn from missing (incomplete) labels.
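A common way to let a multitask model learn from incomplete labels is to mask the loss so that unlabeled targets contribute nothing to the gradient. The snippet below is a minimal sketch of that idea for the binary (AU-style) case; it is a generic masked cross-entropy, not the specific algorithm proposed in the paper.

```python
import numpy as np

def masked_bce(probs, labels, mask):
    """Binary cross-entropy averaged only over labeled entries.

    Entries with mask == 0 (missing annotations) are excluded from
    both the sum and the normalizer, so a batch with partial labels
    still yields a well-defined loss for the labeled targets.
    """
    probs = np.clip(probs, 1e-7, 1 - 1e-7)
    losses = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    return (losses * mask).sum() / max(mask.sum(), 1)
```

With an all-zero mask the loss is simply 0, so a task with no labels in the batch neither helps nor hurts the shared backbone.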
This work proposes a novel convolutional neural network approach to address the fine-grained recognition problem of multi-view dynamic facial action unit detection by formulating the task of predicting the presence or absence of a specific action unit in a still image of a human face as holistic classification.
A novel end-to-end deep learning framework for joint AU detection and face alignment, which has not been explored before: multi-scale shared features are learned first, and high-level face-alignment features are fed into AU detection.
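The joint design above can be sketched as a shared backbone with two heads, where the alignment head's output is concatenated into the AU head's input. All dimensions and weight matrices below are hypothetical placeholders (random linear maps standing in for learned convolutional features), chosen only to show the wiring.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 32-dim input, 16-dim shared features,
# 5 landmarks (x, y each), 12 action units.
FEAT, LANDMARKS, NUM_AUS = 16, 5, 12
W_shared = rng.normal(size=(32, FEAT))
W_align = rng.normal(size=(FEAT, 2 * LANDMARKS))
W_au = rng.normal(size=(FEAT + 2 * LANDMARKS, NUM_AUS))

def forward(image_vec):
    shared = np.tanh(image_vec @ W_shared)      # shared features for both tasks
    landmarks = shared @ W_align                # face-alignment head
    # Alignment output is fed into the AU head alongside the shared features.
    au_logits = np.concatenate([shared, landmarks]) @ W_au
    return landmarks, au_logits

lm, logits = forward(rng.normal(size=32))
```

Because both heads backpropagate through `W_shared`, the alignment task acts as an auxiliary signal that shapes the features the AU head consumes.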
This paper proposes an end-to-end unconstrained facial AU detection framework based on domain adaptation, which transfers accurate AU labels from a constrained source domain to an unconstrained target domain by exploiting labels of AU-related facial landmarks.
The goal is to describe facial expressions in a continuous fashion using a compact embedding space that mimics human visual preferences, and it is shown that the embedding learned using the proposed dataset performs better than several other embeddings learned using existing emotion or action unit datasets.
In this paper, we consider the problem of real-time video-based facial emotion analytics, namely, facial expression recognition, prediction of valence and arousal, and detection of action units. We propose a novel frame-level emotion recognition algorithm that extracts facial features with a single EfficientNet model pre-trained on AffectNet. The predictions for sequential frames are smoothed using mean or median filters. It is demonstrated that our approach may be implemented even for video analytics on mobile devices. Experimental results for the large-scale AffWild2 database from the third Affective Behavior Analysis in-the-wild Competition demonstrate that our simple model is significantly better than the VggFace baseline. In particular, our method achieves 0.1-0.5 higher performance measures on the test sets of the uni-task Expression Classification, Valence-Arousal Estimation, Action Unit Detection, and Multi-Task Learning challenges. Our team took 3rd place in the multi-task learning challenge and 4th place in the Valence-Arousal and Expression challenges. Due to its simplicity, the proposed approach may be considered a new baseline for all four sub-challenges.
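Temporal smoothing of per-frame outputs, as used above, can be sketched with a simple median filter over the time axis. The function below is a generic illustration (window size and edge padding are assumptions, not the paper's exact settings): a short spike in one frame is suppressed while sustained predictions pass through.

```python
import numpy as np

def smooth_predictions(frame_scores, window=3):
    """Median-filter per-frame scores along the time axis.

    frame_scores: (num_frames, num_classes) array of raw per-frame
    model outputs. Edges are padded by repeating the border frames
    so the output has the same number of frames as the input.
    """
    scores = np.asarray(frame_scores, dtype=float)
    half = window // 2
    padded = np.pad(scores, ((half, half), (0, 0)), mode="edge")
    return np.stack([
        np.median(padded[i:i + window], axis=0)
        for i in range(scores.shape[0])
    ])

# A one-frame spike (frame 2) is removed by the 3-frame median.
smoothed = smooth_predictions([[0.0], [0.0], [1.0], [0.0], [0.0]])
```

A mean filter would instead spread the spike across neighboring frames; the median is the more aggressive choice against single-frame flicker.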
In this paper, we aim to learn discriminative representation for facial action unit (AU) detection from large amount of videos without manual annotations. Inspired by the fact that facial actions are the movements of facial muscles, we depict the movements as the transformation between two face images in different frames and use it as the self-supervisory signal to learn the representations. However, under the uncontrolled condition, the transformation is caused by both facial actions and head motions. To remove the influence by head motions, we propose a Twin-Cycle Autoencoder (TCAE) that can disentangle the facial action related movements and the head motion related ones. Specifically, TCAE is trained to respectively change the facial actions and head poses of the source face to those of the target face. Our experiments validate TCAE's capability of decoupling the movements. Experimental results also demonstrate that the learned representation is discriminative for AU detection, where TCAE outperforms or is comparable with the state-of-the-art self-supervised learning methods and supervised AU detection methods.
A novel self-adjusting AU-correlation learning (SACL) method for AU detection that requires less computation, outperforms state-of-the-art methods on widely used AU detection benchmark datasets, and obtains a more robust feature representation for the final AU detection.
Adding a benchmark result helps the community track progress.