Given the transcript of a conversation along with speaker information for each constituent utterance, the ERC task aims to identify the emotion of each utterance from a set of pre-defined emotions. Formally, given an input sequence of N utterances [(u1, p1), (u2, p2), . . . , (uN, pN)], where each utterance ui = [ui,1, ui,2, . . . , ui,T] consists of T words ui,j and is spoken by party pi, the task is to predict the emotion label ei of each utterance ui.
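As a concrete illustration of this input/output structure, the sketch below represents a conversation as a list of (utterance, speaker) pairs and predicts one label per utterance. The utterances, label set, and keyword rules are hypothetical stand-ins for a real ERC model, chosen only to make the format runnable:

```python
# Sketch of the ERC task's input/output structure.
# A conversation is a sequence of (utterance, speaker) pairs; the
# task maps each utterance u_i to one of several predefined emotions.
# The keyword rules below are purely illustrative, not a real model.

EMOTIONS = ["joy", "anger", "sadness", "neutral"]

def classify_utterance(utterance: str) -> str:
    """Toy stand-in for an ERC model: keyword lookup per utterance."""
    text = utterance.lower()
    if "love" in text or "great" in text:
        return "joy"
    if "hate" in text or "annoying" in text:
        return "anger"
    if "miss" in text or "sorry" in text:
        return "sadness"
    return "neutral"

def erc(conversation):
    """Predict an emotion label e_i for every utterance u_i."""
    return [(speaker, classify_utterance(u)) for u, speaker in conversation]

conversation = [
    ("I love this show!", "p1"),
    ("It is so annoying when they cancel it.", "p2"),
    ("I will miss it.", "p1"),
]
print(erc(conversation))
```

Real models replace the keyword lookup with a classifier that also conditions on the surrounding utterances and speaker identities, which is what the papers listed below address.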
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification, and a variant is proposed that allows the use of both task-specific and static word vectors.
A simple and efficient baseline for text classification is explored, showing that fastText is often on par with deep learning classifiers in accuracy while being many orders of magnitude faster for training and evaluation.
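The core idea behind such a linear bag-of-words classifier can be sketched in a few lines: average the word vectors of a sentence and score each class with a linear layer. The word vectors, class weights, and labels below are tiny hand-made toys for illustration, not fastText's actual parameters:

```python
# Minimal sketch of the fastText-style approach: average the word
# vectors of a sentence, then score each class linearly.
# All vectors and weights here are illustrative toy values.

WORD_VECS = {
    "good": [1.0, 0.0], "great": [1.0, 0.2],
    "bad": [-1.0, 0.0], "awful": [-1.0, -0.2],
}

CLASS_WEIGHTS = {"positive": [1.0, 0.1], "negative": [-1.0, -0.1]}

def sentence_vector(tokens):
    """Average the word vectors; unknown words map to zero vectors."""
    vecs = [WORD_VECS.get(t, [0.0, 0.0]) for t in tokens]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def classify(sentence):
    """Score each class as a dot product with the sentence vector."""
    v = sentence_vector(sentence.lower().split())
    scores = {c: sum(wi * vi for wi, vi in zip(w, v))
              for c, w in CLASS_WEIGHTS.items()}
    return max(scores, key=scores.get)

print(classify("great good movie"))
```

Because both steps are linear, training and inference stay extremely fast, which is the efficiency the paper highlights.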
The Multimodal EmotionLines Dataset (MELD), an extension and enhancement of EmotionLines, contains about 13,000 utterances from 1,433 dialogues from the TV series Friends and shows the importance of contextual and multimodal information for emotion recognition in conversations.
An LSTM-based model is proposed that enables utterances to capture contextual information from their surroundings in the same video, aiding the classification process and showing a 5-10% performance improvement over the state of the art along with strong generalizability.
Through its graph network, DialogueGCN addresses the context propagation issues present in current RNN-based methods; the authors empirically show that the method alleviates these issues while outperforming the current state of the art on a number of benchmark emotion classification datasets.
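A minimal sketch of the kind of utterance graph such a method builds is shown below: each utterance is a node, connected to utterances within a context window, with edges typed by relative speaker identity and temporal direction. The window size and edge-type encoding here are illustrative choices, not DialogueGCN's exact ones:

```python
# Sketch of building a conversation graph for GCN-style context
# propagation. Each utterance index is a node; edges connect
# utterances within a context window and carry a relation type
# based on speaker identity and direction. Window size is assumed.

def build_graph(speakers, window=2):
    """Return typed edges (i, j, relation) over utterance indices."""
    edges = []
    n = len(speakers)
    for i in range(n):
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if i == j:
                continue
            same = "same" if speakers[i] == speakers[j] else "other"
            direction = "past" if j < i else "future"
            edges.append((i, j, f"{same}-{direction}"))
    return edges

edges = build_graph(["p1", "p2", "p1"], window=1)
print(edges)
```

A relational graph convolution would then aggregate neighbor features per relation type, letting distant utterances influence each other without the sequential bottleneck of an RNN.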
A new recurrent-neural-network-based method is proposed that keeps track of individual party states throughout the conversation and uses this information for emotion classification, outperforming the state of the art by a significant margin on two different datasets.
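The party-state bookkeeping at the heart of that approach can be sketched as follows; the "state" here is just a running record of past utterances, standing in for the recurrent hidden-state update the actual model learns:

```python
# Sketch of per-party state tracking: each speaker keeps a state
# that is updated whenever they speak, and the state visible at
# each turn can inform the classification of that utterance.
# Using a plain history list is an illustrative simplification.

def track_party_states(conversation):
    """Return the party state visible when each utterance is made."""
    states = {}   # speaker -> that party's history so far
    visible = []
    for utterance, speaker in conversation:
        history = states.setdefault(speaker, [])
        visible.append((speaker, list(history)))  # state before update
        history.append(utterance)                 # update speaker state
    return visible

conv = [("hi", "p1"), ("hello", "p2"), ("how are you?", "p1")]
print(track_party_states(conv))
```

Note that p1's second turn sees p1's earlier utterance but not p2's internal state; separating states per party is exactly what distinguishes this family of models from a single conversation-level RNN.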
A deep neural framework, termed Conversational Memory Network (CMN), is proposed, which leverages contextual information from the conversation history to recognize utterance-level emotions in dyadic conversational videos.
A recurrent convolutional neural network is introduced for text classification without human-designed features; it captures contextual information as far as possible when learning word representations, which may introduce considerably less noise than traditional window-based neural networks.
This paper uses a multi-task learning framework based on recurrent neural networks to jointly learn across multiple related tasks, proposing three different mechanisms for sharing information that model text with task-specific and shared layers.