ViHSD is a human-annotated dataset for automatically detecting hate speech on social networks. It contains over 30,000 comments, each assigned one of three labels: CLEAN, OFFENSIVE, or HATE.
A Bidirectional Long Short-Term Memory (Bi-LSTM) network is used to build a model that predicts one of three labels for social media text: Clean, Offensive, or Hate.
A deep learning method based on a Bi-GRU-LSTM-CNN classifier is applied to the VLSP Shared Task 2019: Hate Speech Detection on Social Networks, whose corpus contains 20,345 human-labeled comments/posts for training and 5,086 for public testing.
An efficient pre-processing technique for cleaning comments collected from Vietnamese social media is proposed, together with a novel hate speech detection (HSD) model that combines a pre-trained PhoBERT model with a Text-CNN model for Vietnamese-language tasks.
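As a rough illustration of what cleaning social media comments can involve, the sketch below applies a few common normalization steps. The summary above does not list the paper's actual steps, so everything here is an assumption: the function name `clean_comment` and the choice of rules (Unicode composition, lowercasing, URL and mention removal, collapsing elongated characters) are hypothetical.

```python
import re
import unicodedata

# All rules below are illustrative assumptions, not the pipeline from the paper.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Three or more repeats of the same non-digit, non-space character.
# Digits are excluded so that numbers such as "1000" are left alone.
REPEAT_RE = re.compile(r"([^\d\s])\1{2,}")
SPACE_RE = re.compile(r"\s+")


def clean_comment(text: str) -> str:
    """Apply a few generic cleaning steps to one social media comment."""
    # Compose base letters and combining marks so Vietnamese diacritics
    # have a single consistent representation.
    text = unicodedata.normalize("NFC", text)
    text = text.lower()
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @mentions
    text = REPEAT_RE.sub(r"\1", text)  # "vuiiii" -> "vui", "!!!" -> "!"
    return SPACE_RE.sub(" ", text).strip()


print(clean_comment("Vuiiii quá!!!  xem https://example.com @abc nha"))
```

The cleaned string would then be passed to a tokenizer before reaching the classifier; which rules help in practice depends on the data and should be checked empirically.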
This study presents ViCGCN, a novel approach that combines a contextualized language model (PhoBERT) with a graph-based method (Graph Convolutional Networks). The two components are trained jointly, pairing contextualized embeddings with the GCN's ability to capture syntactic and semantic dependencies, in order to address the drawbacks of imbalanced and noisy data on social media.
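To make the graph-based half of this idea concrete, here is a minimal sketch of one standard graph convolution step, H' = ReLU(Â H W), where Â is the adjacency matrix with self-loops added and symmetrically normalized by node degree. This is the generic GCN propagation rule, not ViCGCN itself: how ViCGCN builds its graph and feeds PhoBERT embeddings into it is not described in the summary above, and the helper names are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(adj, features, weights):
    """One graph convolution: average over neighbors, project, apply ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    projected = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in projected]


# Two connected nodes with one feature each (toy values).
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
print(gcn_layer(adj, features, weights))  # [[2.0], [2.0]]
```

In the toy example each node ends up with the average of its own and its neighbor's feature, which is how a graph convolution mixes information across connected nodes.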
ViSoBERT, the first monolingual pre-trained language model for Vietnamese social media texts, is presented. It is pre-trained on a large-scale corpus of high-quality and diverse Vietnamese social media texts using the XLM-R architecture, and it surpasses the previous state-of-the-art models on multiple Vietnamese social media tasks.
This work introduces Vietnamese Lexical Normalization (ViLexNorm), the first corpus developed for the Vietnamese lexical normalization task. It comprises over 10,000 sentence pairs meticulously annotated by human annotators, sourced from public comments on Vietnam’s most popular social media platforms.
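Lexical normalization maps non-standard spellings to their standard forms, so each corpus entry can be thought of as a (raw sentence, normalized sentence) pair. The sketch below shows the simplest possible baseline, a token-level lookup table. The three table entries are common Vietnamese chat abbreviations chosen for illustration; they are not taken from ViLexNorm, and `normalize_sentence` is a hypothetical helper, not a method from the paper.

```python
# Illustrative abbreviation table; entries are assumptions, not ViLexNorm data.
NORMALIZATION_TABLE = {
    "ko": "không",
    "dc": "được",
    "j": "gì",
}


def normalize_sentence(sentence, table=NORMALIZATION_TABLE):
    """Replace each whitespace-separated token found in the table."""
    return " ".join(table.get(token.lower(), token)
                    for token in sentence.split())


raw = "mình ko biết j"
pair = (raw, normalize_sentence(raw))
print(pair)  # ('mình ko biết j', 'mình không biết gì')
```

A lookup table cannot resolve abbreviations whose meaning depends on context, which is one reason a human-annotated sentence-pair corpus is useful for training and evaluating stronger models.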