Adding diacritics to undiacritized Arabic text in order to disambiguate words.
The experiments show that the neural Shakkala system significantly outperforms traditional rule-based approaches and other closed-source tools, with a Diacritic Error Rate (DER) of 2.88% compared with 13.78%, the best DER among the non-neural approaches, which is obtained by the Mishkal tool.
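The Diacritic Error Rate quoted above can be illustrated with a small sketch. It assumes one common definition: the fraction of Arabic letters whose predicted diacritics differ from the reference. Published evaluations differ in detail (for example, whether case endings or undiacritized letters are counted), so the function names and the counting rule here are illustrative assumptions, not the protocol used by any of the listed papers.

```python
# Arabic diacritic marks (tashkeel): U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def split_diacritics(text):
    """Return a list of (base_char, marks) pairs for the given text."""
    pairs = []
    for ch in text:
        if ch in DIACRITICS:
            if pairs:  # attach the mark to the preceding base character
                base, marks = pairs[-1]
                pairs[-1] = (base, marks + ch)
            # a mark with no preceding base character is ignored
        else:
            pairs.append((ch, ""))
    return pairs


def diacritic_error_rate(reference, prediction):
    """Fraction of Arabic letters whose diacritics differ (assumed definition)."""
    ref = split_diacritics(reference)
    hyp = split_diacritics(prediction)
    if [b for b, _ in ref] != [b for b, _ in hyp]:
        raise ValueError("reference and prediction differ in base characters")
    # Score only characters in U+0621..U+064A (Arabic letters, plus tatweel).
    scored = [
        (r, h)
        for (b, r), (_, h) in zip(ref, hyp)
        if "\u0621" <= b <= "\u064A"
    ]
    if not scored:
        return 0.0
    # Sort marks so that e.g. shadda+fatha equals fatha+shadda.
    errors = sum(sorted(r) != sorted(h) for r, h in scored)
    return errors / len(scored)


# Three letters (kaf, ta, ba); the prediction gets the final vowel wrong.
reference = "\u0643\u064E\u062A\u064E\u0628\u064E"
prediction = "\u0643\u064E\u062A\u064E\u0628\u064F"
der = diacritic_error_rate(reference, prediction)
```

In this example one of the three letters carries a wrong mark, so `der` is one third.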
The proposed Translation over Diacritization (ToD) approach shows that Arabic diacritics can be used to improve models for NLP tasks such as Machine Translation (MT).
An approach to tackle the problem of the automatic restoration of Arabic diacritics that includes three components stacked in a pipeline: a deep learning model which is a multi-layer recurrent neural network with LSTM and Dense layers, a character-level rule-based corrector which applies deterministic operations to prevent some errors, and a word-level statistical corrector that uses the context and the distance information to fix some diacritical issues.
The design of CAMeL Tools and the functionalities it provides are described, including utilities for pre-processing, morphological modeling, Dialect Identification, Named Entity Recognition and Sentiment Analysis.
A novel architecture for labelling character sequences that achieves state-of-the-art results on the Tashkeela Arabic diacritization benchmark using a two-level recurrence hierarchy that operates on the word and character levels separately, enabling faster training and inference than comparable traditional models.
Three deep learning models for recovering Arabic text diacritics are proposed, based on work on a deep learning text-to-speech synthesis system; they achieve state-of-the-art performance in both word error rate and diacritic error rate.
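As a rough illustration of the word error rate mentioned above, the sketch below counts a word as wrong if any of its diacritics differs from the reference. This is an assumed, simplified definition; the function names are hypothetical and the listed papers may apply different counting rules.

```python
# Arabic diacritic marks (tashkeel): U+064B (fathatan) through U+0652 (sukun).
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)}


def strip_diacritics(word):
    """Remove all diacritic marks, leaving only base characters."""
    return "".join(ch for ch in word if ch not in DIACRITICS)


def normalize(word):
    """Sort each run of consecutive marks so that mark order does not matter."""
    out, run = [], []
    for ch in word:
        if ch in DIACRITICS:
            run.append(ch)
        else:
            out.extend(sorted(run))
            run = []
            out.append(ch)
    out.extend(sorted(run))
    return "".join(out)


def word_error_rate(reference, prediction):
    """Fraction of words with at least one diacritic error (assumed definition)."""
    ref_words = reference.split()
    hyp_words = prediction.split()
    if [strip_diacritics(w) for w in ref_words] != [strip_diacritics(w) for w in hyp_words]:
        raise ValueError("reference and prediction differ in undiacritized words")
    if not ref_words:
        return 0.0
    errors = sum(normalize(r) != normalize(h) for r, h in zip(ref_words, hyp_words))
    return errors / len(ref_words)


# Two words; the prediction gets the final vowel of the first word wrong.
reference = "\u0643\u064E\u062A\u064E\u0628\u064E \u0627\u0644\u0648\u064E\u0644\u064E\u062F\u064F"
prediction = "\u0643\u064E\u062A\u064E\u0628\u064F \u0627\u0644\u0648\u064E\u0644\u064E\u062F\u064F"
wer = word_error_rate(reference, prediction)
```

Here one of the two words contains an error, so `wer` is 0.5, even though only one of the eight letters is mis-diacritized; word-level scores are therefore typically higher than character-level ones.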