Grammatical Error Detection (GED) is the task of detecting different kinds of errors in text, such as spelling, punctuation, grammatical, and word-choice errors. GED is a key component of grammatical error correction (GEC) systems.
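Concretely, GED is commonly framed as binary token-level sequence labeling. A minimal sketch, assuming a simple "c"/"i" (correct/incorrect) tagging scheme; the function name and the example sentence are illustrative, not from any specific dataset:

```python
def label_tokens(tokens, error_indices):
    """Tag each token 'i' (incorrect) if its index is in error_indices, else 'c'."""
    return [(tok, "i" if idx in error_indices else "c")
            for idx, tok in enumerate(tokens)]

# "have" (agreement error) and "apple" (number error) are the erroneous tokens.
sentence = "He have three apple".split()
labels = label_tokens(sentence, {1, 3})
# labels: [("He", "c"), ("have", "i"), ("three", "c"), ("apple", "i")]
```

A GED model learns to predict these per-token tags from the erroneous sentence alone, without being asked to produce the correction itself.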
These leaderboards are used to track progress in Grammatical Error Detection.
Use these libraries to find Grammatical Error Detection models and implementations.
A sequence labeling framework with a secondary training objective that learns to predict the surrounding words for every word in the dataset, incentivising the system to learn general-purpose patterns of semantic and syntactic composition that improve accuracy on different sequence labeling tasks.
This paper investigates how objectives at different granularities can be used to learn better language representations and proposes an architecture for jointly learning to label sentences and tokens and achieves substantial improvements for both sentence classification and sequence labeling.
Grammatical Error Correction (GEC) has recently been broadly applied in automatic correction and proofreading systems. However, Chinese GEC remains immature due to limited high-quality data from native speakers in terms of category and scale. In this paper, we present FCGEC, a fine-grained corpus to detect, identify, and correct grammatical errors. FCGEC is a human-annotated corpus with multiple references, consisting of 41,340 sentences collected mainly from multiple-choice questions in public school Chinese examinations. Furthermore, we propose a Switch-Tagger-Generator (STG) baseline model to correct grammatical errors in low-resource settings. Experimental results show that STG outperforms other GEC benchmark models on FCGEC. However, a significant gap remains between benchmark models and humans, which we hope future models will bridge.
The T5 model was primarily designed for translation and is not tailored to this task, so extensive post-processing was necessary to adapt it to error detection; with that adaptation it achieves a low Levenshtein distance.
A novel evaluation method for grammatical error correction that addresses problems with previous approaches and scores systems in terms of improvement on the original text by evaluating corrections at the token level using a globally optimal alignment between the source, a system hypothesis, and a reference.
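The idea of scoring corrections through a token-level alignment can be sketched with the standard library. This is a simplification, not the paper's method: `difflib`'s longest-matching-block alignment stands in for the globally optimal three-way alignment described above, and the edit representation and F-beta scoring shown here are illustrative assumptions:

```python
from difflib import SequenceMatcher

def extract_edits(source, hypothesis):
    """Extract (start, end, replacement) edits from a token-level alignment
    between a source sentence and a system hypothesis."""
    sm = SequenceMatcher(a=source, b=hypothesis, autojunk=False)
    return [(i1, i2, tuple(hypothesis[j1:j2]))
            for tag, i1, i2, j1, j2 in sm.get_opcodes()
            if tag != "equal"]

def f_beta(hyp_edits, ref_edits, beta=0.5):
    """Score hypothesis edits against reference edits with F_beta.
    beta=0.5 weights precision over recall, as is standard in GEC evaluation."""
    tp = len(set(hyp_edits) & set(ref_edits))
    p = tp / len(hyp_edits) if hyp_edits else 1.0
    r = tp / len(ref_edits) if ref_edits else 1.0
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)

src = "He have three apple".split()
hyp = "He has three apple".split()
ref_edits = [(1, 2, ("has",)), (3, 4, ("apples",))]
hyp_edits = extract_edits(src, hyp)  # finds only the "have" -> "has" edit
score = f_beta(hyp_edits, ref_edits)  # precision 1.0, recall 0.5
```

Scoring edits rather than whole sentences rewards a system for each individual improvement over the original text, even when it does not reproduce the reference exactly.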
A bidirectional long-short term memory model initialized by word embeddings achieved the state-of-the-art accuracy by a large margin in an English grammatical error detection task on the First Certificate in English dataset.
This work investigates cheaply constructing synthetic samples, given a small corpus of human-annotated data, using an off-the-shelf attentive sequence-to-sequence model and a straightforward post-processing procedure, and yields error-filled artificial data that helps a vanilla bidirectional LSTM outperform the previous state of the art at grammatical error detection.
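The general recipe of injecting artificial errors into clean text can be illustrated in a few lines. This hypothetical sketch uses simple random token drops, duplications, and swaps rather than the learned seq2seq generation the paper describes; the function and parameters are assumptions for illustration:

```python
import random

def corrupt(tokens, rng, p=0.3):
    """Inject artificial errors into a clean sentence: with probability p per
    token, either drop it, duplicate it, or swap it with the previous token.
    The corrupted positions can serve as token-level GED labels."""
    out = []
    for tok in tokens:
        r = rng.random()
        if r < p / 3:
            continue                     # deletion error
        elif r < 2 * p / 3:
            out.extend([tok, tok])       # duplication error
        elif r < p and out:
            out[-1], tok = tok, out[-1]  # swap with the previous token
            out.append(tok)
        else:
            out.append(tok)
    return out

rng = random.Random(1)
clean = "the cat sat on the mat".split()
noisy = corrupt(clean, rng)  # an error-filled variant of the clean sentence
```

Because the corruptions are applied to clean text, every noisy sentence comes with free gold labels, which is what makes synthetic data generation so cheap compared to human annotation.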
Estimated human attention derived from eye-tracking corpora is used to regularize attention functions in recurrent neural networks and shows substantial improvements across a range of tasks, including sentiment analysis, grammatical error detection, and detection of abusive language.
A systematic comparison of ELMo, BERT, and Flair embeddings on a range of public GED datasets is performed, and an approach to effectively integrate such representations into current methods is proposed, achieving a new state of the art on GED.