DeepGrammar: A Neural Network Approach for Grammatical Error Detection and Correction
DeepGrammar is a bidirectional GRU‑based neural model that detects subject‑verb agreement errors by encoding surrounding context into fixed‑length vectors, outperforming rule‑based, classifier, and NMT approaches on the CoNLL‑2014 benchmark and achieving state‑of‑the‑art results across multiple error types.
Overview
Language learners frequently make grammatical errors in speech and writing, so automatic grammatical error correction can help them considerably. Research on the problem goes back decades and was the focus of the CoNLL‑2013 and CoNLL‑2014 shared tasks. Existing methods fall into three categories: rule‑based, classifier‑based (e.g., maximum‑entropy classifiers for article errors), and translation‑model‑based (including neural machine translation, NMT).
Model Overview
FluentSpeak, a language‑education technology company, developed DeepGrammar, a neural‑network‑based grammatical error detection system. This article uses subject‑verb agreement to illustrate how it works. In English, the correct form of a verb depends heavily on its surrounding context. DeepGrammar encodes that context with a bidirectional GRU into a fixed‑length vector and uses the vector to predict the correct verb form. If the predicted form differs from the form the writer actually used, the verb is flagged as an error.
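The detection step described above can be sketched as a simple comparison between the form the writer used and the form the model considers most likely. This is a minimal illustration, not DeepGrammar's actual code: the function name `check_verb`, the Penn Treebank style labels (`VBZ` for third-person singular present, `VBP` for non-third-person present), and the probabilities are all assumptions made for the example.

```python
from typing import Dict, Optional, Tuple


def check_verb(observed_form: str,
               predicted_probs: Dict[str, float]) -> Tuple[bool, Optional[str]]:
    """Compare the writer's verb form with the model's most likely form.

    Returns (is_error, suggestion). The suggestion is None when the
    observed form already matches the prediction.
    """
    best_form = max(predicted_probs, key=predicted_probs.get)
    if best_form == observed_form:
        return False, None
    return True, best_form


# Hypothetical model output for the verb slot in
# "The popularity of social media sites ___ made ..."
probs = {"VBZ": 0.91, "VBP": 0.07, "VBD": 0.02}

result_wrong = check_verb("VBP", probs)  # writer used "have"
result_right = check_verb("VBZ", probs)  # writer used "has"
print(result_wrong, result_right)
```

A production system would likely also require the predicted form to win by some margin before flagging an error, to limit false alarms; that threshold is left out here because the article does not describe one.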
DeepGrammar’s architecture consists of two GRU networks processing the left and right contexts; their outputs are concatenated into a context vector, which is fed to a multilayer perceptron (MLP) and a softmax layer to predict verb morphology. The training loss is cross‑entropy.
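To make the data flow concrete, here is a rough forward pass in plain Python, written under several assumptions that the article does not confirm: the right‑context GRU reads its words from right to left so that both encoders end next to the verb, the MLP has a single tanh hidden layer, and the layer sizes and random weights are toy values. The `GRU` class and helper names are hypothetical; a real system would use a deep‑learning framework and trained parameters.

```python
import math
import random

random.seed(0)


def matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def add(*vectors):
    return [sum(xs) for xs in zip(*vectors)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]


def softmax(v):
    top = max(v)
    exps = [math.exp(x - top) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


class GRU:
    def __init__(self, input_dim, hidden_dim):
        self.hidden_dim = hidden_dim
        # One (W, U, b) triple per gate: update z, reset r, candidate n.
        self.params = {g: (matrix(hidden_dim, input_dim),
                           matrix(hidden_dim, hidden_dim),
                           [0.0] * hidden_dim) for g in "zrn"}

    def step(self, x, h):
        Wz, Uz, bz = self.params["z"]
        Wr, Ur, br = self.params["r"]
        Wn, Un, bn = self.params["n"]
        z = sigmoid(add(matvec(Wz, x), matvec(Uz, h), bz))
        r = sigmoid(add(matvec(Wr, x), matvec(Ur, h), br))
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a) for a in add(matvec(Wn, x), matvec(Un, reset_h), bn)]
        # Interpolate between the old state and the candidate state.
        return [(1 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]

    def encode(self, vectors):
        h = [0.0] * self.hidden_dim
        for x in vectors:
            h = self.step(x, h)
        return h  # final state is the fixed-length summary


EMBED, HIDDEN, MLP_DIM, NUM_FORMS = 8, 6, 5, 4  # toy sizes (assumed)
left_gru, right_gru = GRU(EMBED, HIDDEN), GRU(EMBED, HIDDEN)
W1, b1 = matrix(MLP_DIM, 2 * HIDDEN), [0.0] * MLP_DIM
W2, b2 = matrix(NUM_FORMS, MLP_DIM), [0.0] * NUM_FORMS


def predict(left_vectors, right_vectors):
    h_left = left_gru.encode(left_vectors)               # left to right
    h_right = right_gru.encode(reversed(right_vectors))  # right to left (assumed)
    context = h_left + h_right                           # concatenation
    hidden = [math.tanh(a) for a in add(matvec(W1, context), b1)]
    return softmax(add(matvec(W2, hidden), b2))


def cross_entropy(probs, gold_index):
    return -math.log(probs[gold_index])


# Random stand-ins for the word embeddings of a 5-word left and 3-word right context.
left = [[random.uniform(-1, 1) for _ in range(EMBED)] for _ in range(5)]
right = [[random.uniform(-1, 1) for _ in range(EMBED)] for _ in range(3)]
probs = predict(left, right)
loss = cross_entropy(probs, gold_index=2)
```

Because both encoders return only their final hidden state, the context vector has the same length (here `2 * HIDDEN`) regardless of how long the sentence is, which is what lets a fixed‑size MLP sit on top.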
Unlike traditional classifiers that require extensive feature engineering or NMT approaches that need large annotated corpora, DeepGrammar learns semantic representations directly from native corpora and can be extended to other error types such as noun number, article, and preposition errors.
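One way to picture training on native text without error annotations: every verb in a well‑formed sentence yields a training example whose input is the surrounding words and whose label is the verb's own form. The sketch below illustrates that idea only. The tiny `VERB_FORMS` lexicon is a hypothetical stand‑in for a real part‑of‑speech tagger, the label names are assumed, and excluding the verb itself from the context is also an assumption.

```python
from typing import List, Tuple

# Hypothetical toy lexicon; a real pipeline would use a POS tagger instead.
VERB_FORMS = {
    "has": "VBZ", "have": "VBP",
    "is": "VBZ", "are": "VBP",
    "encourages": "VBZ", "encourage": "VBP",
}

Example = Tuple[List[str], List[str], str]


def make_examples(sentence: str) -> List[Example]:
    """Turn one correct sentence into (left context, right context, label) triples."""
    tokens = sentence.lower().split()
    examples = []
    for i, token in enumerate(tokens):
        label = VERB_FORMS.get(token)
        if label is not None:
            # The verb itself is left out so the model must infer its form
            # from the context alone.
            examples.append((tokens[:i], tokens[i + 1:], label))
    return examples


examples = make_examples("Having support from relatives is vital")
print(examples)
```

Since the labels come from the text itself, any large collection of well‑formed sentences can serve as training data, which matches the article's point that no error‑annotated corpus is needed.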
Performance
The examples below show DeepGrammar correcting errors that involve both short‑distance and long‑distance dependencies. Compared with other published results, DeepGrammar achieves the best reported scores on multiple error categories in the CoNLL‑2014 test set, surpassing the previous state‑of‑the‑art classifier (Rozovskaya et al.) and NMT (Xie et al.) methods.
| Wrong Sentences | Correction Sentences |
| --- | --- |
| he might end up dishearten his family | he might end up disheartening his family |
| ... negative impacts to the family | ... negative impacts on the family |
| The popularity of social media sites have made ... | The popularity of social media sites has made ... |
| Having support from relatives are vital | Having support from relatives is vital |
| ... after realising his or her conditions | ... after realising his or her condition. |
| Especially for the young people without marriage | Especially for young people without marriage |
| the government encourage people to give more birth | the government encourages people to give more birth |
Further comparison with state‑of‑the‑art techniques on the CoNLL‑2014 test set shows DeepGrammar outperforming both the top classifier‑based system (Rozovskaya et al.) and the leading NMT system (Xie et al.).
DeepGrammar has been deployed in several of FluentSpeak’s products. For more details, see the original paper presented at InterSpeech 2017 titled “Deep Context Model for Grammatical Error Correction”.
The algorithm team is hiring engineers with deep‑learning experience; research areas include speech recognition, speech synthesis, adaptive learning, dialogue systems, and natural language understanding. Join us to revolutionize education with AI.
Liulishuo Tech Team
Help everyone become a global citizen!