Browse DORAS
Browse Theses
Latest Additions
Creative Commons License
Except where otherwise noted, content on this site is licensed for use under a:

A comparative evaluation of deep and shallow approaches to the automatic detection of common grammatical errors

Wagner, Joachim and Foster, Jennifer and van Genabith, Josef (2007) A comparative evaluation of deep and shallow approaches to the automatic detection of common grammatical errors. In: EMNLP-CoNLL 2007 - Joint Meeting of the Conference on Empirical Methods in Natural Language Processing and the Conference on Computational Natural Language Learning, 28-30 June 2007, Prague, Czech Republic.

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader


This paper compares a deep and a shallow processing approach to the problem of classifying a sentence as grammatically wellformed or ill-formed. The deep processing approach uses the XLE LFG parser and English grammar: two versions are presented, one which uses the XLE directly to perform the classification, and another one which uses a decision tree trained on features consisting of the XLE’s output statistics. The shallow processing approach predicts grammaticality based on n-gram frequency statistics: we present two versions, one which uses frequency thresholds and one which uses a decision tree trained on the frequencies of the rarest n-grams in the input sentence. We find that the use of a decision tree improves on the basic approach only for the deep parser-based approach. We also show that combining both the shallow and deep decision tree features is effective. Our evaluation is carried out using a large test set of grammatical and ungrammatical sentences. The ungrammatical test set is generated automatically by inserting grammatical errors into well-formed BNC sentences.

Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Uncontrolled Keywords:error detection;
Subjects:Computer Science > Machine translating
DCU Faculties and Centres:Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher:Association for Computational Linguistics
Official URL:
Copyright Information:© 2007 Association for Computational Linguistics
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. View License
Funders:Irish Research Council for Science Engineering and Technology, IRCSET SC/02/298, IRCSET P/04/232
ID Code:15214
Deposited On:17 Feb 2010 16:55 by DORAS Administrator. Last Modified 27 Apr 2010 15:08

Download statistics

Archive Staff Only: edit this record