Hassan, Hany, Sima'an, Khalil and Way, Andy ORCID: 0000-0001-5736-5930 (2009) Lexicalized semi-incremental dependency parsing. In: RANLP-2009 - Recent Advances in Natural Language Processing, 14-16 September 2009, Borovets, Bulgaria.
Abstract
Even leaving aside concerns of cognitive plausibility, incremental parsing is appealing for applications such as speech recognition and machine translation because it could allow for incorporating syntactic features into the decoding process without blowing up the search space. Yet, incremental parsing is often associated with greedy parsing decisions and intolerable loss of accuracy. Would the use of lexicalized grammars provide a new perspective on incremental parsing? In this paper we explore incremental left-to-right dependency parsing using a lexicalized grammatical formalism that works with lexical categories (supertags) and a small set of combinatory operators. A strictly incremental parser would conduct only a single pass over the input, use no lookahead and make only local decisions at every word. We show that such a parser suffers heavy loss of accuracy. Instead, we explore the utility of a two-pass approach that incrementally builds a dependency structure by first assigning a supertag to every input word and then selecting an incremental operator that allows assembling every supertag with the dependency structure built so far to its left. We instantiate this idea in different models that allow a trade-off between aspects of full incrementality and performance, and explore the differences between these models empirically. Our exploration shows that a semi-incremental (two-pass), linear-time parser that employs fixed and limited look-ahead exhibits an appealing balance between the efficiency advantages of incrementality and the achieved accuracy. Surprisingly, taking local or global decisions matters very little for the accuracy of this linear-time parser. Such a parser fits seamlessly with the currently dominant finite-state decoders for machine translation.
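The two-pass idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not specify the supertag inventory, the combinatory operators, or the statistical models, so everything named below is a hypothetical stand-in. The toy `LEXICON`, the `LEFT_DEPS` and `RIGHT_DEPS` tables, and the `LEFT_ARC`, `RIGHT_ARC` and `SHIFT` operators (borrowed from transition-based parsing in place of the paper's combinatory operators) are assumptions made only for illustration. The sketch shows the control flow: pass 1 assigns one supertag per word using a fixed look-ahead window, and pass 2 walks left to right, choosing an operator that attaches each word to the structure built so far.

```python
# Toy lexicon (hypothetical): each word maps to its candidate supertags.
LEXICON = {
    "the": ["DET"],
    "dog": ["N"],
    "cat": ["N"],
    "sees": ["V"],
    "walks": ["N", "V"],  # ambiguous on purpose
}

# Which supertags a head may take as dependents on each side (hypothetical).
LEFT_DEPS = {"N": {"DET"}, "V": {"N"}}
RIGHT_DEPS = {"V": {"N"}}


def assign_supertags(words, lookahead=1):
    """Pass 1: pick one supertag per word, peeking at most `lookahead` words ahead."""
    tags = []
    for i, word in enumerate(words):
        candidates = LEXICON[word]
        window = words[i + 1 : i + 1 + lookahead]
        # Toy disambiguation rule: prefer the noun reading when a verb-capable
        # word follows inside the window, otherwise prefer the verb reading.
        verb_follows = any("V" in LEXICON[w] for w in window)
        preferred = "N" if verb_follows else "V"
        tags.append(preferred if preferred in candidates else candidates[0])
    return tags


def parse(words, lookahead=1):
    """Pass 2: attach each supertagged word to the structure to its left."""
    tags = assign_supertags(words, lookahead)
    heads = {}    # dependent index -> head index
    pending = []  # indices of words still waiting for a head
    trace = []    # operators chosen, in order
    for i, tag in enumerate(tags):
        # LEFT_ARC: the current word adopts pending words as left dependents.
        while pending and tags[pending[-1]] in LEFT_DEPS.get(tag, set()):
            heads[pending.pop()] = i
            trace.append("LEFT_ARC")
        # RIGHT_ARC: the nearest pending word adopts the current word.
        if pending and tag in RIGHT_DEPS.get(tags[pending[-1]], set()):
            heads[i] = pending[-1]
            trace.append("RIGHT_ARC")
        else:
            # SHIFT: no attachment is possible yet, so the word waits.
            pending.append(i)
            trace.append("SHIFT")
    return tags, heads, pending, trace


if __name__ == "__main__":
    tags, heads, pending, trace = parse("the dog sees the cat".split())
    print(tags, heads, pending, trace)
```

Each word is pushed onto `pending` at most once and popped at most once, so the loop is linear in sentence length. The decisions here are purely local and rule-based; a real system would replace both the disambiguation rule and the operator choice with learned models.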
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Subjects: | Computer Science > Machine translating |
DCU Faculties and Centres: | Research Institutes and Centres > National Centre for Language Technology (NCLT) |
Official URL: | http://lml.bas.bg/ranlp2009/ |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 15186 |
Deposited On: | 16 Feb 2010 11:33 by DORAS Administrator. Last Modified 14 Nov 2018 16:28 |
Documents
Full text available as: PDF (115kB)