DORAS | DCU Research Repository

Data-oriented parsing and the Penn Chinese treebank

Hearne, Mary and Way, Andy [ORCID: 0000-0001-5736-5930] (2004) Data-oriented parsing and the Penn Chinese treebank. In: IJCNLP 2004 - 1st International Joint Conference on Natural Language Processing, 22-24 March 2004, Hainan Island, China.

Abstract
We present an investigation into parsing the Penn Chinese Treebank using a Data-Oriented Parsing (DOP) approach. DOP comprises an experience-based approach to natural language parsing. Most published research in the DOP framework uses PS-trees as its representation schema. Drawbacks of the DOP approach centre around issues of efficiency. We incorporate recent advances in DOP parsing techniques into a novel DOP parser which generates a compact representation of all subtrees which can be derived from any full parse tree. We compare our work to previous work on parsing the Penn Chinese Treebank, and provide both a quantitative and qualitative evaluation. While our results in terms of Precision and Recall are slightly below those published in related research, our approach requires no manual encoding of head rules, nor is a development phase per se necessary. We also note that certain constructions which were problematic in this previous work can be handled correctly by our DOP parser. Finally, we observe that the ‘DOP Hypothesis’ is confirmed for parsing the Penn Chinese Treebank.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: http://aclweb.org/mirror/ijcnlp04/
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 15823
Deposited On: 23 Nov 2010 14:59 by Shane Harper. Last Modified: 16 Nov 2018 11:53
Documents

Full text available as:

Data-Oriented_Parsing_and_the_Penn_Chinese_Treebank.pdf (PDF, 78kB)