Wagner, Joachim ORCID: 0000-0002-8290-3849, Barry, James ORCID: 0000-0003-3051-585X and Foster, Jennifer ORCID: 0000-0002-7789-4853 (2020) Treebank embedding vectors for out-of-domain dependency parsing. In: 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), 05-10 Jul 2020, Online (virtual conference).
Abstract
A recent advance in monolingual dependency parsing is the idea of a treebank embedding vector, which allows all treebanks for a particular language to be used as training data while at the same time allowing the model to prefer training data from one treebank over others and to select the preferred treebank at test time. We build on this idea by 1) introducing a method to predict a treebank vector for sentences that do not come from a treebank used in training, and 2) exploring what happens when we move away from predefined treebank embedding vectors during test time and instead devise tailored interpolations. We show that 1) there are interpolated vectors that are superior to the predefined ones, and 2) treebank vectors can be predicted with sufficient accuracy, for nine out of ten test languages, to match the performance of an oracle approach that knows the most suitable predefined treebank embedding for the test set.
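The interpolation idea in the abstract can be illustrated with a minimal sketch. It assumes that an interpolated treebank vector is a weighted sum of the predefined treebank embedding vectors, with weights that sum to one. The function name, the treebank labels `tb_a` and `tb_b`, and the two-dimensional vectors are all hypothetical and chosen only for illustration; they are not taken from the paper or its code.

```python
import math
from typing import Dict, List


def interpolate_treebank_vectors(
    vectors: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted sum of predefined treebank vectors.

    Assumption (not from the source): weights must sum to 1 and
    every treebank must have exactly one weight.
    """
    if set(vectors) != set(weights):
        raise ValueError("vectors and weights must cover the same treebanks")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    dims = {len(v) for v in vectors.values()}
    if len(dims) != 1:
        raise ValueError("all treebank vectors must have the same dimension")
    dim = dims.pop()
    result = [0.0] * dim
    for name, vec in vectors.items():
        for i in range(dim):
            result[i] += weights[name] * vec[i]
    return result


# Hypothetical two-dimensional embeddings for two treebanks of one language.
predefined = {"tb_a": [1.0, 0.0], "tb_b": [0.0, 1.0]}
# A vector lying between the two predefined vectors, closer to tb_b.
mixed = interpolate_treebank_vectors(predefined, {"tb_a": 0.25, "tb_b": 0.75})
```

Setting one weight to 1 and the others to 0 recovers a predefined treebank vector, so the predefined choices are special cases of this scheme.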
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | dependency parsing |
Subjects: | Computer Science > Computational linguistics |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Published in: | Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics (ACL). |
Publisher: | Association for Computational Linguistics (ACL) |
Official URL: | http://dx.doi.org/10.18653/v1/2020.acl-main.778 |
Copyright Information: | © 2020 The Authors CC-BY-4.0 |
Funders: | Science Foundation Ireland (SFI) Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund |
ID Code: | 24861 |
Deposited On: | 27 Aug 2020 14:05 by Joachim Wagner. Last Modified 27 Aug 2020 14:05 |
Documents
Full text available as: PDF (Publisher version under Creative Commons Attribution 4.0 International License), 1MB