DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study

Barry, James (ORCID: 0000-0003-3051-585X), Wagner, Joachim (ORCID: 0000-0002-8290-3849) and Foster, Jennifer (ORCID: 0000-0002-7789-4853) (2019) Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study. In: The 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), 3 - 5 Nov 2019, Hong Kong, China. ISBN 978-1-950737-78-9

Abstract
Cross-lingual dependency parsing involves transferring syntactic knowledge from one language to another. It is a crucial component for inducing dependency parsers in low-resource scenarios where no training data for a language exists. Using Faroese as the target language, we compare two approaches using annotation projection: first, projecting from multiple monolingual source models; second, projecting from a single polyglot model which is trained on the combination of all source languages. Furthermore, we reproduce multi-source projection (Tyers et al., 2018), in which dependency trees of multiple sources are combined. Finally, we apply multi-treebank modelling to the projected treebanks, in addition to, or as an alternative to, polyglot modelling on the source side. We find that polyglot training on the source languages produces an overall trend of better results on the target language, but the single best result for the target language is obtained by projecting from monolingual source parsing models and then training multi-treebank POS tagging and parsing models on the target side.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Uncontrolled Keywords: Faroese language resources; knowledge transfer across related languages; dependency parsing for low-resource languages
Subjects: Computer Science > Computational linguistics
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Published in: Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019). Association for Computational Linguistics. ISBN 978-1-950737-78-9
Publisher: Association for Computational Linguistics
Official URL: http://dx.doi.org/10.18653/v1/D19-6118
Copyright Information: © 2019 The Association for Computational Linguistics (ACL) CC-BY-4.0
Funders: Science Foundation Ireland SFI Research Centres Programme (Grant 13/RC/2106), European Regional Development Fund
ID Code: 23972
Deposited On: 29 Nov 2019 15:58 by Joachim Wagner. Last Modified 27 Aug 2020 14:12
Documents

Full text available as:

paper-2019-10-01-v60-2.pdf (PDF, 196kB)