Wagner, Joachim (ORCID: 0000-0002-8290-3849) and Foster, Jennifer (ORCID: 0000-0002-7789-4853)
(2021)
Revisiting tri-training of dependency parsers.
In: 2021 Conference on Empirical Methods in Natural Language Processing, 7-11 Nov 2021, Online and Punta Cana, Dominican Republic.
We compare two orthogonal semi-supervised learning techniques, namely tri-training and pretrained word embeddings, in the task of dependency parsing. We explore language-specific FastText and ELMo embeddings and multilingual BERT embeddings. We focus on a low-resource scenario, as semi-supervised learning can be expected to have the most impact here. Based on treebank size and available ELMo models, we select Hungarian, Uyghur (a zero-shot language for mBERT) and Vietnamese. Furthermore, we include English in a simulated low-resource setting. We find that pretrained word embeddings make more effective use of unlabelled data than tri-training, but that the two approaches can be successfully combined.
Science Foundation Ireland (Grant 13/RC/2106), European Regional Development Fund, Science Foundation Ireland SFI Frontiers for the Future programme (19/FFP/6942).
ID Code: 28292
Deposited On: 28 Apr 2023 08:35 by Joachim Wagner. Last Modified: 28 Apr 2023 08:35