Cross-lingual transfer parsing for low-resourced languages: an Irish case study
Lynn, Teresa, Foster, Jennifer (ORCID: 0000-0002-7789-4853), Dras, Mark (ORCID: 0000-0001-9908-7182) and Tounsi, Lamia
(2014)
Cross-lingual transfer parsing for low-resourced languages: an Irish case study.
In: First Celtic Language Technology Workshop, 23 Aug 2014, Dublin, Ireland.
We present a study of cross-lingual direct transfer parsing for the Irish language. First, we
discuss the mapping of the annotation scheme of the Irish Dependency Treebank to a universal dependency scheme. We explain our dependency label mapping choices and the structural changes
required in the Irish Dependency Treebank. We then experiment with the universally annotated
treebanks of ten languages from four language family groups to assess which languages are the
most useful for cross-lingual parsing of Irish. To do so, we use these treebanks to train delexicalised parsing models, which are then applied to sentences from the Irish Dependency Treebank. The best
results are achieved when using Indonesian, a language from the Austronesian language family.
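As an illustration of what "delexicalised" means here, the sketch below blanks out the word form and lemma columns of a treebank, so that a parser trained on the result can rely only on part-of-speech tags and dependency structure. It assumes tab-separated CoNLL-X style input (ID, FORM, LEMMA, CPOSTAG, ...). The function name `delexicalise` and the toy tokens are hypothetical, and the abstract does not specify the preprocessing the authors actually used.

```python
def delexicalise(conll_text: str) -> str:
    """Replace FORM and LEMMA with "_" in CoNLL-X style text.

    Assumption: columns are tab-separated and ordered as
    ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, ...
    Blank lines (sentence boundaries) and "#" comment lines are kept as-is.
    """
    out = []
    for line in conll_text.splitlines():
        if not line.strip() or line.startswith("#"):
            out.append(line)
            continue
        cols = line.split("\t")
        if len(cols) >= 3:
            cols[1] = "_"  # FORM: remove the surface word
            cols[2] = "_"  # LEMMA: remove the lemma
        out.append("\t".join(cols))
    return "\n".join(out)


# Toy two-token sentence with placeholder words (not real treebank data).
sample = (
    "1\tw1\tl1\tVERB\tVERB\t_\t0\tROOT\t_\t_\n"
    "2\tw2\tl2\tNOUN\tNOUN\t_\t1\tdobj\t_\t_"
)
delexicalised = delexicalise(sample)
```

Because the tag set and dependency labels are shared across the universally annotated treebanks, a model trained on delexicalised source-language data of this kind can in principle be applied directly to delexicalised Irish sentences.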
Judge, John and Lynn, Teresa and Ward, Monica and Ó Raghallaigh, Brian, (eds.)
Proceedings of the First Celtic Language Technology Workshop.
Association for Computational Linguistics and Dublin City University.