A crowd-sourcing approach for translations of minority language user-generated content (UGC)

Dowling, Meghan; Lynn, Teresa; Way, Andy

Dowling, Meghan ORCID: 0000-0003-1637-4923, Lynn, Teresa and Way, Andy ORCID: 0000-0001-5736-5930 (2017) A crowd-sourcing approach for translations of minority language user-generated content (UGC). In: First workshop on Social Media and User Generated Content Machine Translation, 31 May 2017, Prague, Czech Republic.

Abstract

Data sparsity is a common problem for machine translation of minority and less-resourced languages. While data collection for standard, grammatical text can be challenging enough, efforts for collection of parallel user-generated content can be even more challenging. In this paper we describe an approach to collecting English↔Irish translations of user-generated content (tweets) that overcomes some of these hurdles. We show how a crowd-sourced data collection campaign, which was tailored to our target audience (the Irish language community), proved successful in gathering data for a niche domain. We also discuss the reliability of crowd-sourcing English↔Irish tweet translations in terms of quality by reporting on a self-rating approach along with qualified reviewer ratings.

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Conference
Refereed:	Yes
Uncontrolled Keywords:	Minority Languages
Subjects:	Computer Science > Machine translating; Humanities > Irish language
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Copyright Information:	© 2017 PBML. Distributed under CC BY-NC-ND
Funders:	ADAPT Centre for Digital Content Technology, which is funded under the SFI Research Centres Programme (Grant 13/RC/2016) and is co-funded by the European Regional Development Fund
ID Code:	23304
Deposited On:	20 May 2019 15:50. Last Modified: 20 May 2019 15:50

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
221kB

DORAS | DCU Research Repository