Translation crowdsourcing: creating a multilingual corpus of online educational content
Sosoni, Vilelmini (ORCID: 0000-0002-9583-4651), Kermanidis, Katia Lida (ORCID: 0000-0002-3270-5078), Stasimioti, Maria (ORCID: 0000-0002-3270-5078), Naskos, Thanasis, Takoulidou, Eirini, van Zaanen, Menno, Castilho, Sheila (ORCID: 0000-0002-8416-6555), Georgakopoulou, Panayota (ORCID: 0000-0001-9780-1813), Kordoni, Valia and Egg, Markus
(2018) Translation crowdsourcing: creating a multilingual corpus of online educational content. In: 11th International Conference on Language Resources and Evaluation, 7-12 May 2018, Miyazaki, Japan. ISBN 979-10-95546-00-9
The present work describes a multilingual corpus of online content in the educational domain, i.e. Massive Open Online Course
material, ranging from course forum text to subtitles of online video lectures, that has been developed via large-scale crowdsourcing.
The English source text is manually translated into 11 European and BRIC languages using the CrowdFlower platform. During the
process, several challenges arose, mainly involving the in-domain text genre, the large text volume, the idiosyncrasies of each
target language, the limitations of the crowdsourcing platform, and the quality assurance and workflow issues of the
crowdsourcing process. The corpus is a product of the EU-funded TraMOOC project and is used in the project to
train, tune and test machine translation engines.
This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: TraMOOC project (Translation for Massive Open Online Courses), funded by the European Commission under H2020-ICT2014/H2020-ICT-2014-1 under grant agreement number 644333.
ID Code: 23070
Deposited On: 08 Mar 2019 14:13 by Thomas Murtagh. Last Modified 20 Jan 2021 16:33