DORAS | DCU Research Repository

Translation crowdsourcing: creating a multilingual corpus of online educational content

Sosoni, Vilelmini (ORCID: 0000-0002-9583-4651), Kermanidis, Katia Lida (ORCID: 0000-0002-3270-5078), Stasimioti, Maria (ORCID: 0000-0002-3270-5078), Naskos, Thanasis, Takoulidou, Eirini, van Zaanen, Menno, Castilho, Sheila (ORCID: 0000-0002-8416-6555), Georgakopoulou, Panayota (ORCID: 0000-0001-9780-1813), Kordoni, Valia and Egg, Markus (2018) Translation crowdsourcing: creating a multilingual corpus of online educational content. In: 11th International Conference on Language Resources and Evaluation, 7-12 May 2018, Miyazaki, Japan. ISBN 979-10-95546-00-9

Abstract
The present work describes a multilingual corpus of online content in the educational domain, i.e. Massive Open Online Course material, ranging from course forum text to subtitles of online video lectures, that has been developed via large-scale crowdsourcing. The English source text is manually translated into 11 European and BRIC languages using the CrowdFlower platform. During the process several challenges arose which mainly involved the in-domain text genre, the large text volume, the idiosyncrasies of each target language, the limitations of the crowdsourcing platform, as well as the quality assurance and workflow issues of the crowdsourcing process. The corpus constitutes a product of the EU-funded TraMOOC project and is utilised in the project in order to train, tune and test machine translation engines.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: parallel corpus; MOOCs; online educational text; crowdsourcing
Subjects: Social Sciences > Distance education
Social Sciences > Education
Social Sciences > Educational technology
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Institutes and Centres > ADAPT
Published in: Proceedings of the 11th International Conference on Language Resources and Evaluation. LREC. ISBN 979-10-95546-00-9
Publisher: LREC
Official URL: http://www.lrec-conf.org/proceedings/lrec2018/pdf/...
Copyright Information: © 2018 LREC. CC 4.0
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: TraMOOC project (Translation for Massive Open Online Courses), funded by the European Commission under H2020-ICT2014/H2020-ICT-2014-1 under grant agreement number 644333.
ID Code: 23070
Deposited On: 08 Mar 2019 14:13 by Thomas Murtagh. Last Modified 20 Jan 2021 16:33
Documents

Full text available as:

PDF (242kB)