DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Human evaluation of English–Irish transformer-based NMT

Lankford, Séamus, Afli, Haithem (ORCID: 0000-0002-7449-4707) and Way, Andy (ORCID: 0000-0001-5736-5930) (2022) Human evaluation of English–Irish transformer-based NMT. Information, 13 (7). ISSN 2078-2489

Abstract
In this study, a human evaluation is carried out on how hyperparameter settings impact the quality of Transformer-based Neural Machine Translation (NMT) for the low-resourced English–Irish pair. SentencePiece models using both Byte Pair Encoding (BPE) and unigram approaches were appraised. Variations in model architectures included modifying the number of layers, evaluating the optimal number of heads for attention and testing various regularisation techniques. The greatest performance improvement was recorded for a Transformer-optimized model with a 16k BPE subword model. Compared with a baseline Recurrent Neural Network (RNN) model, a Transformer-optimized model demonstrated a BLEU score improvement of 7.8 points. When benchmarked against Google Translate, our translation engines demonstrated significant improvements. Furthermore, a quantitative fine-grained manual evaluation was conducted which compared the performance of machine translation systems. Using the Multidimensional Quality Metrics (MQM) error taxonomy, a human evaluation of the error types generated by an RNN-based system and a Transformer-based system was explored. Our findings show the best-performing Transformer system significantly reduces both accuracy and fluency errors when compared with an RNN-based model.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: human evaluation; MQM; neural machine translation; Irish; low-resource languages
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Publisher: MDPI
Official URL: https://doi.org/10.3390/info13070309
Copyright Information: © 2022 Authors.
Funders: Science Foundation Ireland (SFI) Research Centres Programme (Grant 13/RC/2016), European Regional Development Fund, Munster Technological University
ID Code: 28340
Deposited On: 18 May 2023 12:36 by Seamus Lankford. Last Modified: 18 May 2023 14:59
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution 4.0
640kB