DORAS | DCU Research Repository

Providing morphological information for SMT using neural networks

Passban, Peyman, Liu, Qun (ORCID: 0000-0002-7000-1792) and Way, Andy (ORCID: 0000-0001-5736-5930) (2017) Providing morphological information for SMT using neural networks. Prague Bulletin of Mathematical Linguistics (108). pp. 271-282. ISSN 1804-0462

Abstract
Treating morphologically complex words (MCWs) as atomic units in translation does not yield a desirable result. Such words are complicated constituents with meaningful subunits. A complex word in a morphologically rich language (MRL) could be associated with a number of words or even a full sentence in a simpler language, which means the surface form of complex words should be accompanied by auxiliary morphological information in order to provide a precise translation and a better alignment. In this paper we follow this idea and propose two different methods to convey such information to statistical machine translation (SMT) models. In the first model we enrich factored SMT engines by introducing a new morphological factor which relies on subword-aware word embeddings. In the second model we focus on the language-modeling component. We explore a subword-level neural language model (NLM) to capture sequence-, word- and subword-level dependencies. Our NLM is able to approximate better scores for conditional word probabilities, so the decoder generates more fluent translations. We studied two languages, Farsi and German, in our experiments and observed significant improvements for both of them.
Metadata
Item Type: Article (Published)
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Publisher: De Gruyter Open
Official URL: http://dx.doi.org/10.1515/pralin-2017-0026
Copyright Information: © 2017 PBML. Distributed under CC BY-NC-ND.
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland at ADAPT: Centre for Digital Content Platform Research (Grant 13/RC/2106).
ID Code: 23315
Deposited On: 20 May 2019 08:53 by Thomas Murtagh. Last Modified: 20 May 2019 08:53
Documents

Full text available as: PDF (Providing_Morphological_Information_for_SMT_Using_Neural_Networks[1].pdf), 146kB