DORAS
DCU Online Research Access Service
How much context span is enough? Examining context-related issues for document-level MT

Castilho, Sheila ORCID: 0000-0002-8416-6555 (2022) How much context span is enough? Examining context-related issues for document-level MT. In: 13th Language Resources and Evaluation Conference, 21-23 June 2022, Marseille, France. (In Press)

Abstract

This paper analyses how much context span is necessary to solve different context-related issues, namely reference, ellipsis, gender, number, lexical ambiguity, and terminology, when translating from English into Portuguese. We use the DELA corpus, which consists of 60 documents across six domains (subtitles, literary, news, reviews, medical, and legislation). We find that the shortest context span needed to disambiguate an issue can appear in different positions in the document (preceding, following, or global) or may require world knowledge, and that the average length depends on the issue type as well as the domain. Additionally, we show that the standard approach of relying on only two preceding sentences as context might not be enough, depending on the domain and issue type.

Item Type:Conference or Workshop Item (Paper)
Event Type:Conference
Refereed:Yes
Additional Information:pp. 3017‑3025
Uncontrolled Keywords:document-level; context span
Subjects:Computer Science > Computational linguistics
Computer Science > Machine translating
Humanities > Language
Humanities > Linguistics
Humanities > Translating and interpreting
DCU Faculties and Centres:DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Initiatives and Centres > ADAPT
Published in: Proceedings of the 13th International Conference on Language Resources and Evaluation.
Official URL:http://www.lrec-conf.org/proceedings/lrec2022/index.html
Copyright Information:© 2022 European Language Resources Association (ELRA) (CC-BY-NC-4.0)
Use License:This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders:Irish Research Council (GOIPD/2020/69), Science Foundation Ireland through the SFI Research Centres Programme (Grant 13/RC/2106 P2), ADAPT Centre, Dublin City University
ID Code:27009
Deposited On:08 Apr 2022 16:39 by Sheila Castilho Monteiro de sousa. Last Modified 29 Jun 2022 16:00
