Castilho, Sheila ORCID: 0000-0002-8416-6555 (2022) How much context span is enough? Examining context-related issues for document-level MT. In: 13th Language Resources and Evaluation Conference, 21-23 June 2022, Marseille, France.
Abstract
This paper analyses how much context span is necessary to solve different context-related issues, namely reference, ellipsis, gender,
number, lexical ambiguity, and terminology, when translating from English into Portuguese. We use the DELA corpus, which consists
of 60 documents across six domains (subtitles, literary, news, reviews, medical, and legislation). We find that the shortest context
span needed to disambiguate an issue can appear in different positions in the document (preceding, following, or global) or may lie in world knowledge,
and that the average length depends on the issue type as well as the domain. Additionally, we show that the standard approach of
relying on only two preceding sentences as context might not be enough, depending on the domain and issue type.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Additional Information: | pp. 3017‑3025 |
Uncontrolled Keywords: | document-level; context span |
Subjects: | Computer Science > Computational linguistics; Computer Science > Machine translating; Humanities > Language; Humanities > Linguistics; Humanities > Translating and interpreting |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Published in: | Proceedings of the 13th International Conference on Language Resources and Evaluation. |
Official URL: | http://www.lrec-conf.org/proceedings/lrec2022/pdf/... |
Copyright Information: | © 2022 European Language Resources Association (ELRA) |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
Funders: | Irish Research Council (GOIPD/2020/69), Science Foundation Ireland through the SFI Research Centres Programme (Grant 13/RC/2106 P2), ADAPT Centre, Dublin City University |
ID Code: | 27009 |
Deposited On: | 08 Apr 2022 16:39 by Sheila Castilho. Last Modified 16 Nov 2023 13:05 |
Documents
Full text available as: PDF (177kB), Creative Commons: Attribution-Noncommercial 4.0