DORAS | DCU Research Repository

Alignment-guided chunking

Ma, Yanjun, Stroppa, Nicolas and Way, Andy (ORCID: 0000-0001-5736-5930) (2007) Alignment-guided chunking. In: TMI-07 - Proceedings of The 11th Conference on Theoretical and Methodological Issues in Machine Translation, 7-9 September 2007, Skövde, Sweden.

Abstract
We introduce an adaptable monolingual chunking approach, Alignment-Guided Chunking (AGC), which makes use of knowledge of word alignments acquired from bilingual corpora. Our approach is motivated by the observation that a sentence should be chunked differently depending on the foreseen end-task. For example, given the different requirements of translation into (say) French and German, it is inappropriate to chunk an English string in exactly the same way in preparation for translation into either of these languages. We test our chunking approach on two language pairs, French–English and German–English, where the two bilingual corpora share the same English sentences. Two chunkers, trained on French–English (FE-Chunker) and German–English (DE-Chunker) respectively, are used to chunk the same English sentences. We construct two test sets, one suitable for each language pair. Each chunker is evaluated on the appropriate test set; with only one reference translation, we report F-scores of 32.63% for the FE-Chunker and 40.41% for the DE-Chunker.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine translating
DCU Faculties and Centres: Research Institutes and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: http://www.computing.dcu.ie/~away/TMI-07/
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Science Foundation Ireland, SFI OS/IN/1732
ID Code: 562
Deposited On: 15 Sep 2008 11:33 by DORAS Administrator. Last Modified 16 Nov 2018 09:42
Documents

Full text available as:

Maetal_TMI07.pdf (PDF, 128kB)