A memory-based classification approach to marker-based EBMT
van den Bosch, Antal, Stroppa, Nicolas and Way, Andy (ORCID: 0000-0001-5736-5930)
(2007)
A memory-based classification approach to marker-based EBMT.
In: METIS-II Workshop on New Approaches to Machine Translation, 11 January 2007, Leuven, Belgium.
We describe a novel approach to example-based machine translation (EBMT) that makes use of marker-based chunks and in which the decoder is a memory-based classifier. The classifier is trained to map trigrams of source-language chunks onto trigrams of target-language chunks; then, in a second decoding step, the predicted trigrams are rearranged according to their overlap. We present the first results of this method on a Dutch-to-English translation system using Europarl data. Sparseness of the class space causes the results to lag behind a baseline phrase-based SMT system. In a further comparison, we also apply the method to a word-aligned version of the same data, and report a smaller difference with a word-based SMT system. We explore the scaling abilities of the memory-based approach, and observe linear scaling behavior in training and classification speed and memory costs, and log-linear BLEU improvements with the number of training examples.
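As an illustration of the second decoding step, the following is a minimal sketch of how predicted target-side trigrams might be ordered by their overlap. The abstract does not specify the actual procedure, so everything here is an assumption: the function name `order_by_overlap`, the `"_"` boundary marker, the greedy chaining strategy, and the fallback to predicted order are all hypothetical.

```python
from typing import List, Tuple

# A predicted target trigram: (left neighbour, focus chunk, right neighbour).
# "_" is a hypothetical sentence-boundary marker, not taken from the paper.
Trigram = Tuple[str, str, str]


def order_by_overlap(trigrams: List[Trigram]) -> List[str]:
    """Greedily chain trigrams so that the last two items of one
    match the first two items of the next, then emit the focus chunks."""
    if not trigrams:
        return []
    remaining = list(trigrams)
    # Prefer a trigram whose left context is the boundary marker as the start.
    start = next((t for t in remaining if t[0] == "_"), remaining[0])
    remaining.remove(start)
    chain = [start]
    while remaining:
        tail = chain[-1][1:]  # (focus, right) of the last chained trigram
        nxt = next((t for t in remaining if t[:2] == tail), None)
        if nxt is None:
            # Assumed fallback: no overlapping trigram, keep predicted order.
            nxt = remaining[0]
        remaining.remove(nxt)
        chain.append(nxt)
    return [t[1] for t in chain]


predicted = [
    ("the report", "was adopted", "_"),
    ("_", "the report", "was adopted"),
]
ordered = order_by_overlap(predicted)
```

Here `ordered` holds the focus chunks in the order implied by the overlaps. A greedy chain is only one possible choice; with sparse or conflicting predictions, a scoring-based search over candidate orders could behave differently.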