CCG contextual labels in hierarchical phrase-based SMT
Almaghout, Hala and Jiang, Jie and Way, Andy (2011) CCG contextual labels in hierarchical phrase-based SMT. In: The 15th Annual Conference of the European Association for Machine Translation (EAMT-2011), 30-31 May 2011, Leuven, Belgium.
In this paper, we present a method to employ target-side syntactic contextual information in a Hierarchical Phrase-Based system. Our method uses Combinatory Categorial Grammar (CCG) to annotate training data with labels that represent the left and right syntactic context of target-side phrases. These labels are then used to label the nonterminals in hierarchical rules. CCG-based contextual labels help to produce more grammatical translations by forcing the phrases that replace nonterminals during translation to comply with the contextual constraints imposed by the labels. We present experiments which examine the performance of CCG contextual labels on Chinese–English and Arabic–English translation in the news and speech expressions domains, using different data sizes and CCG-labeling settings. Our experiments show that our system based on CCG contextual labels achieved a 2.42% relative BLEU improvement over a Phrase-Based baseline on Arabic–English translation and a 1% relative BLEU improvement over a Hierarchical Phrase-Based baseline on Chinese–English translation.