Ensuring agreement between the subject and the main verb is crucial for the correctness of the information that a sentence conveys. While generating correct subject-verb agreement
is relatively straightforward in rule-based approaches to Machine Translation (RBMT), today’s
leading statistical Machine Translation (SMT) systems often fail to generate correct subject-verb
agreement, especially when the target language is morphologically richer than the source language. The main problem is that a single surface verb form in the source language can correspond to
many surface verb forms in the target language. To address this problem, we built a
hybrid SMT system that augments source verbs with extra linguistic information drawn from their
source-language context. This information, in the form of labels attached to verbs that indicate
person and number, creates a closer association between a verb from the source and a verb in the
target language. We applied this preprocessing approach to English as the source language and built an
SMT system for translation into French. Across a range of experiments, our augmented SMT system
improves translation quality over a Moses baseline engine, on both automatic and manual evaluations, in the majority of cases where subject-verb agreement was
previously translated incorrectly.
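
As an illustration of the source-side augmentation described above, the following is a minimal sketch and not the system's actual preprocessing. It assumes the input is already tagged with Penn Treebank part-of-speech tags, handles pronoun subjects only, and copies the person and number of the nearest preceding subject pronoun onto each finite verb. The label inventory (`1sg`, `3pl`, ...), the `_` separator, and the function name `augment_verbs` are hypothetical choices made for this example.

```python
# Hypothetical person/number labels for English subject pronouns.
# "you" is ambiguous in number, so only the person is recorded.
PRONOUN_LABELS = {
    "i": "1sg", "you": "2", "he": "3sg", "she": "3sg", "it": "3sg",
    "we": "1pl", "they": "3pl",
}

# Penn Treebank tags for finite verb forms (assumed tag set).
FINITE_VERB_TAGS = {"VBP", "VBZ", "VBD"}


def augment_verbs(tagged_tokens):
    """Attach a person/number label to each finite verb.

    tagged_tokens is a list of (word, pos_tag) pairs. The label comes from
    the nearest preceding subject pronoun; verbs with no such pronoun are
    left unchanged. This is a toy heuristic: it ignores noun-phrase
    subjects and may mislabel a verb that follows an object pronoun.
    """
    output = []
    current_label = None
    for word, tag in tagged_tokens:
        if tag == "PRP" and word.lower() in PRONOUN_LABELS:
            current_label = PRONOUN_LABELS[word.lower()]
            output.append(word)
        elif tag in FINITE_VERB_TAGS and current_label is not None:
            output.append(f"{word}_{current_label}")
        else:
            output.append(word)
    return output


sentence = [("We", "PRP"), ("speak", "VBP"), ("French", "NNP")]
print(" ".join(augment_verbs(sentence)))
```

With labels of this kind, the English form "speak" is split into distinct source tokens such as `speak_1sg` and `speak_1pl`, so that each can align with a single French form ("parle", "parlons") during training.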