Automatic extraction of Arabic multiword expressions
Attia, Mohammed and Tounsi, Lamia and Pecina, Pavel and van Genabith, Josef and Toral, Antonio (2010) Automatic extraction of Arabic multiword expressions. In: the 7th Conference on Language Resources and Evaluation (LREC 2010), May 2010, Valletta (Malta).
In this paper we investigate the automatic acquisition of Arabic Multiword Expressions (MWE). We propose three complementary approaches to extract MWEs from available data resources. The first approach relies on the correspondence asymmetries between Arabic Wikipedia titles and titles in 21 different languages. The second approach collects English MWEs from Princeton WordNet 3.0, translates the collection into Arabic using Google Translate, and utilizes different search engines to validate the output. The third uses lexical association measures to extract MWEs from a large unannotated corpus. We experimentally explore the feasibility of each approach and measure the quality and coverage of the output against gold standards.
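As a rough illustration of the third approach, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI), one commonly used lexical association measure. The abstract does not say which measures the authors applied, so the choice of PMI, the function name `pmi_bigrams`, and the `min_count` threshold are all assumptions made for illustration. The input is assumed to be an already tokenized corpus; Arabic-specific tokenization and normalization are not handled here.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; not the authors' implementation.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs in the corpus

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs give unreliable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs are the strongest MWE candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Pairs near the top of the returned list co-occur far more often than their individual frequencies would predict, which makes them candidates for manual or gold-standard validation rather than confirmed MWEs.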