DORAS
DCU Online Research Access Service
Automatic Translation, Context, and Supervised Learning in Comparative Politics

Courtney, Michael, Breen, Michael (ORCID: 0000-0002-5857-9938), McMenamin, Iain (ORCID: 0000-0002-1704-390X) and McNulty, Gemma (ORCID: 0000-0002-6909-6958) (2020) Automatic Translation, Context, and Supervised Learning in Comparative Politics. Journal of Information Technology and Politics. ISSN 1933-1681

Abstract

This paper proves that automatic translation of multilingual newspaper documents deters neither human nor computer classification of political concepts. We show how theory-driven coding of newspaper text can be automated in several languages by monolingual researchers. Supervised machine learning is successfully applied to text in English from British, Spanish and German sources. The paper has three main findings. First, results from human coding directly in a foreign language do not differ from coding computer-translated text. Second, humans can code translated text as well as they can code untranslated prose in their mother tongue. Third, machine learning based on translated Spanish and German training sets can reproduce human coding as accurately as a system learning from English training sets.

Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: automatic translation; supervised learning; google translate; media; newspapers; comparative politics; text analysis; political text
Subjects: Computer Science > Machine learning; Computer Science > Machine translating; Humanities > Translating and interpreting; Social Sciences > International relations; Social Sciences > Mass media; Social Sciences > Political science
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Humanities and Social Science > School of Law and Government
Publisher: Taylor & Francis
Official URL: http://dx.doi.org/10.1080/19331681.2020.1731245
Copyright Information: © 2020 Taylor & Francis
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
Funders: Irish Research Council, Grant Number GOIPD/2016/253
ID Code: 24233
Deposited On: 21 Feb 2020 10:20 by Michael Breen. Last Modified 28 Feb 2022 13:47
