DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Leveraging machine translation for cross-lingual fine-grained cyberbullying classification amongst pre-adolescents

Verma, Kanishk (ORCID: 0000-0001-7172-4098), Popović, Maja (ORCID: 0000-0001-8234-8745), Poulis, Alexandros, Cherkasova, Yelena, Ó hÓbáin, Cathal, Mazzone, Angela (ORCID: 0000-0002-5858-8033), Milosevic, Tijana (ORCID: 0000-0003-1502-7479) and Davis, Brian (ORCID: 0000-0002-5759-2655) (2022) Leveraging machine translation for cross-lingual fine-grained cyberbullying classification amongst pre-adolescents. Natural Language Engineering, 29 (6). ISSN 1351-3249

Abstract
Cyberbullying is the wilful and repeated infliction of harm on an individual using the Internet and digital technologies. Similar to face-to-face bullying, cyberbullying can be captured formally using the Routine Activities Model (RAM), whereby the potential victim and bully are brought into proximity of one another via interaction on online social networking (OSN) platforms. Although the impact of the COVID-19 (SARS-CoV-2) restrictions on the online presence of minors has yet to be fully grasped, studies have reported that 44% of pre-adolescents have encountered more cyberbullying incidents during the COVID-19 lockdown. Transparency reports shared by OSN companies indicate an increase in take-downs of cyberbullying-related comments, posts or content by artificial intelligence moderation tools. However, in order to efficiently and effectively detect or identify whether a social media post or comment qualifies as cyberbullying, a number of factors based on the RAM must be taken into account, including the identification of cyberbullying roles and forms. This demands the acquisition of large amounts of fine-grained annotated data, which is costly and ethically challenging to produce. In addition, where fine-grained datasets do exist, they may be unavailable in the target language. Manual translation is costly; however, state-of-the-art neural machine translation offers a workaround. This study presents a first-of-its-kind experiment in leveraging machine translation to automatically translate a unique pre-adolescent cyberbullying gold standard dataset in Italian with fine-grained annotations into English for training and testing a native binary classifier for pre-adolescent cyberbullying.
In addition to contributing a high-quality English reference translation of the source gold standard, our experiments indicate that the performance of our target binary classifier when trained on machine-translated English output is on par with the source (Italian) classifier.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: Corpus linguistics; Text classification; Machine translation; Cyberbullying; Language resources
Subjects: Computer Science > Computational linguistics
Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Institutes and Centres > ADAPT
Research Institutes and Centres > Anti-Bullying Research Centre (ABC)
Publisher: Cambridge University Press
Official URL: https://doi.org/10.1017/S1351324922000341
Copyright Information: © 2022 The Authors.
Funders: Irish Research Council grant number EPSPG/2021/161, Google, Ireland, Facebook/Meta Content Policy Award, Phase 2: Co-designing with children, European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 801522, Science Foundation Ireland through the ADAPT Centre for Digital Content Technology grant number 13/RC/2106_P2, European Regional Development Fund
ID Code: 28372
Deposited On: 26 May 2023 09:37 by Maja Popovic. Last Modified 19 Jan 2024 16:58
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0
477kB