DORAS | DCU Research Repository

Categorising Corruption in the Vaccine Discourse: A General Taxonomy, Data Set, and Evaluation of LLMs for Classifying Corruption Dialogue in Social Media

dos Santos, Vitor Gaboardi, Santos, Guto Leoni (ORCID: 0000-0002-0257-4214), Egli, Antonia (ORCID: 0000-0002-0151-0884), Kahvazadeh, Estatira, Doolin, Bill, Endo, Patricia Takako (ORCID: 0000-0002-9163-5583) and Lynn, Theo (ORCID: 0000-0001-9284-7580) (2024) Categorising Corruption in the Vaccine Discourse: A General Taxonomy, Data Set, and Evaluation of LLMs for Classifying Corruption Dialogue in Social Media. In: International Conference on Advances in Social Networks Analysis and Mining. ISBN 978-3-031-78541-2

Abstract
Real or perceived corruption can have a damaging effect on health care services and outcomes. In particular, research suggests perceived corruption had a significant impact on COVID-19 vaccination. Given the role of social media in health communications, identifying and understanding perceived corruption related to vaccines and vaccination is critical to build societal cohesion and public trust in health institutions and strategies, manage and combat misinformation and disinformation, and design more effective policies, interventions, and communications strategies. There is a dearth of research on binary and multi-class classification of corruption dialogues in health or otherwise. We address this gap by introducing a general hierarchical corruption dialogue taxonomy (HCDT) and formulating binary and multi-class classification tasks based on the HCDT. We also create a vaccine-specific labelled dataset for each task, and fine-tune three large language models (BERT, RoBERTa, and BERTweet) based on these datasets. We evaluate the performance of these models in the binary and multi-class classification tasks. While all models performed similarly for the binary task, RoBERTa performed best for multi-class classification of corruption dialogue.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Corruption, Large Language Models, BERT, Twitter, multi-class classification, vaccine, COVID-19
Subjects: Computer Science > World Wide Web; Social Sciences > Globalization
DCU Faculties and Centres: DCU Faculties and Schools > DCU Business School
Published in: Social Networks Analysis and Mining. ASONAM 2024. Lecture Notes in Computer Science 15211. Springer, Cham. ISBN 978-3-031-78541-2
Publisher: Springer, Cham
Official URL: https://link.springer.com/chapter/10.1007/978-3-03...
Copyright Information: Authors
ID Code: 32857
Deposited On: 02 Jul 2026 10:56 by Tam Nguyen. Last Modified: 02 Jul 2026 10:57
Documents

Full text available as:

PDF: Categorising Corruption in the Vaccine Discourse.pdf (764kB)
Creative Commons: Attribution 4.0