
DORAS | DCU Research Repository


Leveraging pre-trained language models for gender debiasing

Jain, Nishtha, Popović, Maja (ORCID: 0000-0001-8234-8745), Groves, Declan and Specia, Lucia (ORCID: 0000-0002-5495-3128) (2022) Leveraging pre-trained language models for gender debiasing. In: 13th Language Resources and Evaluation Conference (LREC 2022), 20-25 June 2022, Marseille, France.

Abstract
Studying and mitigating gender and other biases in natural language have become important areas of research from both algorithmic and data perspectives. This paper explores the idea of reducing gender bias in a language generation context by generating gender variants of sentences. Previous work in this field has either been rule-based or required large amounts of gender-balanced training data. These approaches are, however, not scalable across multiple languages, as creating data or rules for each language is costly and time-consuming. This work explores a lightweight method to generate gender variants of a given text using pre-trained language models as the only resource, without any task-specific labelled data. The approach is designed to work across multiple languages with minimal changes in the form of heuristics. To demonstrate this, we tested it on a high-resource language, Spanish, and on a low-resource language from a different family, Serbian. The approach worked very well on Spanish, and while the results were less positive for Serbian, it showed potential even for languages where pre-trained models are less effective.
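The core idea described in the abstract, swapping gendered words and letting a language model choose the most fluent variant, can be illustrated with a toy sketch. This is NOT the paper's implementation: the word lists are hypothetical, and a tiny hand-made bigram scorer stands in for a real pre-trained language model (the paper uses pre-trained LMs and per-language heuristics).

```python
from itertools import product

# Toy illustration (assumed, not the authors' method): "her" maps ambiguously
# to "his" or "him", so we enumerate all candidate variants and rank them by
# a language-model score. A real system would use a pre-trained LM's
# probabilities; BIGRAM below holds made-up scores for demonstration only.
VARIANTS = {
    "he": ["she"], "she": ["he"],
    "his": ["her"], "him": ["her"],
    "her": ["his", "him"],  # ambiguous: possessive vs. object pronoun
}

BIGRAM = {("thanked", "him"): 2.0, ("thanked", "his"): 0.5,
          ("his", "friend"): 2.0, ("him", "friend"): 0.1}

def lm_score(tokens):
    """Sum of bigram scores; a real system would sum LM log-probabilities."""
    return sum(BIGRAM.get(pair, 0.0) for pair in zip(tokens, tokens[1:]))

def gender_variants(sentence):
    """Return all candidate gender variants, best-scoring first."""
    tokens = sentence.lower().split()
    options = [VARIANTS.get(t, [t]) for t in tokens]
    candidates = [" ".join(c) for c in product(*options)]
    return sorted(candidates, key=lambda s: lm_score(s.split()), reverse=True)

print(gender_variants("she thanked her friend")[0])  # -> he thanked his friend
```

The LM resolves the "her" ambiguity: "thanked his ... friend" outscores "thanked him friend", so the possessive reading wins, which is the role the pre-trained model plays in the paper's pipeline.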
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: gender debiasing; language generation; pre-trained language models
Subjects: Computer Science > Computational linguistics; Computer Science > Machine learning; Humanities > Language; Social Sciences > Gender
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT
Published in: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA).
Publisher: European Language Resources Association (ELRA)
Official URL: https://aclanthology.org/2022.lrec-1.235
Copyright Information: © European Language Resources Association (ELRA)
ID Code: 28365
Deposited On: 25 May 2023 11:27 by Maja Popović. Last Modified: 25 May 2023 11:27
Documents

Full text available as:

PDF (2022.lrec-1.235.pdf, 255kB) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-Noncommercial 4.0