DORAS | DCU Research Repository

Extending the scope of out-of-domain: examining QA models in multiple subdomains

Lyu, Chenyang (ORCID: 0009-0002-6733-5879), Foster, Jennifer (ORCID: 0000-0002-7789-4853) and Graham, Yvette (ORCID: 0000-0001-6741-4855) (2022) Extending the scope of out-of-domain: examining QA models in multiple subdomains. In: Third Workshop on Insights from Negative Results in NLP, 26 May 2022, Dublin, Ireland.

Abstract
Past work that investigates out-of-domain performance of QA systems has mainly focused on general domains (e.g. news domain, Wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets. In this paper, we extend the scope of "out-of-domain" by splitting QA examples into different subdomains according to their internal characteristics, including question type, text length, and answer position. We then examine the performance of QA systems trained on the data from different subdomains. Experimental results show that the performance of QA systems can be significantly reduced when the training data and test data come from different subdomains. These results question the generalizability of current QA systems in multiple subdomains, suggesting the need to combat the bias introduced by the internal characteristics of QA datasets.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Subjects: Computer Science > Artificial intelligence; Computer Science > Computational linguistics; Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in: Proceedings of the Third Workshop on Insights from Negative Results in NLP. Association for Computational Linguistics (ACL).
Publisher: Association for Computational Linguistics (ACL)
Official URL: https://doi.org/10.18653/v1/2022.insights-1.4
Copyright Information: © 2022 Association for Computational Linguistics.
Funders: Science Foundation Ireland through the SFI Centre for Research Training in Machine Learning (18/CRT/6183)
ID Code: 29144
Deposited On: 19 Oct 2023 13:21 by Jennifer Foster. Last Modified: 19 Oct 2023 13:21
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-Noncommercial 4.0
401kB