Extending the scope of out-of-domain: examining QA models in multiple subdomains

Lyu, Chenyang ORCID: 0009-0002-6733-5879, Foster, Jennifer ORCID: 0000-0002-7789-4853 and Graham, Yvette ORCID: 0000-0001-6741-4855 (2022) Extending the scope of out-of-domain: examining QA models in multiple subdomains. In: Third Workshop on Insights from Negative Results in NLP, 26 May 2022, Dublin, Ireland.


Abstract

Past work investigating the out-of-domain performance of QA systems has mainly focused on general domains (e.g. the news domain, the Wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets. In this paper, we extend the scope of "out-of-domain" by splitting QA examples into different subdomains according to their internal characteristics, including question type, text length, and answer position. We then examine the performance of QA systems trained on data from different subdomains. Experimental results show that the performance of QA systems can be significantly reduced when the training data and test data come from different subdomains. These results question the generalizability of current QA systems across multiple subdomains, suggesting the need to combat the bias introduced by the internal characteristics of QA datasets.
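
As an illustration of the kind of split the abstract describes, the following is a minimal sketch of grouping SQuAD-style QA examples into subdomains by question type, context length, and answer position. It is not the authors' implementation: the question-word list, the 100-token length threshold, the front/back halving of the context, and all function and field names are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical question-word list; the paper's actual categories may differ.
QUESTION_WORDS = ("what", "who", "when", "where", "which", "why", "how")


def question_type(question):
    """Return the first question word found in the question, else 'other'."""
    for token in question.lower().split():
        word = token.strip("?,.")
        if word in QUESTION_WORDS:
            return word
    return "other"


def length_bucket(context, threshold=100):
    """Label a context 'short' or 'long' by whitespace token count.

    The threshold of 100 tokens is an assumption for illustration.
    """
    return "short" if len(context.split()) < threshold else "long"


def answer_position(context, answer_start):
    """Label the answer 'front' or 'back' by its character offset in the context."""
    return "front" if answer_start < len(context) / 2 else "back"


def split_by(examples, key_fn):
    """Group examples into subdomains keyed by key_fn(example)."""
    groups = defaultdict(list)
    for ex in examples:
        groups[key_fn(ex)].append(ex)
    return dict(groups)


# Toy SQuAD-style examples (field names are assumptions).
examples = [
    {"question": "Who wrote Ulysses?",
     "context": "Ulysses was written by James Joyce.",
     "answer_start": 23},
    {"question": "In which city is DCU located?",
     "context": "Dublin is home to DCU.",
     "answer_start": 0},
]

by_type = split_by(examples, lambda ex: question_type(ex["question"]))
by_length = split_by(examples, lambda ex: length_bucket(ex["context"]))
by_position = split_by(
    examples, lambda ex: answer_position(ex["context"], ex["answer_start"])
)
```

A model could then be trained on one group (for example, `by_type["who"]`) and evaluated on another to measure the cross-subdomain performance gap the abstract refers to.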

Metadata

Item Type:	Conference or Workshop Item (Paper)
Event Type:	Workshop
Refereed:	Yes
Subjects:	Computer Science > Artificial intelligence; Computer Science > Computational linguistics; Computer Science > Machine learning
DCU Faculties and Centres:	DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Published in:	Proceedings of the Third Workshop on Insights from Negative Results in NLP. Association for Computational Linguistics (ACL).
Publisher:	Association for Computational Linguistics (ACL)
Official URL:	https://doi.org/10.18653/v1/2022.insights-1.4
Copyright Information:	© 2022 Association for Computational Linguistics.
Funders:	Science Foundation Ireland through the SFI Centre for Research Training in Machine Learning (18/CRT/6183)
ID Code:	29144
Deposited On:	19 Oct 2023 13:21 by Jennifer Foster. Last Modified 19 Oct 2023 13:21

Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-Noncommercial 4.0
401kB

