Lyu, Chenyang ORCID: 0009-0002-6733-5879, Foster, Jennifer ORCID: 0000-0002-7789-4853 and Graham, Yvette ORCID: 0000-0001-6741-4855 (2022) Extending the scope of out-of-domain: examining QA models in multiple subdomains. In: 3rd Workshop on Insights from Negative Results in NLP, Insights 2022, 26 May 2022, Dublin, Ireland. ISBN 978-195591740-7
Abstract
Past work that investigates out-of-domain performance of QA systems has mainly focused on general domains (e.g. the news domain or the Wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets. In this paper, we extend the scope of “out-of-domain” by splitting QA examples into different subdomains according to their internal characteristics, including question type, text length, and answer position. We then examine the performance of QA systems trained on data from different subdomains. Experimental results show that the performance of QA systems can be significantly reduced when the training data and test data come from different subdomains. These results question the generalizability of current QA systems across multiple subdomains, suggesting the need to combat the bias introduced by the internal characteristics of QA datasets.
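As a rough illustration of the subdomain split described in the abstract, the following Python sketch groups QA examples by question type, text length, and answer position. It is not the authors' implementation: the SQuAD-style field names (`question`, `context`, `answer_start`), the question-word list, the 100-token length threshold, and the three-way position split are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical question-word list; the paper's actual categories may differ.
QUESTION_WORDS = ("what", "who", "when", "where", "which", "why", "how")


def question_type(question):
    """Return the first question word found in the question, else 'other'."""
    for token in question.lower().split():
        word = token.strip("?,.")
        if word in QUESTION_WORDS:
            return word
    return "other"


def length_bucket(context, threshold=100):
    """Label a context by whitespace token count (threshold is an assumption)."""
    return "short" if len(context.split()) < threshold else "long"


def answer_position(context, answer_start):
    """Label the third of the context (by characters) where the answer starts."""
    ratio = answer_start / max(len(context), 1)
    if ratio < 1 / 3:
        return "front"
    if ratio < 2 / 3:
        return "middle"
    return "back"


def split_into_subdomains(examples, key_fn):
    """Group examples into subdomains keyed by key_fn(example)."""
    groups = defaultdict(list)
    for example in examples:
        groups[key_fn(example)].append(example)
    return dict(groups)


# Toy examples, invented for illustration only.
examples = [
    {"question": "Who wrote Hamlet?",
     "context": "Hamlet was written by William Shakespeare.",
     "answer_start": 22},
    {"question": "In which city is the Eiffel Tower?",
     "context": "Paris is home to the Eiffel Tower.",
     "answer_start": 0},
]

by_type = split_into_subdomains(examples, lambda ex: question_type(ex["question"]))
by_position = split_into_subdomains(
    examples, lambda ex: answer_position(ex["context"], ex["answer_start"])
)
```

Under these assumptions, a model could be trained on one group (say, `by_type["who"]`) and evaluated on another to probe the cross-subdomain performance drop that the abstract reports.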
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Workshop |
Refereed: | Yes |
Uncontrolled Keywords: | Internal characteristics; News domain; Performance; QA system; Question type; Splittings; Subdomain; Test data; Text length; Wikipedia |
Subjects: | UNSPECIFIED |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Published in: | Insights 2022 - 3rd Workshop on Insights from Negative Results in NLP, Proceedings of the Workshop. Association for Computational Linguistics (ACL). ISBN 978-195591740-7 |
Publisher: | Association for Computational Linguistics (ACL) |
Official URL: | https://www.scopus.com/inward/record.uri?partnerID... |
Copyright Information: | © 2022 Association for Computational Linguistics |
Funders: | Science Foundation Ireland, SFI Centre for Research Training in Machine Learning (18/CRT/6183). |
ID Code: | 29136 |
Deposited On: | 18 Oct 2023 11:39 by Vidatum Academic. Last Modified: 18 Oct 2023 11:39 |
Documents
Full text available as:
PDF (401kB), Creative Commons: Attribution-Noncommercial-Share Alike 4.0