DORAS | DCU Research Repository

Knowledge and pre-trained language models inside and out: a deep-dive into datasets and external knowledge

Lyu, Chenyang (2023) Knowledge and pre-trained language models inside and out: a deep-dive into datasets and external knowledge. PhD thesis, Dublin City University.

Abstract
Pre-trained Language Models (PLMs) have greatly advanced the performance of various NLP tasks and have undoubtedly been serving as foundation models for this field. These pre-trained models are able to capture rich semantic patterns from large-scale text corpora and learn high-quality representations of texts. However, such models still have shortcomings: they underperform on tasks whose understanding requires implicit external knowledge, which is difficult to learn with commonly employed pre-training objectives. Moreover, a comprehensive understanding of how PLMs learn knowledge during the fine-tuning phase is still lacking. Therefore, in order to address the aforementioned challenges, we propose a set of approaches for injecting external knowledge into PLMs and conduct experiments investigating their behaviour in learning knowledge during the fine-tuning phase, focusing primarily on Sentiment Analysis, Question Answering and Video Question Answering. Specifically, we introduce novel approaches that explicitly use the textual historical reviews of users and products to improve sentiment analysis. To overcome the problems of context-question lexical overlap and data scarcity in question generation, we propose a novel method that combines linguistic and semantic knowledge with heuristics. Additionally, we explore how to utilise multimodal (visual and acoustic) information and knowledge to improve Video Question Answering. Experiments conducted on benchmark datasets show that our proposed approaches achieve superior performance compared to state-of-the-art models, demonstrating the effectiveness of our methods for injecting external knowledge. Furthermore, we conduct a set of experiments investigating how PLMs learn knowledge for question answering under various scenarios. Results reveal that the internal characteristics of QA datasets can introduce strong biases when PLMs learn from downstream task datasets. Finally, we present an in-depth discussion of future directions for improving PLMs with external knowledge.
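For illustration only, the following is a minimal sketch (in Python with the Hugging Face transformers library, not code from the thesis) of the general knowledge-injection idea the abstract describes for sentiment analysis: concatenating a user's historical reviews with the current input so a pre-trained classifier can attend over both. The checkpoint name, separator, and example texts are all illustrative assumptions.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Illustrative checkpoint only; the thesis does not specify this model.
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Hypothetical historical reviews (the "external knowledge") and a new review.
history = [
    "Great battery life on my last phone from this brand.",
    "Their support team resolved my warranty issue quickly.",
]
review = "The new model feels cheap and the screen scratches easily."

# Inject the history by concatenating it with the current input, so the
# classifier attends over both the user's past reviews and the new text.
text = " [SEP] ".join(history + [review])
inputs = tokenizer(text, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(-1).item()])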
Metadata
Item Type: Thesis (PhD)
Date of Award: November 2023
Refereed: No
Supervisor(s): Foster, Jennifer and Graham, Yvette
Subjects: Computer Science > Artificial intelligence
Computer Science > Computational linguistics
Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 License.
Funders: Science Foundation Ireland
ID Code: 28948
Deposited On: 02 Nov 2023 15:42 by Jennifer Foster. Last Modified: 02 Nov 2023 15:42
Documents

Full text available as:

PDF (PhD_thesis_Chenyang_Lyu.pdf) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution-NonCommercial-No Derivative Works 4.0
10MB