Basile, Pierpaolo ORCID: 0000-0002-0545-1105, Caputo, Annalina ORCID: 0000-0002-7144-8545, Lawless, Séamus ORCID: 0000-0001-6302-258X and Semeraro, Giovanni ORCID: 0000-0001-6883-1853 (2019) Diachronic analysis of entities by exploiting Wikipedia page revisions. In: Recent Advances in Natural Language Processing, 2-4 Sept 2019, Varna, Bulgaria.
Abstract
In the last few years, the increasing availability of large corpora spanning several time periods has opened new opportunities for the diachronic analysis of language. This type of analysis can not only bring to light linguistic phenomena related to shifts in word meaning over time, but can also be used to study the impact that societal and cultural trends have on language change. This paper introduces a new resource, built upon Wikipedia page revisions, for the diachronic analysis of named entities. By analysing the whole history of Wikipedia internal links, the resource enables the analysis over time of changes in the relations between entities (concepts), surface forms (words), and the contexts surrounding entities and surface forms. We present use cases that demonstrate the impact of this resource on diachronic studies and outline possible future uses.
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Subjects: | Computer Science > Machine learning |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Published in: | Proceedings of Recent Advances in Natural Language Processing. |
Official URL: | https://dx.doi.org/10.26615/978-954-452-056-4_011 |
Copyright Information: | © 2019 The Authors. Open Access (CC-BY 4.0) |
Funders: | “TALIA - Territorial Appropriation of Leading-edge Innovation Action” project, Interreg-Mediterranean program for increasing transnational activity of innovative clusters and networks of key sectors of the MED area (2018-2019), ADAPT Centre for Digital Content Technology, SFI - Science Foundation Ireland Research Centres Programme (Grant SFI 13/RC/2106) and co-funded under the European Regional Development Fund |
ID Code: | 27595 |
Deposited On: | 19 Aug 2022 16:02 by Thomas Murtagh. Last Modified 08 Feb 2023 15:51 |
Documents
Full text available as: PDF (392kB)