DORAS | DCU Research Repository

Diachronic analysis of entities by exploiting Wikipedia page revisions

Basile, Pierpaolo (ORCID: 0000-0002-0545-1105), Caputo, Annalina (ORCID: 0000-0002-7144-8545), Lawless, Séamus (ORCID: 0000-0001-6302-258X) and Semeraro, Giovanni (ORCID: 0000-0001-6883-1853) (2019) Diachronic analysis of entities by exploiting Wikipedia page revisions. In: Recent Advances in Natural Language Processing, 2-4 Sept 2019, Varna, Bulgaria.

Abstract
In the last few years, the increasing availability of large corpora spanning several time periods has opened new opportunities for the diachronic analysis of language. This type of analysis can bring to light linguistic phenomena related to the shift of word meanings over time, and it can also be used to study the impact that societal and cultural trends have on language change. This paper introduces a new resource, built upon Wikipedia page revisions, for the diachronic analysis of named entities. By analysing the whole history of Wikipedia internal links, the resource enables the analysis over time of changes in the relations between entities (concepts), surface forms (words), and the contexts surrounding entities and surface forms. We present use cases that illustrate the value of this resource for diachronic studies and outline some possible future uses.
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Subjects: Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Research Institutes and Centres > ADAPT
Published in: Proceedings of Recent Advances in Natural Language Processing.
Official URL: https://dx.doi.org/10.26615/978-954-452-056-4_011
Copyright Information: © 2019 The Authors. Open Access (CC-BY 4.0)
Funders: “TALIA - Territorial Appropriation of Leading-edge Innovation Action” project, Interreg-Mediterranean program for increasing transnational activity of innovative clusters and networks of key sectors of the MED area (2018-2019), ADAPT Centre for Digital Content Technology, SFI - Science Foundation Ireland Research Centres Programme (Grant SFI 13/RC/2106) and co-funded under the European Regional Development Fund
ID Code: 27595
Deposited On: 19 Aug 2022 16:02 by Thomas Murtagh. Last Modified: 08 Feb 2023 15:51
Documents

Full text available as:

RANLP011.pdf (PDF, 392kB)