DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

Dacura: a new solution to data harvesting and knowledge extraction for the historical sciences

Peregrine, Peter N., Brennan, Rob (ORCID: 0000-0001-6546-6408), Currie, Thomas, Feeney, Kevin, François, Pieter, Turchin, Peter (ORCID: 0000-0002-1292-8100) and Whitehouse, Harvey (2018) Dacura: a new solution to data harvesting and knowledge extraction for the historical sciences. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 51 (3). pp. 165-174. ISSN 0161-5440

New advances in computer science address problems historical scientists face in gathering and evaluating the now vast data sources available through the Internet. As an example, we introduce Dacura, a dataset curation platform designed to assist historical researchers in harvesting, evaluating, and curating high-quality information sets from the Internet and other sources. Dacura uses semantic knowledge graph technology to represent data as complex, inter-related knowledge, allowing rapid search and retrieval of highly specific data without the need for a lookup table. Dacura automates the generation of tools to help non-experts curate high-quality knowledge bases over time and to integrate data from multiple sources into its curated knowledge model. Together these features allow rapid harvesting and automated evaluation of Internet resources. We provide an example of Dacura in practice as the software employed to populate and manage the Seshat databank.
Item Type: Article (Published)
Uncontrolled Keywords: Data harvesting; RDF triplestore; database ontology; database metamodels; data curation
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Publisher: Taylor & Francis
Official URL: https://doi.org/10.1080/01615440.2018.1443863
Copyright Information: © 2018 Taylor & Francis
Funders: ALIGNED project funded by European Union Horizon 2020 program under project number 644055, ADAPT Centre for Digital Content Technology, SFI Research Centres Programme (Grant 13/RC/2106) co-funded by the European Regional Development Fund
ID Code: 22983
Deposited On: 15 Feb 2019 12:55 by Thomas Murtagh. Last Modified: 20 Sep 2019 03:30

Full text available as:

PDF: HistoricalMethodsJournal2018Metamodel_paper_revised2.pdf

