DORAS | DCU Research Repository

Automating data mart construction from semi-structured data sources

Scriney, Michael (ORCID: 0000-0001-6813-2630), McCarthy, Suzanne (ORCID: 0000-0002-0695-7052), McCarren, Andrew (ORCID: 0000-0002-7297-0984), Cappellari, Paolo and Roantree, Mark (ORCID: 0000-0002-1329-2570) (2019) Automating data mart construction from semi-structured data sources. Computer Journal, 62 (3). pp. 394-413. ISSN 0010-4620

The global food and agricultural industry had a total market value of USD 8 trillion in 2016, and decision makers in the Agri sector require appropriate tools and up-to-date information to make predictions across a range of products and areas. Traditionally, these requirements are met with information processed into a data warehouse and data marts constructed for analyses. Increasingly, however, data is coming from outside the enterprise and often in unprocessed forms. As these sources are outside the control of companies, they are prone to change and new sources may appear. In these cases, the process of accommodating these sources can be costly and very time consuming. Automating this process requires a sufficiently robust Extract-Transform-Load (ETL) process, a mapping of external sources to some form of ontology, and an integration process to merge the specific data sources. In this paper, we present an approach to automating the integration of data sources in an Agri environment, where new sources are examined before an attempt is made to merge them with existing data marts. Our validation uses three separate case studies of real-world data to demonstrate the robustness of our approach and the efficiency of materialising data marts.
Item Type: Article (Published)
Uncontrolled Keywords: Data Model Transformation; Semi structured data; ETL; Data Marts
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics
Publisher: Oxford University Press
Official URL: https://dx.doi.org/10.1093/comjnl/bxy064
Copyright Information: © 2018 The British Computer Society.
Funders: Science Foundation Ireland under grant number [SFI/12/RC/2289]
ID Code: 27598
Deposited On: 22 Aug 2022 09:47 by Mark Roantree. Last Modified: 22 Aug 2022 09:47

Full text available as:

Roantree-Computer_Journal-2019.pdf (PDF; requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader)
Creative Commons: Attribution-Noncommercial 3.0

