Scriney, Michael ORCID: 0000-0001-6813-2630, McCarthy, Suzanne ORCID: 0000-0002-0695-7052, McCarren, Andrew ORCID: 0000-0002-7297-0984, Cappellari, Paolo and Roantree, Mark ORCID: 0000-0002-1329-2570 (2019) Automating data mart construction from semi-structured data sources. Computer Journal, 62 (3). pp. 394-413. ISSN 0010-4620
Abstract
The global food and agricultural industry had a total market value of USD 8 trillion in 2016, and decision makers in the Agri sector require appropriate tools and up-to-date information to make predictions across a range of products and areas. Traditionally, these requirements are met with information processed into a data warehouse and data marts constructed for analyses. Increasingly, however, data is coming from outside the enterprise, often in unprocessed forms. As these sources are outside the control of companies, they are prone to change and new sources may appear. In these cases, the process of accommodating these sources can be costly and very time consuming. Automating this process requires a sufficiently robust Extract-Transform-Load (ETL) process, a mapping of external sources to some form of ontology, and an integration process to merge the specific data sources. In this paper, we present an approach to automating the integration of data sources in an Agri environment, where new sources are examined before an attempt to merge them with existing data marts. Our validation uses three separate case studies of real-world data to demonstrate the robustness of our approach and the efficiency of materialising data marts.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | Data Model Transformation; Semi-structured data; ETL; Data Marts |
Subjects: | UNSPECIFIED |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics |
Publisher: | Oxford University Press |
Official URL: | https://dx.doi.org/10.1093/comjnl/bxy064 |
Copyright Information: | © 2018 The British Computer Society. |
Funders: | Science Foundation Ireland under grant number [SFI/12/RC/2289] |
ID Code: | 27598 |
Deposited On: | 22 Aug 2022 09:47 by Mark Roantree. Last Modified 22 Aug 2022 09:47 |
Documents
Full text available as:
PDF (649kB). Creative Commons: Attribution-Noncommercial 3.0