
DORAS | DCU Research Repository


Constructing data marts from web sources using a graph common model

Scriney, Michael (ORCID: 0000-0001-6813-2630) (2018) Constructing data marts from web sources using a graph common model. PhD thesis, Dublin City University.

At a time when humans and devices are generating more information than ever, activities such as data mining and machine learning become crucial. These activities enable us to understand and interpret the information we have and to predict, or better prepare ourselves for, future events. However, data mining cannot be performed without a layer of data management to clean, integrate, process and make available the necessary datasets. To that end, large and costly data flow processes such as Extract-Transform-Load are needed to extract data from disparate information sources and generate analysis-ready datasets. These datasets generally take the form of multi-dimensional cubes, from which different data views can be extracted for different analyses. Creating a multi-dimensional cube from integrated data sources is a significant undertaking. In this research, we present a methodology to generate these cubes automatically or, in some cases, close to automatically, requiring very little user interaction. A construct called a StarGraph acts as the canonical model for our system, into which imported data sources are transformed. An ontology-driven process controls the integration of StarGraph schemas, and simple OLAP-style functions generate the cubes or datasets. An extensive evaluation is carried out using a large number of agri data sources, with user-defined case studies identifying the sources for integration and the types of analyses required for the final data cubes.
Item Type: Thesis (PhD)
Date of Award: November 2018
Supervisor(s): Roantree, Mark
Uncontrolled Keywords: Data Analytics; Data Warehousing; Data Integration; ETL
Subjects: Computer Science > Software engineering
DCU Faculties and Centres: Research Institutes and Centres > INSIGHT Centre for Data Analytics; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 3.0 License.
Funders: This work is supported by Science Foundation Ireland under grant number [SFI/12/RC/2289]
ID Code: 22387
Deposited On: 21 Nov 2018 10:08 by Michael John Scriney. Last Modified: 23 Aug 2019 08:57

Full text available as:

Thesis_final.pdf (PDF; requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader)

