Data cube computational model with Hadoop MapReduce
Wang, Bo, Gui, Hao, Roantree, Mark and O'Connor, Martin F.
(2014)
Data cube computational model with Hadoop MapReduce.
In: 10th International Conference on Web Information Systems and Technologies (WEBIST 2014), 3-5 Apr 2014, Barcelona, Spain.
ISBN 978-989-758-023-9
XML has become a widely used and well-structured data format for digital document handling and message transmission. To find useful knowledge in XML data, data warehouse and OLAP applications that support decision making need to be developed. Apache Hadoop is an open-source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we discuss an XML data cube model that offers complete views of XML data, and we present a basic algorithm for building the cube on Hadoop. To improve efficiency, we also propose an optimized algorithm better suited to this kind of XML data. The experimental results reported in the paper demonstrate the effectiveness of our optimization strategies.
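To make the idea of building a data cube in MapReduce style concrete, here is a minimal single-process sketch in Python. It is not the algorithm from the paper, which targets XML data on Hadoop. It only illustrates the general naive approach, assuming flat records: the map step emits one key per subset of dimensions, and the reduce step sums the measures that share a key. The function names, the `ALL` placeholder, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

ALL = "*"  # hypothetical placeholder for a dimension that has been aggregated away


def map_record(record, dimensions, measure):
    """Map step: emit one (cell key, measure) pair for every subset of dimensions."""
    n = len(dimensions)
    for k in range(n + 1):
        for kept in combinations(range(n), k):
            key = tuple(
                record[d] if i in kept else ALL
                for i, d in enumerate(dimensions)
            )
            yield key, record[measure]


def reduce_cells(pairs):
    """Shuffle and reduce step: sum the measures that share a cell key."""
    cells = defaultdict(int)
    for key, value in pairs:
        cells[key] += value
    return dict(cells)


def build_cube(records, dimensions, measure):
    """Run the map step over all records, then reduce the emitted pairs."""
    pairs = (
        pair
        for record in records
        for pair in map_record(record, dimensions, measure)
    )
    return reduce_cells(pairs)


# Hypothetical sample data, invented for illustration only.
records = [
    {"year": "2013", "topic": "xml", "count": 2},
    {"year": "2014", "topic": "xml", "count": 3},
    {"year": "2014", "topic": "olap", "count": 1},
]
cube = build_cube(records, ["year", "topic"], "count")
```

Each record produces 2^n pairs for n dimensions, so the intermediate data grows quickly. That cost is the usual motivation for optimized cube algorithms, although the specific optimizations proposed in the paper are not reproduced here.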