Attribute Value Reordering for Efficient Hybrid OLAP

From National Research Council Canada

Author	Kaser, O.; Lemire, Daniel
Format	Text, Article
Conference	The Sixth International Workshop on Data Warehousing and OLAP (DOLAP'03), November 7, 2003, New Orleans, Louisiana, USA
Subject	data cubes; normalization; chunking; multidimensional binary arrays; OLAP; MOLAP
Abstract	The normalization of a data cube is the process of choosing an ordering for the attribute values, and the chosen ordering will affect the physical storage of the cube's data. For large multidimensional arrays, proper normalization can lead to more efficient storage in hybrid OLAP contexts that store dense and sparse chunks differently. We show that it is NP-hard to compute an optimal normalization even for 1×3 chunks, although we find an exact algorithm for 1×2 chunks. When attributes are nearly statistically independent, we show that an optimal normalization is given by dimension-wise attribute frequency sorting, which can be done in time O(dn log(n)) for data cubes of size n^d. When attributes are not independent, we propose and evaluate a number of heuristics. Our optimized hybrid OLAP storage mechanism was observed to be 44% more storage efficient than ROLAP and the gains due to normalization alone accounted for 45% of this increase in efficiency.
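The dimension-wise attribute frequency sorting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of the cube as a list of coordinate tuples for allocated cells, the choice of descending order, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def frequency_sort_normalization(cells, d):
    """Build one relabeling per dimension by sorting attribute values by frequency.

    cells: iterable of d-tuples, one per allocated (non-empty) cell (assumed layout).
    Returns a list of d dicts mapping each original attribute value to its new index.
    Values that never occur in `cells` receive no index in this sketch.
    """
    cells = list(cells)
    mappings = []
    for dim in range(d):
        # Count how many allocated cells use each attribute value in this dimension.
        counts = Counter(cell[dim] for cell in cells)
        # Most frequent value first (assumed direction); ties broken by original value.
        ordered = sorted(counts, key=lambda v: (-counts[v], v))
        mappings.append({value: rank for rank, value in enumerate(ordered)})
    return mappings


def apply_normalization(cells, mappings):
    """Relabel every cell coordinate with the per-dimension mappings."""
    return sorted(tuple(m[v] for m, v in zip(mappings, cell)) for cell in cells)


# Hypothetical 3x3 cube with four allocated cells.
example_cells = [(0, 2), (1, 2), (2, 0), (2, 2)]
example_mappings = frequency_sort_normalization(example_cells, 2)
example_normalized = apply_normalization(example_cells, example_mappings)
```

Each dimension is handled independently with one counting pass and one sort, which is why the sorting step scales with the number of dimensions times the cost of sorting one dimension's values.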
Publication date	2003
In	The Sixth International Workshop on Data Warehousing and OLAP (DOLAP'03) [Proceedings].
Language	English
NRC number	NRCC 46510
NPARC number	5765320
Record identifier	5de58bb0-1836-41b1-9308-d2d95b0435a2
Record created	2009-03-29
Record modified	2021-01-05

Date modified: 2024-07-27