Digital Library

cab1

 
Title:      RUN-LENGTH ENCODING APPLIED TO GRID SPACE STORING AND INDEXING
Author(s):      Frank Wang , Na Helian , Yau Jim Yip
ISBN:      972-9027-53-6
Editors:      Pedro Isaías
Year:      2002
Edition:      Single
Keywords:      Data warehousing, databases, index, grid space, run-length encoding .
Type:      Full Paper
First Page:      461
Last Page:      471
Language:      English
Cover:      cover          
Full Contents:      click to dowload Download
Paper Abstract:      In data warehousing it is common for users to want 20-40 columns of data to be searchable. In database marketing, it can be 100's of attributes. Each query to mine the data may use a different combination of criteria. In this article, run-length encoding transforms the record space of all k key attributes into a one-dimensional run-length array. Such a compression scheme is proven to be compatible with common database access operations. As a result, in a bitmap space of k attributes, just k one-dimensional attribute arrays plus a one-dimensional run-length array need to be maintained. Physical database records do not exist as a whole because they are resolved after being input to the system and absorbed by those k+1 one-dimensional arrays. The storage overhead has been greatly reduced due to the limited size of those one-dimensional arrays: the size of the attribute array is equal to the cardinality of its attribute domain; the size of the run-length array is equal to, in the worst case, or very likely much smaller than the total number of the database records due to the probability of “1”s appearing next to each other in the bitmap space. A fully specified query retrieves a single record in at most two disk-block accesses and potentially at much faster speed as all those k+1 one-dimensional arrays can easily reside in the main memory. The experimental simulation confirms that the search, insertion and deletion operations can be executed efficiently in a compressed bitmap.
   

Social Media Links

Search

Login