Digital Library

Title:      OUT-OF-CORE DATA HANDLING WITH PERIODIC PARTIAL RESULT MERGING
Author(s):      Sándor Juhász, Renáta Iváncsy
ISBN:      978-972-8924-88-1
Editors:      Ajith P. Abraham
Year:      2009
Edition:      Single
Keywords:      Out-of-core data processing, partitioning, efficient data handling with checkpoints
Type:      Full Paper
First Page:      50
Last Page:      58
Language:      English
Paper Abstract:      Efficient handling of large amounts of data is hindered when the data and the data structures used during processing do not fit into main memory. A widely used solution is partitioning: the data set is split into smaller parts, each of which can be processed on its own in main memory, and the results from these parts are combined in a subsequent step. In this paper we give a brief overview of the different aspects of the partitioning approach and seek the most suitable approach for aggregating web log data. Based on these results, we suggest and analyze a method that splits the original data set into blocks of equal size and processes these blocks one after another. After each processing step, main memory contains the local result for the current block, which is then merged with the global result of the blocks processed so far. Through complexity analysis and experimental results we show that this approach is both fault tolerant and efficient for record-based data processing, provided that the results are significantly smaller than the original data and a linear-time algorithm is available for merging the partial results. We also suggest a method for adjusting the block sizes dynamically to achieve the best performance.
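The block-wise scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical record format of "client url" per log line, uses per-URL hit counts as the aggregate, and stands in a hash-based `Counter` merge for whatever linear merge the paper actually uses. The names `read_blocks`, `process_block`, and `aggregate` are invented for this illustration.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List


def read_blocks(records: Iterable, block_size: int) -> Iterator[List]:
    """Yield consecutive blocks of at most block_size records."""
    it = iter(records)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def process_block(block: List[str]) -> Counter:
    """Build the local result for one block: hits per URL.

    Assumes each record looks like "client url" (hypothetical format);
    records with fewer than two fields are skipped.
    """
    local_result = Counter()
    for line in block:
        fields = line.split()
        if len(fields) >= 2:
            local_result[fields[1]] += 1
    return local_result


def aggregate(records: Iterable[str], block_size: int) -> Counter:
    """Process equal-sized blocks one after another, merging each local
    result into the global result before the next block is read."""
    global_result = Counter()
    for block in read_blocks(records, block_size):
        local_result = process_block(block)
        # Merge step: cost is proportional to the size of the local result.
        global_result.update(local_result)
    return global_result
```

Only one block and the two result structures are held in memory at a time, which is why the approach depends on the results being much smaller than the input. A fault-tolerant variant would presumably persist the global result after each merge so that processing could resume from the last completed block, but that step is omitted here.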
   
