Digital Library

Title:      A SCANONCE ALGORITHM FOR LARGE DATABASE MINING IMPLEMENTED IN SQL
Author(s):      Na Helian, Frank Wang
ISBN:      972-98947-0-1
Editors:      António Palma dos Reis and Pedro Isaías
Year:      2003
Edition:      1
Keywords:      SQL, data mining, relational database, data warehouse, association rules, I/O bottleneck
Type:      Full Paper
First Page:      412
Last Page:      419
Language:      English
Paper Abstract:      In support of the trend toward data mining on large data warehouses, we propose a ScanOnce algorithm for association rule mining that needs to scan the transaction database only once to generate all possible rules. In contrast, the well-known Apriori algorithm requires repeated scans of the database, resulting in heavy I/O, particularly for large candidate datasets. Because PL/SQL is seamlessly integrated with Oracle’s database, we implement the algorithm in an Oracle9i environment. Owing to the integrity of its data structure, the full itemset counter tree can be stored in a relational table without gaps. The ability to generate ad hoc queries in PL/SQL ensures fast access to any desired counter. Experiments show that the ScanOnce algorithm implemented in PL/SQL outperforms the classic Apriori algorithm on large problem sizes by factors ranging from 4 to more than 10, and that the gap widens as the volume of transactions grows.
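Illustration (not from the paper): the abstract does not give the algorithm's details, so the following is only a minimal Python sketch of the general single-scan idea, under stated assumptions. It assumes a small, fixed item universe, encodes each itemset as a bitmask, and keeps one counter per possible itemset in a dense table, loosely mirroring a counter structure stored "without gaps". The names scan_once and support are hypothetical, and the paper's PL/SQL implementation may differ substantially.

```python
def scan_once(transactions, items):
    """Count every itemset in a single pass over the transactions.

    Assumption: the item universe is small enough that a dense table
    with 2**len(items) counters fits in memory.
    """
    bit = {item: 1 << i for i, item in enumerate(items)}
    counters = [0] * (1 << len(items))  # one slot per itemset, no gaps
    for transaction in transactions:
        mask = 0
        for item in transaction:
            mask |= bit[item]
        # Visit every non-empty subset of this transaction's items.
        sub = mask
        while sub:
            counters[sub] += 1
            sub = (sub - 1) & mask
    return bit, counters


def support(bit, counters, itemset):
    """Look up the count of an itemset directly by its bitmask key."""
    key = 0
    for item in itemset:
        key |= bit[item]
    return counters[key]


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
bit, counters = scan_once(transactions, ["a", "b", "c"])
print(support(bit, counters, {"a", "b"}))  # 2
```

In this sketch, the confidence of a rule such as a => b is support({a, b}) / support({a}), which can be computed from the counter table alone, with no further scan of the transactions.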
   
