Digital Library

Title:      A NEW APPROACH FOR DOCUMENT CLUSTERING USING MAPREDUCE (VAR-SECTING CLUSTERING)
Author(s):      Abdelrahman Elsayed, Osama Ismail, Hoda M. O. Mokhtar
ISBN:      978-989-8533-39-5
Editors:      Ajith P. Abraham, Antonio Palma dos Reis and Jörg Roth
Year:      2015
Edition:      Single
Keywords:      Clustering; MapReduce; K-means algorithm; Distributed computing
Type:      Full Paper
First Page:      57
Last Page:      64
Language:      English
Paper Abstract:      Document clustering is the process of grouping related documents together. It facilitates the organization of search results and document management. The k-means algorithm and its variant, bisecting k-means, have been applied to document clustering and have produced good clustering results. The growing number of available documents requires distributed computing and the large pool of computing resources available through cloud computing. This paper introduces the Var-secting k-means algorithm. In addition to generating a binary tree, as the bisecting k-means algorithm does, it can generate a hierarchy tree with a variable number of nodes per tree level. The experimental results show that the Var-secting k-means algorithm utilizes distributed computing nodes better than bisecting k-means, especially when using the MapReduce programming model.
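
The idea of a hierarchy with a variable number of nodes per level can be illustrated with a minimal single-machine sketch. It is not the authors' implementation: the abstract does not give the splitting rule, so the sketch assumes that each step splits the currently largest cluster into a caller-chosen number of sub-clusters using plain k-means. The names `kmeans`, `var_secting`, and the `splits` parameter are hypothetical, and MapReduce is not modeled.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns up to k non-empty lists of points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # requires k <= len(points)
    groups = []
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Keep the old center if a group ended up empty.
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return [g for g in groups if g]


def var_secting(points, splits, seed=0):
    """Assumed splitting rule: for each entry `ways` in `splits`, split the
    largest current cluster into `ways` parts. splits=[2, 2, ...] behaves
    like bisecting k-means; other values widen the tree at that step."""
    clusters = [list(points)]
    for ways in splits:
        largest = max(range(len(clusters)), key=lambda j: len(clusters[j]))
        if len(clusters[largest]) < ways:
            break  # not enough points left to split this many ways
        target = clusters.pop(largest)
        clusters.extend(kmeans(target, ways, seed=seed))
    return clusters


if __name__ == "__main__":
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11),
            (11, 10), (20, 0), (20, 1), (21, 0)]
    # One 3-way split followed by one 2-way split of the largest cluster.
    for cluster in var_secting(docs, splits=[3, 2]):
        print(cluster)
```

In this sketch, passing `splits=[2, 2, 2]` reproduces the binary-tree behavior of bisecting k-means, while a list such as `[3, 2]` yields a tree whose levels have different branching factors. Real documents would first need to be turned into numeric vectors, which is outside the scope of the sketch.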
   
