Title:      pR: AUTOMATIC PARALLELIZATION OF DATA PARALLEL STATISTICAL COMPUTING CODES FOR R IN HYBRID MULTI-NODE AND MULTI-CORE ENVIRONMENTS
Author(s):      Paul Breimyer, Guruprasad Kora, William Hendrix, Neil Shah, Nagiza F. Samatova
ISBN:      978-972-8924-97-3
Editors:      Hans Weghorn and Pedro Isaías
Year:      2009
Edition:      V II, 2
Keywords:      Statistical Computing, Automatic Parallelization, Data-Parallel
Type:      Short Paper
First Page:      22
Last Page:      27
Language:      English
Paper Abstract:      The increasing size and complexity of modern scientific data sets challenge the capabilities of traditional statistical computing. High-Performance Statistical Parallel Computing is a promising strategy to address these challenges, especially as multi-core parallel computing architectures become increasingly prevalent. However, parallel statistical computing introduces implementation complexities and, therefore, an automatic parallelization approach would be ideal. Data-parallel statistical computations that aim to evaluate the same function on different subsets of data represent natural candidates for automatic parallelization due to their inherent inter-process independence. In this paper, we extend the pR middleware for the R open-source statistical environment to support automatic parallelization of data-parallel tasks in multi-node, multi-core, and hybrid environments. pR requires few or no changes to existing serial codes and yielded over 50% end-to-end execution time improvements in our tests, compared to the commonly used snow R package.
   
