MPI for Big Data: New Tricks for an Old Dog
2013-10-21
Loading...
View/Download File
Persistent link to this item
Statistics
View StatisticsJournal Title
Journal ISSN
Volume Title
Title
MPI for Big Data: New Tricks for an Old Dog
Authors
Published Date
2013-10-21
Publisher
Type
Report
Abstract
The processing of massive amounts of data on clusters with finite amount of memory has become an important problem facing the parallel/distributed computing community. While MapReduce-style technologies provide an effective means for addressing various problems that fit within the MapReduce paradigm, there are many classes of problems for which this paradigm is ill-suited. In this paper we present a runtime system for traditional MPI programs that enables the efficient and transparent out-of-core execution of distributed-memory parallel programs. This system, called BDMPI, leverages the semantics of MPI's API to orchestrate the execution of a large number of MPI processes on much fewer compute nodes, so that the running processes maximize the amount of computation that they perform with the data fetched from the disk. BDMPI enables the development of efficient out-of-core parallel distributed memory codes without the high engineering and algorithmic complexities associated with multiple levels of blocking. BDMPI achieves significantly better performance than existing technologies on a single node as well as on a small cluster, and performs within 30% of optimized out-of-core implementations.
Keywords
Description
Related to
Replaces
License
Series/Report Number
Funding information
Isbn identifier
Doi identifier
Previously Published Citation
Suggested citation
LaSalle, Dominique; Karypis, George. (2013). MPI for Big Data: New Tricks for an Old Dog. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215935.
Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.