Browsing by Author "Singh, Aameek"
Now showing 1 - 2 of 2
- Results Per Page
- Sort Options
Item Exploiting Spatio-Temporal Tradeoffs for Energy Efficient MapReduce in the Cloud(2010-04-07) Cardosa, Michael; Singh, Aameek; Pucha, Himabindu; Chandra, AbhishekMapReduce is a distributed computing paradigm that is being widely used for building large-scale data processing applications like content indexing, data mining and log file analysis. Offered in the cloud, users can construct their own virtualized MapReduce clusters using virtual machines (VMs) managed by the cloud service provider. However, to maintain low costs for such cloud services, cloud operators are required to optimize the energy consumption of these applications. In this paper, we describe a unique spatio-temporal tradeoff for achieving energy efficiency for MapReduce jobs in such virtualized environments. The tradeoff includes efficient spatial fitting of VMs on servers to achieve high utilization of machine resources, as well as balanced temporal fitting of servers with VMs having similar runtimes to ensure that a server runs at a high utilization throughout its uptime. To study this tradeoff, we propose a set of metrics that quantify the different sources of resource wastage. We then propose VM placement algorithms that explicitly incorporate these spatio-temporal tradeoffs, by combining a recipe placement algorithm for spatial fitting with a temporal binning algorithm for time balancing. We also propose an incremental time balancing algorithm (ITB) that can improve the energy efficiency even further by transparently increasing the cluster size for MapReduce jobs, while improving their performance at the same time. Our simulation results show that our spatio-temporal placement algorithms achieve energy savings between 20-35% over existing spatially-efficient placement techniques, and within 12% of a baseline lower-bound algorithm. Further, the ITB algorithm achieves additional savings of up to 15% over the spatio-temporal algorithms by reducing job runtimes by 5-35%.Item STEAMEngine: Driving MapReduce Provisioning in the Cloud(2010-09-28) Cardosa, Michael; Narang, Piyush; Chandra, Abhishek; Pucha, Himabindu; Singh, AameekMapReduce has gained in popularity as a distributed data analysis paradigm, particularly in the cloud, where MapReduce jobs are run on virtual clusters. The provisioning of MapReduce jobs in the cloud is an important problem for optimizing several user as well as provider-side metrics, such as runtime, cost, throughput, energy, and load. In this paper, we present a provisioning framework called STEAMEngine that consists of provisioning algorithms to optimize these metrics through a set of common building blocks. These building blocks enable spatio-temporal tradeoffs unique to MapReduce provisioning: along with their resource requirements (spatial component), a MapReduce job runtime (temporal component) is a critical element for any resource provisioning algorithm. We also describe two novel provisioning algorithms - a user-driven performance optimization and a provider-driven energy optimization - that leverage these building blocks. Our experimental results based on an Amazon EC2 cluster and a local 6-node Xen/Hadoop cluster show the benefits of STEAMEngine through improvements in performance and energy via the use of these algorithms and building blocks.