Recent years have seen increasing use of large-scale distributed systems such as Grids,
Clouds, planetary-scale wide-area systems, large-scale enterprise clusters, and peer-to-peer systems.
Such platforms attract applications such as scientific computing, data sharing and dissemination,
data analysis and mining, and streaming multimedia. While these platforms scale well
and their deployment cost is low, they present several challenges such as heterogeneous machine
configurations and workloads, dynamism due to load fluctuations, and varying levels of
connectivity based on the network topology.
Users who submit distributed applications to be deployed in volunteer grids or loosely coupled
systems desire a reliable deployment. Unfortunately, the future state of system resources in
these environments is uncertain. Nodes chosen for deployment may become overloaded, violating
the resource requirements that applications specify to ensure high quality of service.
Further, the emergence of MapReduce applications in cloud environments has presented
several challenges. Managing the allocation of resources in the cloud for virtualized MapReduce
clusters in order to optimize for energy savings and performance goals is a difficult problem.
In this dissertation, we present novel techniques for resource discovery in large-scale systems
to facilitate the successful deployment of distributed applications, providing statistical
guarantees to applications for their resource requirements. Further, we present novel techniques
for the deployment of MapReduce applications in non-traditional environments, optimizing for
energy savings and performance goals.