Today, many organizations need to operate on data that is distributed
around the globe. This is inevitable because the data is generated in many
different locations, for example video feeds from distributed cameras and
log files from distributed servers. Although
centralized cloud platforms have been widely used for data-intensive
applications, such systems are not suitable for processing geo-distributed
data due to high data transfer overheads. An alternative approach is to use
an Edge Cloud, which reduces the network cost of transferring data by
distributing computation globally. While the Edge Cloud is attractive
for geo-distributed data-intensive applications, extending existing cluster
computing frameworks to a wide-area environment must account for locality.
We propose Awan, a new locality-aware resource manager for geo-distributed
data-intensive applications. Awan allows resource sharing between multiple
computing frameworks while enabling high locality scheduling within each
framework. Our experiments with the Nebula Edge Cloud on PlanetLab show that
Awan achieves up to a 28% increase in locality scheduling, which reduces
the average job turnaround time by approximately 20% compared to existing
cluster management mechanisms.
Jonathan, Albert; Chandra, Abhishek; Weissman, Jon.
Awan: Locality-aware Resource Manager for Geo-distributed Data-intensive Applications.
Retrieved from the University of Minnesota Digital Conservancy,