Data dissemination for distributed computing.
2010-02
Authors
Kim, Jinoh
Published Date
2010-02
Type
Thesis or Dissertation
Abstract
Large-scale distributed systems provide an attractive scalable infrastructure for network
applications. However, the loosely-coupled nature of this environment can make
data access unpredictable, and in the limit, unavailable. This thesis strives to provide
predictability in data access for data-intensive computing in large-scale computational
infrastructures.
A key requirement for achieving predictability in data access is the ability to estimate
network performance for data transfer so that computation tasks can take advantage
of the estimation in their deployment or data source selection. This thesis develops
a framework called OPEN (Overlay Passive Estimation of Network Performance) for
scalable network performance estimation. OPEN provides an estimation of end-to-end
accessibility for applications by utilizing past measurements without the use of explicit
probing. Unlike existing passive approaches, OPEN is not restricted to pairwise
measurements or to a single network when using historical information; instead, nodes
share measurements with one another without restriction. As a result, it produces n^2
estimates from O(n) measurements.
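
To make the idea of sharing passive measurements concrete, here is a minimal Python sketch. It is not the OPEN estimator from the thesis, whose details the abstract does not give. The class name `SharedMeasurementPool`, the throughput unit, and the median-per-server estimator are all assumptions chosen for illustration. The point is only that once n nodes pool one observation each, an estimate can be looked up for any of the n^2 (client, server) pairs without probing.

```python
from collections import defaultdict
from statistics import median


class SharedMeasurementPool:
    """Hypothetical pool of passively observed transfer throughputs.

    Assumption: a node estimates its own throughput to a server from
    what *other* nodes observed for that server. The real OPEN
    estimator is not described in the abstract.
    """

    def __init__(self):
        # server -> list of throughputs (Mbps) observed by any client
        self._by_server = defaultdict(list)

    def report(self, client, server, throughput_mbps):
        """Record a throughput observed during a normal transfer."""
        self._by_server[server].append(throughput_mbps)

    def estimate(self, client, server):
        """Return an estimate for (client, server), or None if no data.

        This toy estimator ignores `client`; a realistic one would
        weight samples by how similar the reporting node is to `client`.
        """
        samples = self._by_server.get(server)
        if not samples:
            return None
        return median(samples)


def count_estimable_pairs(pool, nodes):
    """Count ordered (client, server) pairs that have an estimate."""
    return sum(
        1
        for client in nodes
        for server in nodes
        if pool.estimate(client, server) is not None
    )
```

With four nodes each reporting a single observation to a different server, `count_estimable_pairs` covers all sixteen ordered pairs, which is the n^2-from-O(n) relationship in miniature.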
In addition, this thesis considers data dissemination in two specific environments.
First, we consider a parallel data access environment in which multiple replicated servers
can be utilized to download a single data file in parallel. To improve both performance
and fault tolerance, we present a new parallel data retrieval algorithm and explore a
broad set of resource selection heuristics. Second, we consider collective data access
in applications for which group performance is more important than individual performance.
In this work, we employ communication makespan as a group performance
metric and propose server selection heuristics to maximize collective performance.
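
The makespan metric can be illustrated with a short Python sketch. It is a simple greedy heuristic under stated assumptions, not one of the heuristics proposed in the thesis. The names `makespan`, `greedy_select`, and `base_time`, and the cost model (a client's transfer time grows linearly with the number of clients sharing its server) are hypothetical.

```python
def makespan(assignment, base_time):
    """Time at which the slowest group member finishes its transfer.

    assignment: client -> chosen server
    base_time:  client -> {server: transfer time with the server unshared}
    Assumed cost model: time scales with the number of clients per server.
    """
    load = {}
    for server in assignment.values():
        load[server] = load.get(server, 0) + 1
    return max(base_time[c][s] * load[s] for c, s in assignment.items())


def greedy_select(base_time):
    """Assign clients one by one, each time picking the server that
    keeps the group makespan lowest given the assignments made so far."""
    assignment = {}
    # Place clients with the worst best-case time first.
    order = sorted(
        base_time, key=lambda c: min(base_time[c].values()), reverse=True
    )
    for client in order:
        assignment[client] = min(
            base_time[client],
            key=lambda s: makespan({**assignment, client: s}, base_time),
        )
    return assignment


def selfish_select(base_time):
    """Baseline: every client picks its individually fastest server."""
    return {c: min(times, key=times.get) for c, times in base_time.items()}
```

When every client's individually fastest server is the same one, the selfish baseline overloads that server, while the greedy choice spreads the group and finishes sooner under this cost model. That is the sense in which group performance can differ from individual performance.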
Description
University of Minnesota Ph.D. dissertation. February 2010. Major: Computer Science. Advisors: Prof. Jon B. Weissman, Prof. Abhishek Chandra. 1 computer file (PDF); x, 129 pages. Ill. (some col.)
Suggested citation
Kim, Jinoh. (2010). Data dissemination for distributed computing. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/59575.