Data dissemination for distributed computing.

Title

Data dissemination for distributed computing.

Published Date

2010-02

Type

Thesis or Dissertation

Abstract

Large-scale distributed systems provide an attractive, scalable infrastructure for network applications. However, the loosely coupled nature of this environment can make data access unpredictable and, in the limit, unavailable. This thesis strives to provide predictability in data access for data-intensive computing in large-scale computational infrastructures. A key requirement for achieving predictability in data access is the ability to estimate network performance for data transfer, so that computation tasks can use the estimates when choosing where to deploy or which data source to select. This thesis develops a framework called OPEN (Overlay Passive Estimation of Network Performance) for scalable network performance estimation. OPEN estimates end-to-end accessibility for applications from past measurements, without explicit probing. Unlike existing passive approaches, OPEN is not restricted to pairwise history or to a single network when using historical information; instead, it shares measurements between nodes without any restrictions. As a result, it achieves n² estimations from O(n) measurements. In addition, this thesis considers data dissemination in two specific environments. First, we consider a parallel data access environment in which multiple replicated servers can be used to download a single data file in parallel. To improve both performance and fault tolerance, we present a new parallel data retrieval algorithm and explore a broad set of resource selection heuristics. Second, we consider collective data access in applications for which group performance is more important than individual performance. In this work, we employ communication makespan as a group performance metric and propose server selection heuristics to maximize collective performance.

Description

University of Minnesota Ph.D. dissertation. February 2010. Major: Computer Science. Advisors: Prof. Jon B. Weissman, Prof. Abhishek Chandra. 1 computer file (PDF); x, 129 pages. Ill. (some col.)

Suggested citation

Kim, Jinoh. (2010). Data dissemination for distributed computing. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/59575.
