The Applications of Workload Characterization in The World of Massive Data Storage
2015-08
Loading...
View/Download File
Persistent link to this item
Statistics
View StatisticsJournal Title
Journal ISSN
Volume Title
Title
The Applications of Workload Characterization in The World of Massive Data Storage
Authors
Published Date
2015-08
Publisher
Type
Thesis or Dissertation
Abstract
The digital world is expanding exponentially because of the growth of various applications in domains including scientific fields, enterprise environment and internet services. Importantly, these applications have drastically different storage requirements including parallel I/O performance and storage capacity. Various technologies have been developed in order to better satisfy different storage requirements. I/O middleware software, parallel file systems and storage arrays are developed to improve I/O performance by increasing I/O parallelism at different levels. New storage media and data recording technologies such as shingled magnetic recording (SMR) are also developed to increase the storage capacity. This work focuses on improving existing technologies and designing new schemes based on I/O workload characterizations in corresponding storage environments. The contributions of this work can be summarized into four pieces, two on improving parallel I/O performance and two on increasing storage capacity. First, we design a comprehensive parallel I/O workload characterization and generation framework (called PIONEER) which can be used to synthesize a particular parallel I/O workload with desired I/O characteristics or precisely emulate a High Performance Computing (HPC) application of interest. Second, we propose a non-intrusive I/O middleware (called IO-Engine) to automatically improve a given parallel I/O workload in Lustre which is a widely used HPC or parallel I/O system. IO-Engine can explore the correlations between different software layers in the deep I/O path, as well as workload patterns at runtime to transparently transform the workload patterns and tune related I/O parameters in the system. Third, we design several novel static address mapping schemes for shingled write disks (SWDs) to minimize the write amplification overhead in hard drives adopting SMR technology. Fourth, we propose a track-level shingled translation layer (T-STL) for SWDs with hybrid update strategy (in-place update plus out-of-place update). T-STL uses dynamic address mapping scheme and performs garbage collection operations by migrating selected disk tracks. This scheme can provider larger storage capacity and better overall performance with the same effective storage percentages when compared to the static address mapping schemes.
Keywords
Description
University of Minnesota Ph.D. dissertation. August 2015. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); x, 116 pages.
Related to
Replaces
License
Collections
Series/Report Number
Funding information
Isbn identifier
Doi identifier
Previously Published Citation
Other identifiers
Suggested citation
He, Weiping. (2015). The Applications of Workload Characterization in The World of Massive Data Storage. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/175456.
Content distributed via the University Digital Conservancy may be subject to additional license and use restrictions applied by the depositor. By using these files, users agree to the Terms of Use. Materials in the UDC may contain content that is disturbing and/or harmful. For more information, please see our statement on harmful content in digital repositories.