With the continually accelerating growth of data, storage system performance is increasingly becoming a bottleneck to overall system performance. Many applications, such as transaction processing systems, weather forecasting, large-scale scientific simulations, and on-demand services, are limited by the performance of the underlying storage systems. The limited bandwidth, high power consumption, and low reliability of widely used magnetic disk-based storage systems pose a significant hurdle to scaling these applications to keep pace with this data growth. These limitations and bottlenecks are especially acute for large-scale high-performance computing systems.
Flash memory is an emerging storage technology that shows tremendous promise for compensating for the limitations of current storage devices. However, flash memory's relatively high cost, combined with its slow write performance and limited number of erase cycles, requires new and innovative solutions to integrate flash memory-based storage devices into a high-performance storage hierarchy. The first part of this thesis develops new algorithms, data structures, and storage architectures to address the fundamental issues that limit the use of flash-based storage devices in high-performance computing systems. The second part of the thesis demonstrates two innovative applications of flash-based storage.
In particular, the first part addresses a set of fundamental issues, including new write caching techniques, a sampling-based, RAM-space-efficient garbage collection scheme, and writing strategies for improving the performance of flash memory for write-intensive applications. This effort will improve the fundamental understanding of flash memory, remedy the major limitations of flash-based storage devices, and extend the capability of flash memory to support many critical applications. The second part demonstrates how flash memory can be used to speed up server applications, including a Bloom filter and an online deduplication system. This effort will use flash-aware data structures and algorithms and will show innovative uses of flash-based storage.