Browsing by Author "Yang, Jinfeng"
High Performance Storage System Design Using Emerging Storage Technologies (2021-12) Yang, Jinfeng
In the past few decades, data volume has increased exponentially. Smart devices, social media, and e-business generate an extremely large amount of data every day. While big data is promising and has been successfully applied in many areas such as machine learning, financial markets, and healthcare, it still faces many problems. One main challenge for large-scale data storage systems is how to store and retrieve data efficiently. In this thesis, we approach this challenge by developing high-performance storage systems using emerging storage technologies (i.e., non-volatile memory and non-volatile memory block devices).
First, we characterize an emerging storage device, the non-volatile memory (NVM) block device. By replacing the NAND flash chips inside a solid-state drive (SSD) with NVM media, the NVM block device delivers substantial performance improvements over NAND-based storage systems. However, its performance characteristics have not been well studied. In this study, we carry out intensive experiments and propose multiple custom-designed micro-benchmarks to extract the intrinsic performance behaviors of the NVM block device, including the basic I/O performance behavior, advanced interleaving technology, performance consistency under highly intensive I/O workloads, the influence of unaligned request sizes, the elimination of write-driven garbage collection, read disturb issues, and the tail latency problem. The performance is compared with that of a conventional NAND SSD to show how the NVM block device differs in each scenario. In addition, using an online analytical benchmark, we study a database system's performance on our target storage devices to quantify the potential benefits of the NVM block device for a real application.
Finally, the performance impact of hybrid storage systems that combine NVM block devices and NAND SSDs on a database application is investigated.
Second, based on our understanding of the performance characteristics of the NVM block device and the database system bottlenecks studied above, we present a hybrid storage database system, called HeuristicDB, that uses an NVM block device as an extension of the database buffer pool. To account for the unique performance behavior of NVM block devices and the block-level characteristics of database requests, we propose a set of heuristic rules that associate database (block) requests with the appropriate quality of service for the purpose of caching priority. To support the system implementation of the proposed rules, four programs are developed: a query profile queue, an eviction and demotion (EV) program, a table access pattern detector, and a page placement controller. Using online analytical processing (OLAP) and online transactional processing (OLTP) benchmarks, both trace-based examination and a system implementation on MySQL are carried out to evaluate the effectiveness of the proposed design. The experimental results show that HeuristicDB delivers up to 75% higher performance than existing systems.
Third, for the setting in which NVM replaces DRAM to serve as main memory in the future, we introduce a machine-learning-based cache replacement algorithm, named ExpCache, to improve NVM-based system performance. Taking the non-volatility of NVM devices into account, we split the whole NVM into two caches, a read cache and a write cache, which retain different types of requests. The pages in each cache are managed by both LRU and LFU policies to balance the recency and frequency of workloads. The online Expert machine learning algorithm is responsible for selecting a proper policy to evict a page from one of the caches based on the access patterns of workloads.
Moreover, a read-write discriminated eviction program is developed to reduce the number of dirty pages written back to storage. In the experimental results, ExpCache outperforms previous studies in terms of hit ratio and the number of dirty pages written back to storage.
In summary, this thesis proposes an NVM block device performance characterization, a hybrid storage database system, and an online-learning-based cache replacement algorithm for NVM, which together allow large-scale data storage systems to deliver high performance.
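The idea of letting an online learner choose between LRU and LFU eviction can be illustrated with a small sketch. This is not the thesis's ExpCache implementation: it is a generic multiplicative-weights scheme over two eviction "experts", and the class name `ExpertCache`, the `learning_rate` parameter, and the ghost-list bookkeeping are all assumptions made for illustration. The read/write cache split and the dirty-page handling described above are omitted.

```python
import math
from collections import OrderedDict


class ExpertCache:
    """Toy cache that chooses between LRU and LFU eviction using learned weights.

    Hypothetical illustration only; not the ExpCache design from the thesis.
    """

    def __init__(self, capacity, learning_rate=0.5, history_size=None):
        self.capacity = capacity
        self.learning_rate = learning_rate
        self.history_size = history_size or capacity
        # page -> access count; insertion order doubles as recency order.
        self.pages = OrderedDict()
        self.weights = {"lru": 0.5, "lfu": 0.5}
        # Ghost lists: pages that each expert recently chose to evict.
        self.history = {"lru": OrderedDict(), "lfu": OrderedDict()}
        self.hits = 0
        self.misses = 0

    def _penalize(self, expert):
        # Multiplicative-weights update: shrink the expert whose eviction
        # turned out to be a mistake, then renormalize to sum to 1.
        self.weights[expert] *= math.exp(-self.learning_rate)
        total = sum(self.weights.values())
        for name in self.weights:
            self.weights[name] /= total

    def _victim(self, expert):
        if expert == "lru":
            return next(iter(self.pages))  # least recently used page
        # LFU: lowest access count; ties go to the least recently used page.
        return min(self.pages, key=self.pages.get)

    def access(self, page):
        """Access a page; return True on a hit and False on a miss."""
        if page in self.pages:
            self.hits += 1
            self.pages[page] += 1
            self.pages.move_to_end(page)
            return True
        self.misses += 1
        # A miss on a recently evicted page counts against the evicting expert.
        for expert, ghosts in self.history.items():
            if page in ghosts:
                del ghosts[page]
                self._penalize(expert)
        if len(self.pages) >= self.capacity:
            # Deterministic choice for clarity: follow the heavier expert
            # (LRU wins ties). A randomized choice is a common alternative.
            expert = max(("lru", "lfu"), key=self.weights.get)
            victim = self._victim(expert)
            del self.pages[victim]
            ghosts = self.history[expert]
            ghosts[victim] = True
            if len(ghosts) > self.history_size:
                ghosts.popitem(last=False)  # forget the oldest ghost entry
        self.pages[page] = 1
        return False
```

In this sketch, a page that is requested again shortly after being evicted lowers the weight of the policy that evicted it, so later evictions lean toward the other policy. Picking the heavier expert deterministically keeps the example easy to trace; sampling an expert in proportion to its weight would be closer to typical expert-based learners.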