Zhang, Baoquan
2021-04-12
2021-04-12
2020-01
https://hdl.handle.net/11299/219314
University of Minnesota Ph.D. dissertation. January 2021. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); viii, 126 pages.
In modern data centers, the volume of data has grown enormously and rapidly, driven by the expansion of the Internet, mobile networks, and the Internet of Things (IoT). Storage systems play a critical role in many scenarios, e.g., machine learning pipelines, interactive data analysis, and data storage services. Applications must meet demanding requirements for performance, capacity, and reliability. However, the I/O performance of storage systems suffers from the long latency of storage devices. In addition, the areal data density of storage devices has reached a bottleneck, so it has become difficult to increase the capacity of storage systems further. Finally, silent data corruption happens more frequently than expected, and traditional methods such as replication and erasure coding are no longer sufficient to ensure data reliability. To address these challenges of performance, capacity, and data reliability, storage vendors have proposed new storage technologies and devices. First, Non-Volatile Memory (NVM) is a persistent memory that provides memory-speed data persistence and byte-addressable data access. Second, Shingled Magnetic Recording (SMR) is a promising layout that further increases areal data density using existing magnetic recording technologies. Third, T10 Protection Information (T10-PI) drives have been proposed to protect against data corruption. However, current storage systems need to be optimized or even redesigned to leverage the advantages of these new technologies and devices. This thesis introduces four complementary research topics on designing new storage systems using NVM, SMR drives, PI-capable drives, and hybrid systems with NVM and storage devices, respectively.
NVLSM is a key-value store that uses a Log-Structured Merge Tree (LSM-Tree) on NVM systems. Idler is a mechanism that controls the I/O workload to minimize the tail response time of an SMR drive; it artificially induces idle cleanings to avoid expensive blocking cleanings. DIX-aware RAID improves data integrity in Linux software RAID by using T10-PI to guard against data corruption during data transmission and persistence. Finally, PMDB is a new key-value store for systems with both NVM and traditional storage devices that aims to achieve both performance and capacity.
en
NVM
SMR
Storage Systems
T10 PI
Storage System Designs with Emerging Storage Technologies
Thesis or Dissertation