Browsing by Author "Park, Dongchul"
Now showing 1 - 4 of 4

CFTL: A Convertible Flash Translation Layer with Consideration of Data Access Patterns
(2009-09-14) Park, Dongchul; Debnath, Biplob; DuHung-Chang, David
NAND flash memory-based storage devices are increasingly adopted as one of the main alternatives to magnetic disk drives. The flash translation layer (FTL) is a software/hardware interface inside NAND flash memory that allows existing disk-based applications to use it without significant modification. Since the FTL has a critical impact on the performance of NAND flash-based devices, a variety of FTL schemes have been proposed to improve their performance. However, existing FTLs perform well for either a read-intensive workload or a write-intensive workload, but not both, because their address mapping schemes are static. To overcome this limitation, in this paper, we propose a novel FTL addressing scheme named Convertible Flash Translation Layer (CFTL, for short). CFTL adapts to data access patterns: it dynamically switches the mapping of a data block to either a read-optimized or a write-optimized mapping scheme in order to fully exploit the benefits of both. By judiciously taking advantage of both schemes, CFTL resolves the intrinsic problems of the existing FTLs. In addition to this convertible scheme, we propose an efficient caching strategy that considerably improves CFTL performance further with only a simple hint. Together, the convertible feature and the caching strategy enable CFTL to achieve good read performance as well as good write performance.
Our experimental evaluation with a variety of realistic workloads demonstrates that the proposed CFTL scheme outperforms other FTL schemes.

Hot and cold data identification: applications to storage devices and systems.
(2012-08) Park, Dongchul
Hot data identification is an issue of paramount importance in storage systems, since it has a great impact on their overall performance and has great potential to be applied to many other fields. However, it remains among the least investigated issues. In this dissertation, I propose two novel hot data identification schemes: (1) a multiple bloom filter-based scheme and (2) a sampling-based scheme. I then apply them to storage devices and systems, namely Solid State Drives (SSDs) and a data deduplication system. In the multiple bloom filter-based hot data identification scheme, I adopt multiple bloom filters and hash functions to efficiently capture finer-grained recency as well as frequency information by assigning a different recency coverage to each bloom filter. The sampling-based scheme employs a sampling mechanism that discards some of the cold items early to reduce runtime overhead and wasted memory space. Both schemes identify hot data in storage precisely and efficiently while using fewer system resources. Based on these approaches, I choose two storage fields as their applications: NAND flash-based SSD design and a data deduplication system. In SSD design in particular, hot data identification has a critical impact on performance (due to garbage collection) as well as on life span (due to wear leveling). To address these issues in SSD design, I propose a new hybrid Flash Translation Layer (FTL) design, the FTL being a core part of the SSD design. The proposed FTL (named CFTL) adapts to data access patterns with the help of the multiple bloom filter-based hot data identification algorithm. As the other application, I explore a data deduplication storage system.
Data deduplication (dedupe, for short) is a special data compression technique that has been widely adopted, especially in backup storage systems, to save both backup time and storage space. Unlike traditional dedupe research, which has focused more on improving write performance, I address its read performance. In this section, I design a new read cache in dedupe storage for a backup application; it improves read performance by looking ahead at future references in a moving window, combined with a hot data identification algorithm. This dissertation addresses the importance of hot data identification in storage areas and shows how it can be effectively applied to them in order to overcome the existing limitations in each storage venue.

Hot Data Identification for Flash Memory Using Multiple Bloom Filters
(2010-10-05) Park, Dongchul; DuHung-Chang, David
Hot data identification can be applied to a variety of fields. Particularly in flash memory, it has a critical impact on performance (due to garbage collection) as well as on lifespan (due to wear leveling). Although this is an issue of paramount importance in flash memory, it is the least investigated one. Moreover, all existing schemes focus only or mainly on frequency. However, recency must be weighed as heavily as frequency for hot data identification. In this paper, we propose a novel hot data identification scheme that adopts multiple bloom filters to efficiently capture finer-grained recency as well as frequency. In addition to this scheme, we propose a window-based direct address counting (WDAC) algorithm that approximates ideal hot data identification and serves as our baseline. Unlike the existing baseline algorithm, which cannot appropriately capture recency information due to its exponential batch decay, our WDAC algorithm uses a sliding window concept and can capture very fine-grained recency information.
Our experimental evaluation with diverse realistic workloads, including real SSD traces, demonstrates that our proposed scheme outperforms the state-of-the-art hot data identification scheme. In particular, our scheme consumes 50% less memory, reduces computational overhead by up to 58%, and improves performance by up to 65%.

HotDataTrap: A Sampling-based Hot Data Identification Scheme for Flash Memory
(2011-04-20) Park, Dongchul; Debnath, Biplob; NamJin, Young; DuHung-Chang, David; Kim, Youngkyun; Kim, Youngchul
Hot data identification is an issue of paramount importance in flash-based storage devices, since it has a great impact on their overall performance and has great potential to be applied to many other fields. However, it remains among the least investigated issues. In this paper, we propose a novel on-line hot data identification scheme named HotDataTrap. The main idea is to maintain a working set of potential hot data items in a cache based on a sampling approach. This sampling scheme enables HotDataTrap to discard some of the cold items early, reducing runtime overhead and wasted memory space. Moreover, our two-level hierarchical hash indexing scheme helps HotDataTrap look up a requested item in the cache directly and saves further memory space by exploiting spatial locality. Together, our sampling approach and hierarchical hash indexing scheme enable HotDataTrap to identify hot data precisely and efficiently even with very limited memory space. Our extensive experiments with various realistic workloads demonstrate that HotDataTrap outperforms the state-of-the-art scheme by an average of 335% and that our two-level hash indexing scheme further improves HotDataTrap performance by up to 50.8%.
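
The sampling idea in the HotDataTrap abstract above (keep a small cache of candidate hot items, and admit new items only by sampling so that most cold items are never tracked) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name SamplingHotTracker, its parameters (capacity, sample_rate, hot_threshold), the coldest-first eviction, and the halving decay are all assumptions made for illustration, and the two-level hash index is omitted.

```python
import random


class SamplingHotTracker:
    """Hypothetical sketch of sampling-based hot data tracking.

    All names, parameters, and policies here are illustrative
    assumptions, not details taken from the HotDataTrap paper.
    """

    def __init__(self, capacity=4, sample_rate=0.5, hot_threshold=3, rng=None):
        self.capacity = capacity            # max number of tracked candidates
        self.sample_rate = sample_rate      # admission probability for new items
        self.hot_threshold = hot_threshold  # count at which an item is "hot"
        self.rng = rng or random.Random(0)  # seeded for repeatable runs
        self.counts = {}                    # address -> access count

    def access(self, address):
        """Record one access to the given address."""
        if address in self.counts:
            # Already a candidate: just bump its counter.
            self.counts[address] += 1
            return
        # Untracked item: admit it only with probability sample_rate,
        # so most one-off (cold) accesses never occupy cache space.
        if self.rng.random() >= self.sample_rate:
            return
        if len(self.counts) >= self.capacity:
            # Cache full: evict the candidate with the lowest count.
            coldest = min(self.counts, key=self.counts.get)
            del self.counts[coldest]
        self.counts[address] = 1

    def decay(self):
        """Halve all counters and drop candidates that reach zero."""
        self.counts = {a: c // 2 for a, c in self.counts.items() if c // 2 > 0}

    def is_hot(self, address):
        """Return True if the address has reached the hot threshold."""
        return self.counts.get(address, 0) >= self.hot_threshold
```

In this sketch, an address that is accessed repeatedly is likely to be sampled on one of its accesses and then accumulates a count, while an address seen only once is usually skipped. Calling decay() periodically is one simple way to let recency matter alongside frequency; the schemes described in the abstracts may handle recency quite differently.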