Authors: Park, Dongchul; Debnath, Biplob; Nam, Young Jin; Du, David Hung-Chang; Kim, Youngkyun; Kim, Youngchul
Date accessioned: 2020-09-02
Date available: 2020-09-02
Date issued: 2011-04-20
URI: https://hdl.handle.net/11299/215856
Abstract: Hot data identification is critically important in flash-based storage devices because it strongly affects their overall performance, and it has potential applications in many other fields. Nevertheless, it has received little investigation. In this paper, we propose a novel on-line hot data identification scheme named HotDataTrap. The main idea is to maintain a working set of potential hot data items in a cache based on a sampling approach. This sampling scheme enables HotDataTrap to discard some of the cold items early, which reduces both runtime overhead and wasted memory. Moreover, our two-level hierarchical hash indexing scheme lets HotDataTrap look up a requested item in the cache directly and saves further memory by exploiting spatial locality. Together, the sampling approach and the hierarchical hash indexing scheme enable HotDataTrap to identify hot data precisely and efficiently even with very limited memory. Our extensive experiments with various realistic workloads demonstrate that HotDataTrap outperforms the state-of-the-art scheme by an average of 335%, and that the two-level hash indexing scheme further improves HotDataTrap's performance by up to 50.8%.
Language: en-US
Title: HotDataTrap: A Sampling-based Hot Data Identification Scheme for Flash Memory
Type: Report
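
The abstract describes two ideas: admitting only a sample of unseen items into a cache of hot-data candidates, and indexing that cache with a two-level hash that exploits spatial locality. The Python sketch below is one possible illustration of those ideas, not the authors' implementation. The class name `SampledHotDataCache`, the parameters `sample_rate`, `hot_threshold`, and `sub_bits`, the counter-halving `decay` step, and the choice to split a logical block address (LBA) into high and low bits are all assumptions made for this example.

```python
import random


class SampledHotDataCache:
    """Toy sampling-based hot data tracker with a two-level index (illustrative only)."""

    def __init__(self, sample_rate=0.5, hot_threshold=4, sub_bits=4, seed=None):
        self.sample_rate = sample_rate      # assumed probability of admitting an unseen item
        self.hot_threshold = hot_threshold  # assumed counter value at which an item is "hot"
        self.sub_bits = sub_bits            # assumed number of low LBA bits in the second-level key
        self.rng = random.Random(seed)
        # First level: high LBA bits -> second level: low LBA bits -> access counter.
        # Neighbouring LBAs share one first-level entry, which is where the
        # spatial-locality saving would come from.
        self.table = {}

    def _split(self, lba):
        """Split an LBA into (first-level key, second-level key)."""
        return lba >> self.sub_bits, lba & ((1 << self.sub_bits) - 1)

    def access(self, lba):
        """Record an access to lba and return True if it is currently considered hot."""
        primary, sub = self._split(lba)
        bucket = self.table.get(primary)
        if bucket is not None and sub in bucket:
            bucket[sub] += 1
            return bucket[sub] >= self.hot_threshold
        # Unseen item: admit only a random sample, so most cold items never
        # occupy cache space at all.
        if self.rng.random() < self.sample_rate:
            self.table.setdefault(primary, {})[sub] = 1
        return False

    def decay(self):
        """Halve every counter and evict items whose counter drops to zero."""
        for primary in list(self.table):
            bucket = self.table[primary]
            for sub in list(bucket):
                bucket[sub] >>= 1
                if bucket[sub] == 0:
                    del bucket[sub]
            if not bucket:
                del self.table[primary]

    def size(self):
        """Number of individual items currently tracked."""
        return sum(len(bucket) for bucket in self.table.values())
```

In this sketch a frequently accessed item is very likely to be sampled on one of its many accesses, while an item accessed once is admitted only with probability `sample_rate` and is evicted by the next `decay` call if it is not accessed again. Calling `decay` periodically (for example, every fixed number of requests) is likewise an assumption of the example.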