Browsing by Subject "In-memory Computing"
Non-volatile In-memory Computing for Large Scale Data-Intensive Workloads: Challenges and Opportunities (2021-12)
Chowdhury, Zamshed

Application domains that depend on large amounts of data to solve problems, e.g., genome sequence analysis, graph analytics, and machine learning, suffer from the growing overhead of data communication between physically separate logic (i.e., compute) and memory elements in conventional von Neumann computing. Recent progress in processing-in-memory (PIM), also called computing-in-memory (CIM) or simply in-memory computing, addresses this data communication overhead by fusing compute capability into the memory where the data reside, thereby reducing energy consumption and increasing application throughput, since the internal bandwidth of the memory substrate is far higher than the off-chip bandwidth. In this thesis, we characterize a PIM architecture, Computational RAM (CRAM) in particular, at the architecture and application levels for large-scale data-intensive workloads, in terms of both opportunities and challenges. We demonstrate the efficacy of CRAM in reducing the communication bottleneck of genomic sequence analysis, chosen as a representative application domain for its importance and for inherent characteristics that suit PIM-based implementation, by designing various CRAM-based hardware (HW) accelerators. The designs cover all architectural aspects, such as data layout, spatio-temporal scheduling of computation, and system integration. First, we introduce an in-memory accelerator architecture, BWA-CRAM, for DNA sequence alignment that directly maps the state-of-the-art Burrows–Wheeler Aligner algorithm onto CRAM. This architecture outperforms the corresponding software implementation in throughput and energy efficiency, even under conservative assumptions.
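The appeal of substrates like CRAM for sequence alignment comes from evaluating a bitwise gate across every row of a memory array in a single step. The following is a minimal software emulation of that idea, not the thesis design: each row comparison stands in for a column-parallel XNOR followed by an AND-reduction, and all names here are illustrative.

```python
def rowwise_match(rows, pattern):
    """Compare a query pattern against all stored rows.

    In a PIM array this comparison would happen for every row in
    parallel inside the memory; here we emulate it sequentially.
    Each element comparison models a column-parallel XNOR, and the
    all() models the AND-reduction across bit positions.
    """
    return [all(a == b for a, b in zip(row, pattern)) for row in rows]


# Hypothetical reference data: three stored bit-vectors.
reference = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 1, 0],
]

# One query pattern checked against every stored row "at once".
hits = rowwise_match(reference, [0, 1, 1, 0])
```

The point of the sketch is the access pattern: the query visits every row without moving the stored data off-chip, which is where the bandwidth and energy savings described above come from.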
Next, we improve the performance of DNA sequence (pre-)alignment, and of similar generic pattern-matching applications, through HW/SW co-design: we introduce SpinPM, a novel high-density, reconfigurable spintronic in-memory pattern-matching substrate based on CRAM with Spin-Orbit-Torque (SOT) devices, specifically Spin-Hall-Effect (SHE) MTJs, and demonstrate the performance benefit SpinPM achieves over conventional and near-memory processing systems. Subsequently, we present CRAM-Seq, a CRAM-based accelerator for RNA-Seq abundance quantification. Through HW/SW co-design, we demonstrate that CRAM-Seq outperforms Kallisto, a widely used state-of-the-art software abundance quantification algorithm, in throughput and energy efficiency. Next, we introduce Content Addressable Memory (CAM) functionality, which is very efficient for large-scale pattern matching, into CRAM. We present CAMeleon, a novel compute substrate that leverages the high energy efficiency of CRAM and can satisfy the very stringent hardware resource (area) budgets of embedded/edge computing applications, e.g., handheld sequencing devices. Compared to conventional CAM-only designs based on SRAM and emerging memory technologies (such as STT-MTJ, ReRAM, and PCM), CAMeleon performs CAM operations more energy-efficiently at similar or smaller area, and, through reconfiguration, supports logic and memory functions beyond CAM operations on demand. Finally, we study the impact on application reliability of mapping onto a PIM substrate, focusing on PIM architectures, e.g., CRAM, that perform logic operations in situ, directly within memory arrays, obviating any data transfer, even to and from the array periphery.
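A CAM inverts the usual memory interface: instead of supplying an address and reading a word, you supply a word and get back the addresses that store it, with every entry compared in parallel. This is a small functional sketch of that lookup semantics, emulated sequentially in software; the table contents and names are illustrative only.

```python
def cam_lookup(table, key):
    """Content-addressable lookup: return every address whose stored
    word equals the search key.

    A hardware CAM performs all these comparisons simultaneously;
    this loop only emulates the input/output behavior.
    """
    return [addr for addr, word in enumerate(table) if word == key]


# Hypothetical stored entries, e.g., short sequence fragments.
table = ["ACGT", "TTGA", "ACGT", "GGCC"]

# One search key is broadcast to all entries; matching addresses return.
matches = cam_lookup(table, "ACGT")
```

Because the comparison happens at every entry rather than at a central processor, CAM lookups avoid streaming the whole table through a compute unit, which is why a CRAM-based CAM can meet tight area and energy budgets for pattern matching.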
Here we (i) quantitatively characterize gate-flip errors, an acute class of functional errors specific to such PIM systems in which, due to parametric variations, one logic gate can behave as another; and (ii) analyze to what extent algorithmic noise tolerance can mask gate flips.
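To make the gate-flip notion concrete, here is a toy fault model, not the thesis's characterization methodology: with some probability an intended AND evaluates as an OR, and redundant evaluation with majority voting stands in for algorithmic noise tolerance. The flip probability, the AND-to-OR fault mode, and the triplication scheme are all illustrative assumptions.

```python
import random


def faulty_and(a, b, p_flip, rng):
    """Toy gate-flip model: with probability p_flip the intended AND
    behaves as an OR (one hypothetical variation-induced fault mode)."""
    if rng.random() < p_flip:
        return a | b  # gate "flipped" into a different gate
    return a & b


def majority(bits):
    """Simple noise tolerance: a majority vote over redundant
    evaluations masks a minority of gate-flips."""
    return int(sum(bits) > len(bits) // 2)


rng = random.Random(0)  # seeded for reproducibility

# Triplicated evaluation of AND(1, 0): a single flip would yield 1,
# but the vote can still recover the correct result 0.
out = majority([faulty_and(1, 0, 0.1, rng) for _ in range(3)])
```

The interesting question the thesis asks is the quantitative version of this: how high can the flip rate go, for a given application's inherent noise tolerance, before errors stop being masked.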