Author: Zabihi, Masoud
Date accessioned: 2022-01-04
Date available: 2022-01-04
Date issued: 2021-11
URI: https://hdl.handle.net/11299/225913
Description: University of Minnesota Ph.D. dissertation. November 2021. Major: Electrical/Computer Engineering. Advisor: Sachin Sapatnekar. 1 computer file (PDF); xi, 140 pages.

Abstract:
Big data applications are memory-intensive, and the cost of bringing data from memory to the processor incurs large overheads in energy and processing time. This has driven the push towards specialized accelerator units that can perform computations close to where the data is stored. Two approaches have been proposed in the past: (1) near-memory computing places computational units at the periphery of the memory for fast data access, and (2) true in-memory computing uses the memory array itself to perform computations through simple reconfigurations. Although there has been a great deal of recent interest in in-memory computing, most solutions purported to fall into this class are really near-memory processors that perform computation near the edge of memory arrays/subarrays rather than inside them.

This thesis presents several years of effort in developing true in-memory computing platforms. These computational paradigms are designed using emerging non-volatile memory technologies such as spin-transfer torque (STT) magnetic tunnel junctions (MTJs), spin Hall effect (SHE) MTJs, and phase change memory (PCM) devices. The platforms proposed in this thesis effectively eliminate the energy and delay overheads associated with data communication. Unlike prior analog-like in-memory/near-memory solutions, our approach is digital, which makes it more robust to process variations, particularly in immature technologies.

The thesis covers alternatives at the technology level, followed by a description of how the in-memory computing array is designed, using the basic non-volatile device (such as an MTJ) and a few switches, to function both as a memory and as a computational unit. This array is then used to build gates and arithmetic units by appropriately interconnecting memory cells, allowing high degrees of parallelism. Next, we show how complex arithmetic operations can be performed through appropriate scheduling (for adders, multipliers, and dot products) and placement of the operand data. We demonstrate how this approach can be used to implement sample applications, such as a neuromorphic inference engine and a 2D convolution, presenting results that benchmark the performance of these computational RAMs (CRAMs) against near-memory computing platforms. The performance gains can be attributed to (a) highly efficient local processing within the memory, and (b) high levels of parallelism across the rows of the memory.

For our in-memory computing platforms, wire resistances and their variations are a substantial source of non-ideality that must be taken into account during implementation. To ensure the electrical correctness of the implementations, we have developed frameworks that analyze the parasitic effects of wires based on actual layout considerations. We have demonstrated that interconnect parasitics have a significant effect on the performance of the in-memory computing system and have developed a comprehensive model for analyzing this impact.
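A minimal sketch of the underlying idea, assuming placeholder device and wire values and a single lumped wire resistance (the thesis itself uses a layout-based parasitic model rather than this simplification): wire resistance in series with the MTJs of a gate reduces the current delivered to the output device, and the largest wire span that still guarantees switching gives a crude bound on usable row/column counts.

# Illustrative sketch only (not the thesis's model): lumped estimate of how
# interconnect resistance limits the span of a CRAM-style logic operation.
# Every numeric value below is an assumed placeholder.

R_WIRE_SEG = 25.0    # wire resistance per cell pitch along the logic line (ohm), assumed
R_MTJ_P = 3.0e3      # MTJ resistance, parallel (low) state (ohm), assumed
R_MTJ_AP = 6.0e3     # MTJ resistance, antiparallel (high) state (ohm), assumed
V_LOGIC = 0.8        # voltage applied across the logic path (V), assumed
I_CRIT = 40e-6       # critical switching current of the output MTJ (A), assumed


def output_current(num_inputs, span_cells, r_input_state):
    """Current reaching the output MTJ when num_inputs input cells, each in a
    state of resistance r_input_state, drive the output through span_cells
    cell pitches of wire (lumped into one series resistance)."""
    r_inputs = r_input_state / num_inputs   # input MTJs conduct in parallel
    r_wire = span_cells * R_WIRE_SEG        # lumped interconnect parasitic
    r_output = R_MTJ_AP                     # pessimistic output-state choice
    return V_LOGIC / (r_inputs + r_wire + r_output)


def max_span(num_inputs, r_input_state, margin=1.1):
    """Largest wire span (in cell pitches) for which the output MTJ still
    receives margin * I_CRIT, i.e. a crude row/column-count guideline."""
    span = 0
    while output_current(num_inputs, span + 1, r_input_state) >= margin * I_CRIT:
        span += 1
    return span


if __name__ == "__main__":
    for n in (2, 3, 5):                 # example gate fan-ins
        worst = max_span(n, R_MTJ_AP)   # all inputs in the high-resistance state
        best = max_span(n, R_MTJ_P)     # all inputs in the low-resistance state
        print(f"{n}-input gate: {worst}-{best} cell pitches of wire, depending on input states")

A sweep of this kind only shows how an array-dimension guideline can fall out of a switching-current budget; the guidelines reported next are derived from the layout-accurate parasitic framework rather than from a lumped resistance.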
Using this parasitic-analysis methodology, we have developed guidelines for physical parameters such as the array size and the numbers of rows and columns.

Language: en
Keywords: Efficient Computing; In-Memory Computing; Magnetic Tunnel Junction; Neuromorphic Computing; Post CMOS Computing; Spintronics
Title: Efficient and Reliable In-Memory Computing Systems Using Emerging Technologies: From Device to Algorithm
Type: Thesis or Dissertation