Ma, Cong
2019-02-12
2019-02-12
2018-10
https://hdl.handle.net/11299/201684
University of Minnesota Ph.D. dissertation. October 2018. Major: Electrical Engineering. Advisor: David Lilja. 1 computer file (PDF); xii, 86 pages.
As CMOS technology starts to face serious scaling and power consumption issues, emerging beyond-CMOS technologies have drawn substantial attention in recent years. Spintronic devices, among the most promising CMOS alternatives, offer smaller size and low standby power consumption, fitting the needs of the trending mobile and IoT devices. Spin-Transfer Torque MRAM (STT-MRAM), with read latency comparable to SRAM, and All-Spin Logic (ASL), capable of implementing purely spin-based circuits, are potential candidates to replace CMOS memory and logic devices. However, spintronic memory continues to require higher write energy, presenting a challenge to memory hierarchy design when energy consumption is a concern. This motivates the use of STT-MRAM for the first-level caches of a multicore processor to reduce energy consumption without significantly degrading performance. A large STT-MRAM first-level cache saves leakage power, and a small level-0 cache recovers the performance lost to the long write latency of STT-MRAM. This combination reduces the energy-delay product by 65% on average compared to a CMOS baseline. All-spin logic suffers from random bit flips that significantly impact the reliability of Boolean logic. Stochastic computing, which uses random bit streams for computation, has shown low hardware cost and high fault tolerance compared to conventional binary encoding. This motivates the use of ASL in stochastic computing to take advantage of its simplicity and fault tolerance. The finite-state machine (FSM), a sequential stochastic computing element, can compute complex functions, including exponentiation and hyperbolic tangent, more efficiently, but it suffers from long calculation latency and autocorrelation issues.
A parallel implementation scheme for the FSM is proposed that uses an estimator and a dispatcher to initialize the FSM directly to its steady state. It shows equivalent or better results than the serial implementation, at the cost of some hardware overhead. A re-randomizer that uses an up/down counter is also proposed to solve the autocorrelation issue.
en
Cache Hierarchy
Computer Architecture
Spintronic device
Stochastic computing
STT-MRAM
The Design of Spintronic-based Circuitry for Memory and Logic Units in Computer Systems
Thesis or Dissertation
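
The abstract describes FSM-based stochastic computing elements that evaluate functions such as the hyperbolic tangent on random bit streams. As a rough illustration only, the sketch below simulates the classic saturating up/down counter design from the stochastic computing literature in its serial form. It is not taken from the dissertation and does not model the proposed estimator, dispatcher, or re-randomizer. The function names (`bipolar_stream`, `fsm_tanh`, `decode_bipolar`), the bipolar encoding, and the 8-state default are all assumptions made for this example.

```python
import math
import random


def bipolar_stream(x, length, rng):
    """Encode x in [-1, 1] as a bit stream with P(bit = 1) = (x + 1) / 2.

    Assumption: bipolar encoding, a common choice in stochastic computing.
    """
    p_one = (x + 1.0) / 2.0
    return [1 if rng.random() < p_one else 0 for _ in range(length)]


def fsm_tanh(bits, n_states=8):
    """Serial FSM: a saturating up/down counter with n_states states.

    The output bit is 1 while the state is in the upper half of the range.
    For bipolar streams the decoded output roughly follows
    tanh(n_states / 2 * x); this is an approximation, not an exact identity.
    """
    half = n_states // 2
    state = half  # hypothetical initial state: middle of the state range
    out = []
    for bit in bits:
        out.append(1 if state >= half else 0)
        if bit:
            state = min(state + 1, n_states - 1)  # saturate at the top
        else:
            state = max(state - 1, 0)  # saturate at the bottom
    return out


def decode_bipolar(bits):
    """Map the fraction of ones back to a value in [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0


if __name__ == "__main__":
    rng = random.Random(0)
    for x in (-0.5, -0.2, 0.0, 0.2, 0.5):
        stream = bipolar_stream(x, 100_000, rng)
        estimate = decode_bipolar(fsm_tanh(stream))
        print(f"x={x:+.2f}  fsm={estimate:+.3f}  tanh(4x)={math.tanh(4 * x):+.3f}")
```

Because each output bit depends on the current state, consecutive output bits are correlated, and the FSM needs many input bits to settle. Those are the autocorrelation and latency costs that the abstract attributes to the serial implementation.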