Improvements in semiconductor technology and computer architecture have led to the proliferation of multicore and many-core processors. To improve the performance of multithreaded applications on multicore processors, hardware vendors have recently included support for transactional execution in the form of Hardware Transactional Memory (HTM) and Hardware Lock Elision (HLE). Under transactional execution, threads can speculatively execute in parallel and rely on runtime hardware to detect memory conflicts and to roll back and replay execution when required. If an application does not encounter frequent memory conflicts among threads, then transactional execution can result in better performance than mutex locks, due to the increased parallelism. Although primarily intended to improve multithreaded software performance, the introduction of hardware support for transactional execution presents exciting new avenues for addressing crucial research problems in a wider range of software. This thesis presents two novel applications of transactional execution to address performance and correctness challenges in software.

Most state-of-the-art processors implement relaxed memory consistency models in an attempt to extract more program performance. Different processor vendors implement different memory consistency models with varying memory ordering guarantees. The discrepancy among memory consistency models of different instruction set architectures (ISAs) presents a correctness problem in a cross-ISA system emulation environment. It is possible for the host system to reorder memory instructions in the guest application in a way that violates the guest memory consistency model. To guarantee correct emulation, a system emulator must insert special memory fence instructions as required. Transactional execution ensures that memory instructions within concurrent transactions appear to execute atomically and in isolation. Consequently, transactional semantics offer an alternative means of ordering instructions at a coarse-grained transaction level, and the implementation of hardware support for transactional execution provides an alternative to memory fences. This thesis tackles the correctness problem of memory consistency model emulation in system emulators by leveraging transactional execution support.

Extracting sufficient parallelism from sequential applications is paramount to improving their performance on multicore processors. Unfortunately, automatic parallelizing compilers are ineffective on a large class of sequential applications with ambiguous memory dependences. In the past, Thread-Level Speculation (TLS) has been proposed as a solution to speculatively parallelize sequential applications. TLS allows code segments from a sequential application to speculatively execute in parallel, and relies on runtime hardware support to detect memory conflicts and to roll back and replay execution. No current processor implements the hardware support required for TLS; however, the transactional execution support available in recent processors provides some of the features required to implement TLS. In this thesis, we propose software techniques to realize TLS by leveraging transactional execution support available on multicore processors. We evaluate the proposed TLS design and show that TLS improves the overall performance of a set of sequential applications, which cannot be parallelized by traditional means, by up to 11% as compared to their sequential versions.
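The speculate, detect, and replay cycle that TLS relies on can be illustrated with a small software model. The sketch below is not the design proposed in the thesis, which uses hardware transactional execution. It is a toy, single-threaded simulation, and every name in it (`Txn`, `run_speculative`, `scale`, `accumulate`, the `window` parameter) is hypothetical. Each loop iteration runs against a snapshot of memory, buffers its writes, and records its reads. Iterations then commit in program order, and any iteration that read an address written by an earlier iteration in the same window is rolled back and replayed.

```python
class Txn:
    """One speculative iteration: buffers writes and records reads (hypothetical helper)."""

    def __init__(self, memory):
        self.memory = memory
        self.read_set = set()
        self.write_buf = {}

    def load(self, addr):
        # Reads of our own buffered writes never conflict with other iterations.
        if addr in self.write_buf:
            return self.write_buf[addr]
        self.read_set.add(addr)
        return self.memory[addr]

    def store(self, addr, value):
        # Writes stay private until the iteration commits.
        self.write_buf[addr] = value


def run_speculative(memory, body, n_iters, window=4):
    """Run body(txn, i) for i in range(n_iters); return the number of replays.

    Assumption: parallel execution is modeled by letting every iteration in a
    window observe the same pre-window memory state.
    """
    replays = 0
    for start in range(0, n_iters, window):
        indices = range(start, min(start + window, n_iters))

        # Speculation phase: no iteration sees another's writes.
        txns = []
        for i in indices:
            txn = Txn(memory)
            body(txn, i)
            txns.append(txn)

        # Commit phase: program order, with conflict detection.
        dirty = set()  # addresses written by earlier iterations in this window
        for i, txn in zip(indices, txns):
            if txn.read_set & dirty:
                # Conflict: discard the speculative result and replay
                # against the now up-to-date memory.
                replays += 1
                txn = Txn(memory)
                body(txn, i)
            memory.update(txn.write_buf)
            dirty.update(txn.write_buf)
    return replays


def scale(txn, i):
    # Independent iterations: each touches only its own element.
    txn.store(("a", i), 2 * txn.load(("a", i)))


def accumulate(txn, i):
    # Loop-carried dependence through "sum".
    txn.store("sum", txn.load("sum") + txn.load(("a", i)))


mem = {("a", i): i for i in range(8)}
mem["sum"] = 0
replays_scale = run_speculative(mem, scale, 8)
replays_sum = run_speculative(mem, accumulate, 8)
```

In this model the independent loop commits without any replay, while the accumulation loop replays every iteration after the first in each window. That mirrors the point made above: speculation pays off only when memory conflicts are infrequent.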
University of Minnesota Ph.D. dissertation. May 2015. Major: Computer Science. Advisor: Antonia Zhai. 1 computer file (PDF); 105 pages.
Leveraging Hardware Support for Transactional Execution To Address Correctness and Performance Challenges in Software.
Retrieved from the University of Minnesota Digital Conservancy,