Performance and power comparison of Thread Level Speculation in SMT and CMP architectures

Published Date

2007-10-30

Type

Report

Abstract

As technology advances, microprocessors that support multiple threads of execution on a single chip are becoming increasingly common. Improving the performance of general-purpose applications by extracting parallel threads is extremely difficult because of the complex control flow and ambiguous data dependences inherent to these applications. Thread-Level Speculation (TLS) enables speculative parallel execution of potentially dependent threads and ensures correct execution by providing hardware support to detect data dependence violations and to recover from speculation failures. TLS can be supported on a variety of architectures, among them Chip MultiProcessors (CMP) and Simultaneous MultiThreading (SMT). While numerous papers have compared the performance and power efficiency of SMT and CMP processors under various workloads, relatively little has been done to compare them in the context of TLS. Although CMPs use smaller and more power-efficient cores, resource sharing and constructive interference between speculative and non-speculative threads can potentially make SMT more power efficient. This paper aims to fill this void by extending a CMP and an SMT processor to support TLS and evaluating the performance and power efficiency of the resulting systems with speculative parallel threads extracted for the SPEC2000 benchmark suite. Because both SMT and CMP processors come in a large variety of configurations, we conduct our study on two architectures with equal die area and the same clock frequency. Our results show that, on selected SPEC2000 benchmarks, an SMT processor that supports four speculative threads outperforms a CMP processor that supports the same number of threads, uses the same die area, and operates at the same clock frequency by 23% while consuming only 8% more power. In terms of energy-delay product, the same SMT processor is approximately 10% more efficient than the CMP processor.
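
The TLS mechanism the abstract describes (speculative execution of potentially dependent threads, detection of data dependence violations, and recovery) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware scheme evaluated in the report: the names `execute` and `run_tls` are hypothetical, threads are modeled as ordered "epochs" that all run against one memory snapshot, and violations are checked at commit time rather than by hardware during execution.

```python
from typing import Callable, Dict, List, Set, Tuple

Memory = Dict[str, int]
ReadFn = Callable[[str], int]
WriteFn = Callable[[str, int], None]
Epoch = Callable[[ReadFn, WriteFn], None]  # hypothetical model of one speculative thread


def execute(epoch: Epoch, memory: Memory) -> Tuple[Set[str], Memory]:
    """Run one epoch against `memory`, buffering writes and tracking exposed reads."""
    read_set: Set[str] = set()
    write_buf: Memory = {}

    def read(addr: str) -> int:
        if addr in write_buf:        # the epoch sees its own buffered writes
            return write_buf[addr]
        read_set.add(addr)           # value came from outside the epoch
        return memory[addr]

    def write(addr: str, value: int) -> None:
        write_buf[addr] = value      # held back until the epoch commits

    epoch(read, write)
    return read_set, write_buf


def run_tls(memory: Memory, epochs: List[Epoch]) -> int:
    """Speculate all epochs on a snapshot, then commit in program order.

    Returns the number of epochs squashed and re-executed.
    """
    snapshot = dict(memory)
    speculative = [execute(e, snapshot) for e in epochs]  # stands in for parallel execution
    dirty: Set[str] = set()          # addresses written by already-committed epochs
    squashes = 0
    for epoch, (read_set, write_buf) in zip(epochs, speculative):
        if read_set & dirty:         # read a stale value: data dependence violation
            squashes += 1
            read_set, write_buf = execute(epoch, memory)  # recover by re-executing
        memory.update(write_buf)     # commit buffered writes
        dirty.update(write_buf)
    return squashes


# Example: epoch 1 depends on the value of "a" produced by epoch 0; epoch 2 is independent.
mem = {"a": 1, "b": 2, "c": 0, "d": 0}
work = [
    lambda read, write: write("a", read("a") + 10),
    lambda read, write: write("c", read("a") * 2),
    lambda read, write: write("d", read("b") + 1),
]
num_squashed = run_tls(mem, work)
```

In this example only the epoch that consumed a stale value of "a" is squashed and re-executed, so the final memory matches what sequential execution of the three epochs would produce.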

Series/Report Number

Technical Report; 07-026

Suggested citation

Packirisamy, Venkatesan; Zhai, Antonia; Hsu, Wei-Chung; Yew, Pen-Chung. (2007). Performance and power comparison of Thread Level Speculation in SMT and CMP architectures. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/215740.
