Scalable compiler optimizations for improving the memory system performance in multi- and many-core processors


Title

Scalable compiler optimizations for improving the memory system performance in multi- and many-core processors

Published Date

2014-09

Type

Thesis or Dissertation

Abstract

The last decade has seen the transition from unicore processors to their multi-core (and now many-core) counterparts. This transition has renewed the pressure on compiler developers to extract performance from these parallel processors. In addition to extracting parallelism, another important responsibility of a parallelizing (or optimizing) compiler is to improve the memory system performance of the source program. This is particularly important because multi-cores have accentuated the memory wall and the bandwidth wall.

In this thesis, we identify three key challenges facing compiler developers on current processors: (1) the diverse set of microarchitectures in existence at any time and, more importantly, the changes in microarchitecture between generations; (2) the poor performance of compilers on real applications that contain large scopes of statements amenable to optimization; and (3) the unscalability of compilers, a traditional limitation in which compilers optimize only small scopes to contain compile time and memory requirements, and thus lose optimization opportunities.

In this thesis, we make the following contributions to address the above challenges. (1) We revisit three compiler optimizations (loop tiling and loop fusion for enhancing temporal locality, and data prefetching for hiding memory latency) for improving memory (and parallel) performance in light of recent advances in microarchitecture, including deeper memory hierarchies, multithreading, (short-vector) SIMDization, and hardware prefetching, and propose generic algorithms implementable in production compilers for a range of processors. (2) We propose heuristics in a cost model to choose good statements to fuse, and also improve dependence analysis so that critical fusion opportunities in application programs are not lost when they exist. (3) The final contribution of this thesis is a solution to the unscalability problem. Based on program semantics, we devise a way to represent the entire program with far fewer representative statements and dependences, leading to significantly improved compile time and memory requirements for compilation. Thus, real applications can now be optimized not only efficiently, but also at very low overhead.
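For readers unfamiliar with the two locality transformations named in the abstract, the following is a minimal textbook-style sketch in C. It is not taken from the thesis and does not represent its algorithms or cost model. The function names and the tile size `TILE` are hypothetical, chosen only for illustration. Loop fusion merges two passes over the same data so that each element is reused while it is still in cache; loop tiling restructures a loop nest so that it works on small blocks that fit in cache.

```c
#include <stddef.h>

/* Hypothetical tile size; a real compiler would derive it from cache parameters. */
#define TILE 32

/* Unfused: b[] is written in one pass and read again in a second pass. */
void scale_then_sum_unfused(const double *a, double *b, size_t n,
                            double k, double *sum) {
    for (size_t i = 0; i < n; i++)
        b[i] = k * a[i];
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += b[i];
    *sum = s;
}

/* Fused: each b[i] is consumed immediately after it is produced. */
void scale_then_sum_fused(const double *a, double *b, size_t n,
                          double k, double *sum) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        b[i] = k * a[i];
        s += b[i];
    }
    *sum = s;
}

/* Tiled n-by-n matrix multiply, C = A * B, all matrices row-major. */
void matmul_tiled(const double *A, const double *B, double *C, size_t n) {
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0;
    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = (ii + TILE < n) ? ii + TILE : n;
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = (kk + TILE < n) ? kk + TILE : n;
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = (jj + TILE < n) ? jj + TILE : n;
                /* Work on one TILE x TILE block at a time. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double aik = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += aik * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both versions of each computation produce the same results; only the order of memory accesses changes. Whether fusion or tiling pays off in practice depends on the target microarchitecture, which is the kind of decision the abstract says the thesis addresses.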

Description

University of Minnesota Ph.D. dissertation. September 2014. Major: Computer Science. Advisor: Pen-Chung Yew. 1 computer file (PDF); x, 158 pages.

Suggested citation

Mehta, Sanyam. (2014). Scalable compiler optimizations for improving the memory system performance in multi- and many-core processors. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/168274.
