Scalable compiler optimizations for improving the memory system performance in multi- and many-core processors
2014-09
Type
Thesis or Dissertation
Abstract
The last decade has seen the transition from unicore processors to their multi-core (and now many-core) counterparts. This transition has renewed the focus on compiler developers to extract performance from these parallel processors. In addition to extracting parallelism, another important responsibility of a parallelizing (or optimizing) compiler is to improve the memory system performance of the source program. This is particularly important because multi-cores have accentuated the memory wall and the bandwidth wall.
In this thesis, we identify three key challenges facing compiler developers on current processors: (1) the diverse set of microarchitectures in existence at any time and, more importantly, the changes in microarchitecture between generations; (2) the poor performance of compilers on real applications that contain large scopes of statements amenable to optimization; and (3) the unscalability of compilers, a traditional limitation whereby compilers choose to optimize small scopes to contain compile time and memory requirements, and thus lose optimization opportunities.
We make the following contributions to address these challenges. (1) We revisit three compiler optimizations (loop tiling and loop fusion for enhancing temporal locality, and data prefetching for hiding memory latency) for improving memory (and parallel) performance in light of recent advances in microarchitecture, including deeper memory hierarchies, multithreading, (short-vector) SIMDization, and hardware prefetching, and we propose generic algorithms implementable in production compilers for a range of processors. (2) We propose heuristics in a cost model to choose good statements to fuse, and we improve dependence analysis so that critical fusion opportunities in application programs are not lost when they exist. (3) The final contribution of this thesis is a solution to the unscalability problem. Based on program semantics, we devise a way to represent the entire program with far fewer representative statements and dependences, leading to significantly improved compile time and memory requirements for compilation. Thus, real applications can now be optimized not only efficiently but also at very low overhead.
Description
University of Minnesota Ph.D. dissertation. September 2014. Major: Computer Science. Advisor: Pen-Chung Yew. 1 computer file (PDF); x, 158 pages.
Suggested citation
Mehta, Sanyam. (2014). Scalable compiler optimizations for improving the memory system performance in multi- and many-core processors. Retrieved from the University Digital Conservancy, https://hdl.handle.net/11299/168274.