Browsing by Author "Chen, Howard"
Item: Performance of Runtime Optimization on BLAST (2004-10-15)
Das, Abhinav; Lu, Jiwei; Chen, Howard; Kim, Jinpyo; Yew, Pen-Chung; Hsu, Wei-Chung; Chen, Dong-yuan
Optimization of a real world application BLAST is used to demonstrate the limitations of static and profile-guided optimizations and to highlight the potential of runtime optimization systems. We analyze the performance profile of this application to determine performance bottlenecks and evaluate the effect of aggressive compiler optimizations on BLAST. We find that applying common optimizations (e.g. O3) can degrade performance. Profile guided optimizations do not show much improvement across the board, as current implementations do not address critical performance bottlenecks in BLAST. In some cases, these optimizations lower performance significantly due to unexpected secondary effects of aggressive optimizations. We also apply runtime optimization to BLAST using the ADORE framework. ADORE speeds up some queries by as much as 58% using data cache prefetching. Branch mispredictions can also be significant for some input sets. Dynamic optimization techniques to improve branch prediction accuracy are described and examined for the application. We find that the primary limitation to the application of runtime optimization for branch misprediction is the tight coupling between data and dependent branch. With better hardware support for influencing branch prediction, a runtime optimizer may deploy optimizations to reduce branch misprediction stalls.

Item: Phase Locality Detection Using a Branch Trace Buffer for Efficient Profiling in Dynamic Optimization (2002-02-07)
Hsu, Wei-Chung; Chen, Howard; Yew, Pen-Chung; Chen, Dong-yuan
Efficient profiling is a major challenge for dynamic optimization because the profiling overhead contributes to the total execution time. In order to identify program hot spots for runtime optimization, the profiler in a dynamic optimizer must detect new execution phases and subsequent phase changes. Current profiling approaches used in prototype dynamic optimizers are interpretation or instrumentation based. They have either very high overhead, or generate poor quality profiles. We use the branch trace buffer and the hardware performance monitoring features provided in the IA-64 architecture to detect new execution phases and phase changes. The branch trace buffer records the last few branch instructions executed before an event based interrupt is generated. Using the branch trace buffer, our profiler continuously samples execution paths leading to critical performance events, such as cache misses and pipeline stalls. A set of frequently executed traces and their respective performance characteristics within a time interval are considered as an execution phase. Since such phases tend to repeat over time, a dynamic optimizer can exploit the phase locality to drive optimization. We check for new phases and phase changes at the end of each time interval. Although a new phase or a changed phase can be a candidate for optimization, our phase detector delays the invocation of the optimizer until a relatively stable phase is detected. We report the effectiveness of various phase detection methods using Spec2000int as the benchmark. Our results indicate branch trace based phase detection can be suitable for dynamic binary optimizers.
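
The interval-based phase check described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the trace IDs, the `min_share` cutoff for a "frequently executed" trace, the use of Jaccard similarity to compare intervals, and the `stable_intervals` threshold are all assumptions chosen for illustration. In the paper the samples come from the IA-64 branch trace buffer; here they are plain lists of hypothetical trace IDs.

```python
from collections import Counter


def hot_traces(samples, min_share=0.2):
    """Return the trace IDs that make up at least min_share of the samples.

    min_share is an assumed cutoff, not a value taken from the paper.
    """
    counts = Counter(samples)
    total = len(samples)
    return frozenset(t for t, c in counts.items() if c / total >= min_share)


def jaccard(a, b):
    """Set similarity in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class PhaseDetector:
    """Hypothetical detector: compares each interval's hot-trace set with the
    reference set of the current phase and reports a stable phase only after
    several consecutive similar intervals."""

    def __init__(self, similarity=0.8, stable_intervals=2):
        self.similarity = similarity            # assumed similarity threshold
        self.stable_intervals = stable_intervals  # assumed stability window
        self.current = None                     # hot set of the current phase
        self.streak = 0                         # consecutive similar intervals

    def end_of_interval(self, samples):
        """Call once per time interval with that interval's sampled trace IDs."""
        hot = hot_traces(samples)
        changed = (self.current is None
                   or jaccard(hot, self.current) < self.similarity)
        if changed:
            self.current = hot
            self.streak = 1
        else:
            self.streak += 1
        if self.streak == self.stable_intervals:
            return "stable"  # the point at which an optimizer would be invoked
        return "change" if changed else "same"


if __name__ == "__main__":
    detector = PhaseDetector()
    intervals = [
        ["t1"] * 6 + ["t2"] * 4,
        ["t1"] * 5 + ["t2"] * 5,
        ["t1"] * 7 + ["t2"] * 3,
        ["t3"] * 8 + ["t4"] * 2,
    ]
    for samples in intervals:
        print(detector.end_of_interval(samples))
```

Returning "stable" only once per streak mirrors the idea of delaying the optimizer until a phase has settled, while a "change" result resets the streak so a short-lived phase never triggers optimization.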