Just-In-Time Java Compilation for the Itanium Processor
Tatiana Shpeisman, Guei-Yuan Lueh, Ali-Reza Adl-Tabatabai
Intel Labs
Introduction
Itanium processor is a statically scheduled machine
- Aggressive compiler techniques are needed to extract ILP
Just-In-Time (JIT) compiler must be fast
- Must consider time & space efficiency of optimizations
- Balance compilation time with code quality
- Use heuristics for modeling the microarchitecture
- Leverage semantics and metadata of the JVM
Outline
- Introduction
- Compiler overview
- Register allocation
- Code scheduling
- Other optimizations
- Conclusions
Compiler Structure
Compilation time vs. code quality tradeoff
IPF architecture has large register files
- 128 integer, 128 floating-point, 64 predicate, 8 branch
- Register Stack Engine (RSE) provides 96 stack registers to each procedure
Use linear scan register allocation
- “Linear Scan Register Allocation” by Massimiliano Poletto and Vivek Sarkar
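A minimal sketch of linear scan in the style of Poletto and Sarkar, to make the compile-time argument concrete: one pass over intervals sorted by start point, spilling the active interval that ends last when no register is free. The function name, the `(start, end)` interval encoding, and the register numbering are assumptions for illustration; this is not the allocator used in the JIT described here.

```python
def linear_scan(intervals, num_regs):
    """Assign registers to live intervals in one pass.

    intervals: dict mapping a variable name to its (start, end) live interval.
    Returns (reg_of, spilled): a dict name -> register number, and the set of
    names that were spilled to memory.
    """
    free = list(range(num_regs))
    active = []      # (end, name) pairs currently holding a register, sorted by end
    reg_of = {}
    spilled = set()

    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg_of[old])

        if free:
            reg_of[name] = free.pop(0)
            active.append((end, name))
            active.sort()
        elif active and active[-1][0] > end:
            # Spill the active interval that ends last and reuse its register.
            _, victim = active[-1]
            reg_of[name] = reg_of.pop(victim)
            spilled.add(victim)
            active[-1] = (end, name)
            active.sort()
        else:
            spilled.add(name)

    return reg_of, spilled


reg_of, spilled = linear_scan(
    {"a": (0, 10), "b": (1, 3), "c": (2, 8), "d": (4, 6)}, num_regs=2
)
```

With two registers, `a` has the longest remaining interval when `c` arrives, so `a` is the one spilled; `d` then reuses the register freed by `b`.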
Coalescing Algorithm
Coalesce v and t in v = t iff
- Live interval of t ends at v = t, and
- Live interval of t does not intersect with live range of v
Requires one additional reverse pass over IR
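The two coalescing conditions can be sketched as a predicate. The encoding is an assumption made for illustration: positions are instruction indices, the live interval of t is a single half-open span `[start, end)`, and the live range of v is a list of half-open segments (a live range may have holes, a live interval may not). Under this encoding a segment of v that begins at the copy itself does not count as an intersection. The reverse pass that collects this information is not shown.

```python
def can_coalesce(copy_pos, t_interval, v_ranges):
    """Decide whether v and t in the copy `v = t` at copy_pos may share a register.

    t_interval: (start, end) half-open live interval of t.
    v_ranges:   list of (start, end) half-open segments where v is live.
    """
    t_start, t_end = t_interval

    # Condition 1: the live interval of t ends at the copy.
    if t_end != copy_pos:
        return False

    # Condition 2: no segment of v's live range overlaps t's live interval.
    return all(v_end <= t_start or v_start >= t_end for v_start, v_end in v_ranges)


ok = can_coalesce(5, (2, 5), [(5, 9)])                 # v becomes live only at the copy
t_lives_on = can_coalesce(5, (2, 7), [(5, 9)])         # t is still live after the copy
overlap = can_coalesce(5, (2, 5), [(0, 3), (5, 9)])    # v is live while t is live
```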
Coalescing Speedup
Code Scheduling
Scheduling unit is an extended basic block
- Middle exits are due to run-time exceptions
Type-based memory disambiguation
Use JVM metadata to disambiguate memory locations
- Type: integer, floating-point, object reference …
- Kind: object field, array element, virtual table address …
- Field id: putfield #10 vs. putfield #15
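A small sketch of how the three attributes above could drive a may-alias query in the scheduler. The `MemRef` record, its field names, and the string encodings of type and kind are hypothetical; the only idea taken from the slide is that two references cannot alias if they differ in type, kind, or field id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemRef:
    type: str                       # e.g. "int", "float", "ref"
    kind: str                       # e.g. "field", "array_element", "vtable"
    field_id: Optional[int] = None  # constant-pool index for field accesses


def may_alias(a: MemRef, b: MemRef) -> bool:
    """Conservative answer: True unless the JVM metadata proves the locations differ."""
    if a.type != b.type:
        return False
    if a.kind != b.kind:
        return False
    if a.kind == "field" and None not in (a.field_id, b.field_id):
        return a.field_id == b.field_id
    return True


# putfield #10 vs. putfield #15: same type and kind, different fields.
independent = not may_alias(MemRef("int", "field", 10), MemRef("int", "field", 15))
```

When the query returns False, the scheduler does not need a memory dependence edge between the two instructions.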
Type-Based Disambiguation
Exception Dependencies
Naive approach
- Exception checks end basic blocks
Our approach: an instruction depends on an exception check iff
- Its destination is live at the exception handler, or
- It is an exception check for a different exception type, or
- It is a memory reference that may be guarded by the check
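The three rules can be read as a dependence test between an instruction and an earlier exception check. In the sketch below the `Instr` and `Check` records and all of their fields are hypothetical, and "may be guarded by the check" is approximated as "the memory reference uses the variable that the check tests".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Check:
    exception_type: str                       # e.g. "NullPointerException"
    checked_var: str                          # variable the check tests
    live_at_handler: frozenset = frozenset()  # variables live at the handler


@dataclass(frozen=True)
class Instr:
    dest: Optional[str] = None
    exception_type: Optional[str] = None      # set only for exception checks
    is_memory_ref: bool = False
    base: Optional[str] = None                # base variable of a memory reference


def depends_on_check(instr: Instr, check: Check) -> bool:
    """True if instr must stay ordered after check."""
    # Rule 1: the destination is live at the exception handler.
    if instr.dest is not None and instr.dest in check.live_at_handler:
        return True
    # Rule 2: instr is itself a check for a different exception type.
    if instr.exception_type is not None and instr.exception_type != check.exception_type:
        return True
    # Rule 3: a memory reference that the check may guard.
    if instr.is_memory_ref and instr.base == check.checked_var:
        return True
    return False


null_check = Check("NullPointerException", "a", frozenset({"x"}))
guarded_load = Instr(dest="y", is_memory_ref=True, base="a")
free_add = Instr(dest="y")
```

Here `guarded_load` must stay below the null check on `a`, while `free_add` may be scheduled above it.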
Exception Dependencies
IPF Architecture
Execution (functional) unit type: M, I, F, B
Instruction (syllable) type: M, A, I, F, B, IL
Bundles, templates
- .mii .mi;;i .mil .mmi .m;;mi .mfi .mmf .mib .mbb .bbb .mmb .mfb
Instruction group: no RAW, WAW dependencies, with some exceptions
Template Selection
Pack instructions into bundles
- Choose slot for each instruction
- Insert NOP instructions
- Assign instructions to functional units
Problem
- Resource oversubscription
- Inaccurate bypass latencies
Algorithm
- Greedy slot assignment
- Sort instructions by syllable type
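A rough sketch of greedy slot assignment, assuming the input is one instruction group whose instructions are mutually independent and may be reordered freely. It sorts by syllable type, then repeatedly picks the template that absorbs the most pending instructions and pads the remaining slots with NOPs. The sort order, the tie-breaking rule, and the omission of the stop-bit templates and .mil are simplifications chosen for illustration; functional-unit assignment and bypass latencies are not modeled, and the actual heuristics of the JIT are not given in these slides.

```python
# Slot patterns from the template list above, without stop-bit variants and .mil.
TEMPLATES = ["mii", "mmi", "mfi", "mmf", "mib", "mbb", "bbb", "mmb", "mfb"]

# Hypothetical sort order for syllable types.
ORDER = ["M", "F", "I", "A", "B"]


def fits(syllable, slot):
    """An A-type syllable can use an M or an I slot; others need a matching slot."""
    if syllable == "A":
        return slot in ("m", "i")
    return slot == syllable.lower()


def pack(syllables):
    """Pack syllable types into bundles; returns a list of (template, slots)."""
    unknown = [s for s in syllables if s not in ORDER]
    if unknown:
        raise ValueError(f"unsupported syllable types: {unknown}")

    pending = sorted(syllables, key=ORDER.index)
    bundles = []
    while pending:
        best = None
        for template in TEMPLATES:
            rest = list(pending)
            slots = []
            for slot in template:
                pick = next((s for s in rest if fits(s, slot)), None)
                if pick is None:
                    slots.append("nop")
                else:
                    rest.remove(pick)
                    slots.append(pick)
            used = len(pending) - len(rest)
            # Keep the first template that places the most instructions.
            if best is None or used > best[0]:
                best = (used, template, slots, rest)
        _, template, slots, pending = best
        bundles.append((template, slots))
    return bundles


bundles = pack(["F", "M", "A", "A"])
```

For this group the sketch produces two bundles: an .mii bundle holding the M and both A syllables, and an .mfi bundle holding the F syllable with NOPs in the other two slots.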
Template Selection Heuristics
Bypass Latency Accuracy
Other optimizations
Predication
- Profitability depends on the benchmark
- Performance variations within 2%
Branch hints
- Up to 50% speedup from using branch hints
Sign-extension elimination
- 1% potential gain for our compiler
Conclusions
Light-weight optimization techniques for Itanium
Considering microarchitecture is important
Language semantics helps to improve ILP
- Type-based memory disambiguation
- Exception dependency elimination