TY - GEN
T1 - Generating physical addresses directly for saving instruction TLB energy
AU - Kadayif, I.
AU - Sivasubramaniam, A.
AU - Kandemir, M.
AU - Kandiraju, G.
AU - Chen, G.
PY - 2002/1/1
Y1 - 2002/1/1
N2 - Power consumption and power density for the Translation Lookaside Buffer (TLB) are important considerations not only in its design, but can have a consequence on cache design as well. This paper embarks on a new philosophy for reducing the number of accesses to the instruction TLB (iTLB) for power and performance optimizations. The overall idea is to keep a translation currently being used in a register and avoid going to the iTLB as far as possible - until there is a page change. We propose four different approaches for achieving this, and experimentally demonstrate that one of these schemes that uses a combination of compiler and hardware enhancements can reduce iTLB dynamic power by over 85% in most cases. These mechanisms can work with different instruction-cache (iL1) lookup mechanisms and achieve significant iTLB power savings without compromising on performance. Their importance grows with higher iL1 miss rates and larger page sizes. They can work very well with large iTLB structures, that can possibly consume more power and take longer to lookup, without the iTLB getting into the common case. Further, we also experimentally demonstrate that they can provide performance savings for virtually-indexed, virtually-tagged iL1 caches, and can even make physically-indexed, physically-tagged iL1 caches a possible choice for implementation.
AB - Power consumption and power density for the Translation Lookaside Buffer (TLB) are important considerations not only in its design, but can have a consequence on cache design as well. This paper embarks on a new philosophy for reducing the number of accesses to the instruction TLB (iTLB) for power and performance optimizations. The overall idea is to keep a translation currently being used in a register and avoid going to the iTLB as far as possible - until there is a page change. We propose four different approaches for achieving this, and experimentally demonstrate that one of these schemes that uses a combination of compiler and hardware enhancements can reduce iTLB dynamic power by over 85% in most cases. These mechanisms can work with different instruction-cache (iL1) lookup mechanisms and achieve significant iTLB power savings without compromising on performance. Their importance grows with higher iL1 miss rates and larger page sizes. They can work very well with large iTLB structures, that can possibly consume more power and take longer to lookup, without the iTLB getting into the common case. Further, we also experimentally demonstrate that they can provide performance savings for virtually-indexed, virtually-tagged iL1 caches, and can even make physically-indexed, physically-tagged iL1 caches a possible choice for implementation.
UR - http://www.scopus.com/inward/record.url?scp=84906809658&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84906809658&partnerID=8YFLogxK
U2 - 10.1109/MICRO.2002.1176249
DO - 10.1109/MICRO.2002.1176249
M3 - Conference contribution
AN - SCOPUS:84906809658
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 185
EP - 196
BT - Proceedings - 35th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2002
PB - IEEE Computer Society
T2 - 35th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2002
Y2 - 18 November 2002 through 22 November 2002
ER -