TY - GEN
T1 - Machine learning techniques for improved data prefetching
AU - Guttman, Diana
AU - Kandemir, Mahmut Taylan
AU - Arunachalam, Meena
AU - Khanna, Rahul
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/12/9
Y1 - 2015/12/9
N2 - With the advent of teraflop-scale computing on both a single coprocessor and many-core designs, there is tremendous need for techniques to fully utilize the compute power by keeping cores fed with data. Data prefetching has been used as a popular method to hide memory latencies by fetching data proactively before the processor needs the data. Fetching data ahead of time from the memory subsystem into faster caches reduces observable latencies or wait times on the processor end and this improves overall program execution times. We study two types of prefetching techniques that are available on a 61-core Intel Xeon Phi co-processor, namely software (compiler-guided) prefetching and hardware prefetching on a variety of workloads. Using machine learning techniques, we synthesize workload phases and the sequence of phase patterns using raw performance data from hardware counters such as memory bandwidth, miss ratios, prefetches issued, etc. Furthermore, we use performance data from workloads with different impacts and behaviors under various prefetcher settings. Our contribution can help in future prefetching design in the following ways: (1) to identify phases within workloads that have different characteristics and behaviors and help dynamically modify prefetch types and intensities to suit the phase; (2) to manage auto setting of prefetcher knobs without great effort from the user; (3) to influence software and hardware prefetching interaction designs in future processors; and (4) to use valuable insights and performance data in many areas such as power provisioning for the nodes in a large cluster to maximize both energy and performance efficiencies.
AB - With the advent of teraflop-scale computing on both a single coprocessor and many-core designs, there is tremendous need for techniques to fully utilize the compute power by keeping cores fed with data. Data prefetching has been used as a popular method to hide memory latencies by fetching data proactively before the processor needs the data. Fetching data ahead of time from the memory subsystem into faster caches reduces observable latencies or wait times on the processor end and this improves overall program execution times. We study two types of prefetching techniques that are available on a 61-core Intel Xeon Phi co-processor, namely software (compiler-guided) prefetching and hardware prefetching on a variety of workloads. Using machine learning techniques, we synthesize workload phases and the sequence of phase patterns using raw performance data from hardware counters such as memory bandwidth, miss ratios, prefetches issued, etc. Furthermore, we use performance data from workloads with different impacts and behaviors under various prefetcher settings. Our contribution can help in future prefetching design in the following ways: (1) to identify phases within workloads that have different characteristics and behaviors and help dynamically modify prefetch types and intensities to suit the phase; (2) to manage auto setting of prefetcher knobs without great effort from the user; (3) to influence software and hardware prefetching interaction designs in future processors; and (4) to use valuable insights and performance data in many areas such as power provisioning for the nodes in a large cluster to maximize both energy and performance efficiencies.
UR - http://www.scopus.com/inward/record.url?scp=84963526736&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84963526736&partnerID=8YFLogxK
U2 - 10.1109/ICEAC.2015.7352208
DO - 10.1109/ICEAC.2015.7352208
M3 - Conference contribution
AN - SCOPUS:84963526736
T3 - 5th International Conference on Energy Aware Computing Systems and Applications, ICEAC 2015
BT - 5th International Conference on Energy Aware Computing Systems and Applications, ICEAC 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th International Conference on Energy Aware Computing Systems and Applications, ICEAC 2015
Y2 - 24 March 2015 through 26 March 2015
ER -