Scratchpad memories (SPMs) have been shown to be more energy efficient and to have faster access times than traditional hardware-managed caches. This, coupled with the predictability of data presence, makes SPMs an attractive alternative to caches for many scientific applications. In this work, we consider an SPM-based system for increasing the performance and energy efficiency of sparse matrix-vector multiplication on a chip multi-processor. We ensure efficient utilization of the SPM by profiling the application to identify the data structures that perform poorly in a traditional cache. We evaluate the impact of using an SPM at all levels of the on-chip memory hierarchy. Our experimental results on an 8-core system show an average performance improvement of 13.5-15% and an average reduction in energy consumption of 28-33%, depending on the level of the hierarchy at which the SPM is used.
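
For readers unfamiliar with the kernel, the following is a minimal sketch of sparse matrix-vector multiplication, assuming the common compressed sparse row (CSR) layout. The abstract does not name a storage format, so CSR, the function name `spmv_csr`, and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions rather than the implementation evaluated in this work. The sketch only shows that the kernel mixes sequentially streamed arrays with an indirectly indexed vector. Which data structures are actually placed in the SPM is decided by profiling and is not asserted here.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr, col_idx, and values are hypothetical names for the three
    CSR arrays; the storage format itself is an assumption.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # values and col_idx are read sequentially, one nonzero at a time.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read indirectly through col_idx, so its access
            # pattern depends on the sparsity structure of the matrix.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small 3x3 example matrix:
# [[4, 0, 1],
#  [0, 2, 0],
#  [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [7.0, 4.0, 18.0]
```

In this sketch each array has a different access pattern, which is the kind of per-data-structure behavior that profiling can expose.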