Journal Article Details
ETRI Journal
Dual Cache Architecture for Low Cost and High Performance
Keywords: prefetching; spatial locality; temporal locality; dual data cache; memory hierarchy
Others  :  1184721
DOI  :  10.4218/etrij.03.0303.0015
【 Abstract 】

We present a high performance cache structure with a hardware prefetching mechanism that enhances exploitation of spatial and temporal locality. Temporal locality is exploited by selectively moving small blocks into the direct-mapped cache after monitoring their activity in the spatial buffer. Spatial locality is enhanced by intelligently prefetching a neighboring block when a spatial buffer hit occurs. We show that the prefetch operation is highly accurate: over 90% of all prefetches generated are for blocks that are subsequently accessed. Our results show that the system enables the cache size to be reduced by a factor of four to eight relative to a conventional direct-mapped cache while maintaining similar performance.
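
The mechanism described in the abstract can be illustrated with a toy simulator. This is only a sketch under stated assumptions, not the authors' design: the block sizes, the FIFO replacement in the spatial buffer, the "touched" set used to monitor activity, the choice of the next sequential large block as the prefetched neighbor, and every name (`DualCacheSketch`, `access`, `stats`) are hypothetical choices made for illustration.

```python
from collections import OrderedDict


class DualCacheSketch:
    """Toy model: a direct-mapped cache of small blocks plus a fully
    associative spatial buffer of large blocks. All sizes are assumed."""

    def __init__(self, dm_sets=8, small=8, large=32, sb_entries=4):
        self.small, self.large = small, large
        self.dm_sets = dm_sets
        self.sb_entries = sb_entries
        self.dm = {}             # set index -> small-block number held there
        self.sb = OrderedDict()  # large-block number -> touched small blocks
        self.stats = {"dm_hit": 0, "sb_hit": 0, "miss": 0, "prefetch": 0}

    def _insert_large(self, lb):
        """Place a large block in the spatial buffer (assumed FIFO eviction)."""
        if lb in self.sb:
            return
        if len(self.sb) >= self.sb_entries:
            _, touched = self.sb.popitem(last=False)
            # Selective move: only small blocks that were actually
            # referenced while buffered go to the direct-mapped cache.
            for sbn in touched:
                self.dm[sbn % self.dm_sets] = sbn
        self.sb[lb] = set()

    def access(self, addr):
        sbn = addr // self.small   # small-block number
        lb = addr // self.large    # large-block number
        if self.dm.get(sbn % self.dm_sets) == sbn:
            self.stats["dm_hit"] += 1
            return "dm_hit"
        if lb in self.sb:
            self.sb[lb].add(sbn)   # record activity in the spatial buffer
            self.stats["sb_hit"] += 1
            if lb + 1 not in self.sb:
                # Assumed neighbor: the next sequential large block.
                self._insert_large(lb + 1)
                self.stats["prefetch"] += 1
            return "sb_hit"
        # Miss in both structures: fetch the large block into the buffer.
        self.stats["miss"] += 1
        self._insert_large(lb)
        self.sb[lb].add(sbn)
        return "miss"
```

In this sketch, a block reaches the direct-mapped cache only after it has been referenced in the spatial buffer and its large block is evicted, which is one plausible reading of "selectively moving small blocks ... after monitoring their activity"; the paper itself should be consulted for the actual policy.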
