The SV1 Data Cache
Each SSP has its own cache for both scalar and vector loads
32K words (256 Kbytes) per SSP
SSP-to-cache bandwidth:
- Two read ports and one write port per pipe
- 4 words/CPU CP for reads (9.6 GB/sec)
- 2 words/CPU CP for writes (4.8 GB/sec)
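
The GB/sec figures follow directly from the per-clock word counts. A minimal sketch of the arithmetic, assuming 8-byte words (implied by 32K words = 256 Kbytes) and a 300 MHz CPU clock. The slide does not state the clock rate; 300 MHz is the value that makes its numbers consistent. The function name is illustrative only.

```python
WORD_BYTES = 8      # 32K words = 256 Kbytes implies 8-byte (64-bit) words
CLOCK_HZ = 300e6    # assumption: 300 MHz clock, inferred from the slide's figures


def port_bandwidth_gb_per_sec(words_per_cp):
    """Convert words per clock period (CP) into GB/sec."""
    return words_per_cp * WORD_BYTES * CLOCK_HZ / 1e9


read_bw = port_bandwidth_gb_per_sec(4)    # 4 words/CP for reads
write_bw = port_bandwidth_gb_per_sec(2)   # 2 words/CP for writes
print(f"reads:  {read_bw:.1f} GB/sec")
print(f"writes: {write_bw:.1f} GB/sec")
```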
Four-way set associative
- Replacement policy: Least Recently Used
Line size of eight words for scalar loads
Line size of one word for vector loads
- No wasted bandwidth for irregular strides through main memory
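
To see why a one-word line avoids wasted bandwidth on strided access, the sketch below counts how many of the words fetched from memory a strided load actually uses. It is a simplified model, not a description of the hardware: it assumes every touched line is fetched in full exactly once, and `useful_fraction` and its parameters are hypothetical names.

```python
def useful_fraction(stride_words, line_words, n_elements=1024):
    """Fraction of words fetched from memory that the load actually uses."""
    # Word addresses touched by a strided vector load starting at address 0.
    addresses = [i * stride_words for i in range(n_elements)]
    # Every distinct line touched is assumed to be fetched in full, once.
    lines_fetched = {addr // line_words for addr in addresses}
    words_fetched = len(lines_fetched) * line_words
    return len(addresses) / words_fetched


for stride in (1, 2, 8, 37):
    eight = useful_fraction(stride, line_words=8)
    one = useful_fraction(stride, line_words=1)
    print(f"stride {stride:2d}: 8-word line {eight:.3f}, 1-word line {one:.3f}")
```

In this model, an eight-word line uses only one word in eight once the stride reaches eight words or more, while a one-word line uses everything it fetches at any stride.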
“Write through” - each store goes to main memory
“Write allocate” - each store goes to cache
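
The policies listed above (four-way set associative, LRU replacement, write through, write allocate) can be illustrated with a toy model. This is a sketch under stated assumptions, not the SV1 design: the class name, the tiny default sizes, and the index function (line number modulo the number of sets) are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class ToyCache:
    """Set-associative, LRU, write-through, write-allocate cache model."""

    def __init__(self, n_sets=4, ways=4, line_words=1):
        self.n_sets = n_sets
        self.ways = ways
        self.line_words = line_words
        # One OrderedDict per set; keys are line numbers, oldest (LRU) first.
        self.sets = [OrderedDict() for _ in range(n_sets)]
        self.memory_writes = 0

    def _touch(self, addr):
        """Bring the line holding addr into cache; return True on a hit."""
        line = addr // self.line_words
        s = self.sets[line % self.n_sets]   # assumed index function
        hit = line in s
        if hit:
            s.move_to_end(line)             # mark as most recently used
        else:
            if len(s) == self.ways:
                s.popitem(last=False)       # evict the least recently used line
            s[line] = True
        return hit

    def load(self, addr):
        return self._touch(addr)

    def store(self, addr):
        self.memory_writes += 1             # write through: memory is always updated
        return self._touch(addr)            # write allocate: the line is also cached


cache = ToyCache()
cache.store(0)                # a miss, but the stored word is now in cache
print(cache.load(0))          # True: the later load hits
print(cache.memory_writes)    # 1: the store still went to main memory
```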
Notes:
- Lead-in: To help improve memory bandwidth, a cache now sits between the processor and memory, servicing vector operations as well as scalar ones.
- The key information to note here is the bandwidth. While main-memory-to-cache bandwidth is 2.5 GB/sec (3.2 GB/sec theoretical), cache-to-CPU bandwidth is 9.6 GB/sec.
- You can see potentially significant increases in memory bandwidth if the data you need resides in cache.
- Some other important things that have performance implications are….
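
The potential gain from cache residency can be estimated with a rough model built on the two figures in the notes: 9.6 GB/sec from cache and 2.5 GB/sec from main memory. The sketch assumes a fraction of the words is served at the cache rate and the rest at the memory rate, with no overlap between the two. It ignores latency, bank conflicts, and the write path, so treat the results as an illustration rather than a prediction. The function name is hypothetical.

```python
def effective_bandwidth(hit_fraction, cache_bw=9.6, memory_bw=2.5):
    """Average GB/sec when hit_fraction of the words come from cache."""
    # Time to move one GB is the weighted sum of the per-GB times of each path.
    time_per_gb = hit_fraction / cache_bw + (1.0 - hit_fraction) / memory_bw
    return 1.0 / time_per_gb


for h in (0.0, 0.5, 0.9, 1.0):
    print(f"hit fraction {h:.1f}: {effective_bandwidth(h):.2f} GB/sec")
```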