Slide 15 of 41
Notes:
- Let us look at a simple example that exposes some of the characteristics of the cache.
- The following kernel vectorizes well and shows a nice balance among the functional units.
- Because the cache is write-through and write-allocate, stores to memory are also written to cache. So the array being assigned occupies cache locations, just as the four arrays on the right-hand side do.
- Given this, for n up to approximately 6000 we should see good performance, since all five arrays fit in cache at once.
- As n increases beyond that, performance should begin to taper off, because cache locations are overwritten with new data before their contents can be reused.
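
The capacity argument in these notes can be checked with a little arithmetic. The sketch below is illustrative only: the notes state neither the element size nor the cache size, so the 8-byte elements and the 256 KiB capacity are assumptions, and the helper names are hypothetical. It also ignores associativity and conflict misses, so it gives an upper bound on the n that fits, not a prediction of measured performance.

```python
# Rough working-set estimate for a kernel that touches five arrays of length n
# (one assigned array plus four on the right-hand side).
# ASSUMPTIONS, not stated in the notes: 8-byte elements, 256 KiB cache.

BYTES_PER_ELEMENT = 8                 # assumed: double-precision values
NUM_ARRAYS = 5                        # 1 stored array + 4 loaded arrays
HYPOTHETICAL_CACHE_BYTES = 256 * 1024 # assumed cache capacity


def working_set_bytes(n, num_arrays=NUM_ARRAYS, elem_bytes=BYTES_PER_ELEMENT):
    """Bytes needed to hold all arrays in cache at the same time."""
    return n * num_arrays * elem_bytes


def max_n_in_cache(cache_bytes, num_arrays=NUM_ARRAYS, elem_bytes=BYTES_PER_ELEMENT):
    """Largest n for which every array still fits, ignoring conflict misses."""
    return cache_bytes // (num_arrays * elem_bytes)


if __name__ == "__main__":
    for n in (1000, 6000, 12000):
        ws = working_set_bytes(n)
        fits = ws <= HYPOTHETICAL_CACHE_BYTES
        print(f"n={n:6d}  working set={ws:8d} bytes  fits={fits}")
    print("largest n that fits:", max_n_in_cache(HYPOTHETICAL_CACHE_BYTES))
```

Under these assumptions, n = 6000 needs 240,000 bytes, which sits just under the assumed capacity, while doubling n no longer fits. With a different element size or cache size the crossover point moves accordingly.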