Example: Vector Code

Slide 15 of 41

Notes:

Let us look at a simple example that exposes some of the characteristics of the cache.

The following kernel vectorizes well and shows a nice balance among the functional units.

Because the cache is write-through and write-allocate, stores to memory are also written into the cache. The array on the left-hand side of the assignment therefore occupies cache space, just as the four arrays on the right-hand side do.
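The kernel itself appears on the slide and is not reproduced in these notes. As a purely hypothetical stand-in with the same shape (one array stored, four arrays loaded), a loop such as the following could be used. The array names and the arithmetic are assumptions chosen for illustration, not the slide's actual code.

```c
#include <stddef.h>

/* Hypothetical stand-in for the slide's kernel: one store stream (a)
   and four load streams (b, c, d, e). The names and the arithmetic
   are assumed for illustration only. */
void vector_kernel(size_t n, double *restrict a,
                   const double *restrict b, const double *restrict c,
                   const double *restrict d, const double *restrict e)
{
    for (size_t i = 0; i < n; i++) {
        /* Two multiplies and one add per iteration, with four loads
           and one store, so five arrays pass through the cache. */
        a[i] = b[i] * c[i] + d[i] * e[i];
    }
}
```

With a write-allocate cache, the store to a[i] brings that location into the cache, so all five arrays compete for cache space.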

Given this, for n up to approximately 6000 we should see good performance, since all five arrays fit in cache.
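The capacity estimate can be checked with simple arithmetic. The sketch below assumes 8-byte (double-precision) elements and five arrays in total. Neither the element size nor the cache size is stated in these notes, so both are assumptions, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Total bytes touched by n_arrays arrays of n elements each. */
size_t working_set_bytes(size_t n, size_t n_arrays, size_t elem_size)
{
    return n * n_arrays * elem_size;
}

/* Largest n for which all arrays fit in a cache of cache_bytes,
   ignoring associativity conflicts and any other cached data. */
size_t max_n_in_cache(size_t cache_bytes, size_t n_arrays, size_t elem_size)
{
    return cache_bytes / (n_arrays * elem_size);
}
```

Under these assumptions, working_set_bytes(6000, 5, 8) is 240,000 bytes. For a hypothetical 256 KiB cache, max_n_in_cache(256 * 1024, 5, 8) gives 6553, which would be in line with the "approximately 6000" figure.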

As n increases beyond that point, performance should begin to taper off, because cache locations are overwritten with new data before their contents can be reused.