Table of Contents
Cray SV1 Application Performance
Goals
OSC Systems
Outline
SV1 Architecture Highlights
SV1 Architecture
The SV1 Data Cache
The Multi-Stream Processor
The Multi-Stream Processor
Memory Structure
Single Processor Performance Issues
Vector Code Performance
Cache Considerations
Memory Bank Conflicts
Example: Vector Code
In-Cache vs Out-of-Cache
Example: Laplace Equation Solver
SSP Performance: Vector Version
SSP Memory Bandwidth: Vector
SSP Performance: Gaussian98
Gaussian98
Gaussian98
Single Processor Performance Issues: Lessons Learned
Multi-Processor Performance Issues
The Multi-Stream Processor
Laplace Equation Solver - Streamed
Computational and Memory Performance
Nonlinear Wave Equation Solver
Code Version 2 - Streamed
Code Version 3 - Streamed
Computational Performance
LS-DYNA
Configurations Examined
Elapsed Time
LS-DYNA Observations
MPI Performance: QCDMPI
QCDMPI
QCDMPI
Lessons Learned
Conclusions
Additional Information