Scalable High-Fidelity Simulation of Turbulence With Neko Using Accelerators

Authors: Niclas Jansson (KTH Royal Institute of Technology), Martin Karp (KTH Royal Institute of Technology), Jacob Wahlgren (KTH Royal Institute of Technology), Stefano Markidis (KTH Royal Institute of Technology), Philipp Schlatter (KTH Royal Institute of Technology)

Abstract: The recent trend toward more diverse and heterogeneous hardware in High-Performance Computing is challenging scientific software developers in their pursuit of efficient numerical methods with sustained performance across a diverse set of platforms. As a result, researchers today are forced to refactor their codes to leverage these powerful new heterogeneous systems. We present Neko, a portable framework for high-fidelity spectral element flow simulations. Unlike prior works, Neko adopts a modern object-oriented Fortran 2008 approach, allowing multi-tier abstractions of the solver stack and facilitating various hardware backends, ranging from general-purpose processors and accelerators down to exotic vector processors and Field-Programmable Gate Arrays (FPGAs). Focusing on the performance and portability of Neko, we describe the framework's device abstraction layer, which manages device memory, data transfer and kernel launches from Fortran, allowing the solver to be written in a hardware-neutral yet performant way. Accelerator-specific optimisations are also discussed, including auto-tuning of key kernels and various communication strategies using device-aware MPI. Finally, we present performance measurements on a wide range of computing platforms, including the EuroHPC pre-exascale system LUMI, where Neko achieves excellent parallel efficiency for a large direct numerical simulation (DNS) of turbulent fluid flow using up to 80% of the entire LUMI supercomputer.

Paper: PDF