CUG2025 Proceedings



A
Abraham, Subil · CPE in a Container · view
Evaluating the Performance of Containerized ML and LLM Applications on the Frontier and Odo Supercomputers · pdf, pdf · view
Accola, Michael · Proactive Health Monitoring and Maintenance of High-Speed Slingshot Fabrics in HPC Environments · pdf, pdf · view
Acreman, David · Bit-reproducibility in UK Met Office Weather and Climate Applications · pdf · view
Ahobala Rao, Ramya · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Aielli, Roberto · Redefining Weather Forecasting Systems: The Transition to ICON and Alps · pdf, pdf · view
Akinyemi, Emma · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Alam, Sadaf · Evaluation of the Nvidia Grace Superchip in the HPE/Cray XD Isambard 3 supercomputer · pdf, pdf · view
Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Rethinking Interactive HPC Resource Access: Enhancing Security and Flexibility · pdf, pdf · view
Kubernetes on HPE Supercomputers BoF · pdf, pdf · view
Alaoui, Naïma · Optimizing GPU Frequency for Sustainable HPC: Lessons Learned from a Year of Production on Adastra, an AMD GPU Supercomputer · pdf, pdf · view
Ali, Atif · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Allan, Ben · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Allen, Ben · Addressing Resource Constraints on Aurora with Admin Access Nodes · pdf, pdf · view
Anderson, Kaylie · CPE in a Container · view
CPE Futures · view
Anisimov, Victor · Scaling MPI Applications on Aurora · pdf, pdf · view
Arenaz, Manuel · Codee: A Tool to Enhance Correctness, Modernization, Security, Portability and Optimization in Fortran and C/C++ Software Applications · pdf · view
Exploring the Challenges of the World-Class HPE Cray Programming Environment for Modern Software Development in Fortran · pdf · view
Automated Inspection of Fortran/C/C++ Code Using Codee for Correctness, Modernization, Optimization, and Security on HPE/Cray · pdf, pdf · view
Arndt, William · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Ashton, Alun · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Azgomi, Houfar · The HPE Slingshot 400 Expedition · pdf, pdf · view

B
Bachman, Scott · MARBLChapel: Fortran-Chapel Interoperability in an Ocean Simulation · pdf · view
Barker, Ashley · Panel: The Future of Precision in HPC, which FP is the Right One? · view
CUG Organizational Update · view
Welcome from the CUG President, Ashley Barker · view
Barlow, Aaron · Employing a Software-Driven Approach to Scalable HPC System Management · pdf · view
Barnes, Ross · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Basri K S, Shreyas Vinayaka · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Basso, Matteo · Experimenting with Security Compliance Checking using ReFrame · pdf, pdf · view
Beausoleil, Ray · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Benini, Massimo · Experimenting with Security Compliance Checking using ReFrame · pdf, pdf · view
CUG SIG System Monitoring Working Group BoF · pdf, pdf · view
Bhatia, Vishal · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Bhattacharya, Suparna · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Bianco, Mauro · Redefining Weather Forecasting Systems: The Transition to ICON and Alps · pdf, pdf · view
Biddiscombe, John · Modern Software Deployment on a Multi-Tenant Cray-EX System · pdf · view
Bissa, Ravi · CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Slingshot Host Software Ethernet Tuning · pdf · view
Blackworth, Cyrus · Addressing Resource Constraints on Aurora with Admin Access Nodes · pdf, pdf · view
Bonesana, Ivano · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
CSCS' journey towards complete platform automation in a multi-tenant environment · pdf · view
Brandt, Jim · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Bresniker, Kirk · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Brewer, Wesley · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Brickman, Bryan · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Brown, Kevin · Analyzing a Lifetime of Failures on a Cray XC40 Supercomputer · pdf, pdf · view
Brown, Nick · What is RISC-V and why should we care? · pdf, pdf · view
Byrne, John L · Global Distributed Client-side Cache for DAOS · pdf, pdf · view

C
C, Amitha · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Cain, Kenneth · Exploring High Performance Storage with DAOS · pdf · view
Calder, Alan · Supernovae in HPC: Benchmarking FLASH Across Advanced Computing Clusters · pdf · view
Carlson, Dave · Python Management · view
Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Carns, Philip · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Expanding Community Access to Real-World HPC Application I/O Characterization Data Using Darshan · pdf, pdf · view
Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Carothers, Christopher · Analyzing a Lifetime of Failures on a Cray XC40 Supercomputer · pdf, pdf · view
Carrier, John · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Caubet, Marc · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Ceriani, Andrea · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Chaarawi, Mohamad · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Exploring High Performance Storage with DAOS · pdf · view
Chapman, Barbara · CPE Testing · view
A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
CPE Futures · view
EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs · pdf · view
Chatterjee, Soumitra · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Childers, Lisa · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Coles, Jonathan · A journey to provide GH200 · pdf · view
Conciatore, Dino · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Kubernetes on HPE Supercomputers BoF · pdf, pdf · view
Cook, Brandon · CPE Testing · view
MPI implementation optimization for Slingshot network · pdf, pdf · view
HPC workload characterization using eBPF · pdf, pdf · view
Coverston, Jason · CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Crasta, Clarete R. · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Cruz, Felipe · Evolving Sarus to augment Podman for HPC on Cray EX · pdf · view
Cumming, Ben · CPE in a Container · view
Modern Software Deployment on a Multi-Tenant Cray-EX System · pdf · view
Hands on with uenv and CPE in a container with Grace Hopper on Alps · pdf · view
Cush, Michael · Proactive Health Monitoring and Maintenance of High-Speed Slingshot Fabrics in HPC Environments · pdf, pdf · view

D
Dabin, Alejandro · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
CSCS' journey towards complete platform automation in a multi-tenant environment · pdf · view
Dahal, Bishwo · Evaluating the Performance of Containerized ML and LLM Applications on the Frontier and Odo Supercomputers · pdf, pdf · view
Dai, Dong · Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Damkroger, Trish · HPE 1 on 100 with Trish Damkroger (HPE Customers only. No HPE partners or CUG sponsors) · view
CUG2026 site presentation · view
Dave, Rishabh · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Davi, Caio · HPE Slingshot in the Kubernetes Ecosystem · pdf, pdf · view
Dessouky, Monica · Pragmatic Security Audits: Fortifying HPC Environments at a Consumable Pace · pdf, pdf · view
Dey, Troy · System Visualization Using Rackmap · view
Dhakal, Aditya · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Di Maria, Riccardo · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Separating concerns: Decoupling the Slingshot Fabric Manager from Cray System Management · pdf · view
Di Pietrantonio, Cristian · CPE Testing · view
Python Management · view
Porting Radio Astronomy Correlation to Setonix, a HPE Cray EX system powered by AMD GPUs · pdf, pdf · view
Sharing is Caring: Tackling Node-Sharing Challenges at CUG Sites · pdf · view
Dickmann, Dennis · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Ding, Pengfei · Sharing is Caring: Tackling Node-Sharing Challenges at CUG Sites · pdf · view
Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Doherty, Ronan · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Donato, Evan · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Dong, Wenqian · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Dorier, Matthieu · Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Dorsch, Juan Pablo · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Drescher, Lukas · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Dwaraki, Abhishek · Global Distributed Client-side Cache for DAOS · pdf, pdf · view

E
Elwasif, Wael · Fine-Grained Application Energy and Power Measurements on the Frontier Exascale System · pdf, pdf · view
Emberson, David · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Enkovaara, Jussi · Enabling km-scale coupled climate simulations with ICON on AMD GPUs · pdf · view
Evans, Lance · Global Distributed Client-side Cache for DAOS · pdf, pdf · view

F
Faanes, Gregory · The HPE Slingshot 400 Expedition · pdf, pdf · view
Faraboschi, Paolo · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Feichtinger, Derek · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Fink, Andreas · Modern Software Deployment on a Multi-Tenant Cray-EX System · pdf · view
Flores, Leo · HPE Cray EX225a (MI300a) Blade Power Capping and HBM Page Retirement · pdf · view
Foltin, Martin · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Friesen, Brian · HPC workload characterization using eBPF · pdf, pdf · view
Fuhrer, Oliver · Redefining Weather Forecasting Systems: The Transition to ICON and Alps · pdf, pdf · view

G
Gamboni, Chris · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Separating concerns: Decoupling the Slingshot Fabric Manager from Cray System Management · pdf · view
Gayatri, Rahulkumar · MPI implementation optimization for Slingshot network · pdf, pdf · view
Geil, Afton · MPI implementation optimization for Slingshot network · pdf, pdf · view
Gentile, Ann · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Germann, Elsa · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Ghosh, Chinmay · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Gila, Miguel · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
CSCS' journey towards complete platform automation in a multi-tenant environment · pdf · view
A journey to provide GH200 · pdf · view
Godfrey, Forest · Proactive Health Monitoring and Maintenance of High-Speed Slingshot Fabrics in HPC Environments · pdf, pdf · view
Math in Your Network: Slingshot Hardware Accelerated Reductions · pdf · view
Slingshot Host Software Ethernet Tuning · pdf · view
Best Practices For Operating and Maintaining Slingshot Fabrics · pdf · view
Green, Jennifer · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Green, Thomas · Evaluation of the Nvidia Grace Superchip in the HPE/Cray XD Isambard 3 supercomputer · pdf, pdf · view
Gsell, Achim · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Gueroudji, Amal · Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Gupta, Lipi · CUG 2026 Elections: Candidate Statements · view
Guyan, Pete · CUG SIG System Monitoring Working Group BoF · pdf, pdf · view
Managing System Reliability: From system acceptance through production · view
A Brief Summary of the HPCM (HPE Performance Cluster Manager) Evolution Over Recent Releases · view
System Visualization Using Rackmap · view
Monitoring HPE Cray HPC systems · pdf, gz, gz · view

H
Hagerty, Nicholas · Deploying and Tracking Software with NCCS Software Provisioning · pdf, pdf · view
Han, Stephen · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Hanson, Jeff · CUG SIG System Monitoring Working Group BoF · pdf, pdf · view
Harake, Hussein · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Harms, Kevin · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Expanding Community Access to Real-World HPC Application I/O Characterization Data Using Darshan · pdf, pdf · view
Harris, Christopher · Porting Radio Astronomy Correlation to Setonix, a HPE Cray EX system powered by AMD GPUs · pdf, pdf · view
Harris, Naomi · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Harrison, Robert · Welcome by Stony Brook University · view
Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Harshbarger, Ben · MARBLChapel: Fortran-Chapel Interoperability in an Ocean Simulation · pdf · view
Hautreux, Gabriel · Optimizing GPU Frequency for Sustainable HPC: Lessons Learned from a Year of Production on Adastra, an AMD GPU Supercomputer · pdf, pdf · view
Heichler, Jan · VAST Data Platform · pdf · view
Hennecke, Michael · DAOS - New Horizons for High Performance Storage · pdf · view
Hernandez, Oscar · Fine-Grained Application Energy and Power Measurements on the Frontier Exascale System · pdf, pdf · view
Quantifying Message Aggregation Optimisations for Energy Savings in PGAS Models · pdf, pdf · view
Herrera, Juan · Python Management · view
Hoefler, Torsten · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Holanda Rusu, Victor · Experimenting with Security Compliance Checking using ReFrame · pdf, pdf · view
Hong Enriquez, Rolando Pablo · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Hoppe, Dennis · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view

I
Ibeid, Huda · Scaling MPI Applications on Aurora · pdf, pdf · view

J
Jackson, Adrian · Exploring High Performance Storage with DAOS · pdf · view
Jain, Amit · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Jansson, Niclas · Task-decomposed Overlapped Pressure Preconditioner for Sustained Strong Scalability on Accelerated Exascale Systems · pdf · view
Johnson, K. Grace · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Jones, Matthew D. · Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Jourdain, Cedric · CPE Testing · view

K
Kabel, Jeff · Proactive Health Monitoring and Maintenance of High-Speed Slingshot Fabrics in HPC Environments · pdf, pdf · view
Kaplan, Larry · Detecting operating system noise with detect-detour · pdf, pdf · view
Scaling MPI Applications on Aurora · pdf, pdf · view
Rethinking Interactive HPC Resource Access: Enhancing Security and Flexibility · pdf, pdf · view
BoF on Transforming Hybrid Workflows: The Role of HPE Cray Supercomputing User Services Software in Bridging HPC and AI · pdf · view
HPE Cray EX225a (MI300a) Blade Power Capping and HBM Page Retirement · pdf · view
Karanth, Vinay · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Kayabay, Kerem · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Keller, Patrick · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Khalsa, Siri Vias · From Weeks to Hours: Harnessing Configuration Management and Deployment Pipelines · pdf, pdf · view
Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Khosravi, Ali · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Klein, Mark · Alps, a versatile research infrastructure · pdf, pdf · view
Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Separating concerns: Decoupling the Slingshot Fabric Manager from Cray System Management · pdf · view
A journey to provide GH200 · pdf · view
Kleyn, Gerald · HPE Corporate Update, Gerald Kleyn · view
KN, Nagaraju · Detecting operating system noise with detect-detour · pdf, pdf · view
Koomthanam, Annmary Justine · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Koutsaniti, Eirini · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Kramer, Matt · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Kraushaar, Matthias · Redefining Weather Forecasting Systems: The Transition to ICON and Alps · pdf, pdf · view
Krotkiewski, Marcin · Using Different MPI Implementations on HPE Cray EX Supercomputers for Native and Containerized Applications Execution · pdf · view
Kulkarni, Shreyas · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Kumaran, Kalyan · Scaling MPI Applications on Aurora · pdf, pdf · view
Kuno, Harumi · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Kwack, JaeHyuk · Scaling MPI Applications on Aurora · pdf, pdf · view

L
Lan, Zhiling · Analyzing a Lifetime of Failures on a Cray XC40 Supercomputer · pdf, pdf · view
Lasoń, Patryk · New Member Site: Introducing Cyfronet · pdf · view
Latham, Robert · Expanding Community Access to Real-World HPC Application I/O Characterization Data Using Darshan · pdf, pdf · view
Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Lavely, Adam · MPI implementation optimization for Slingshot network · pdf, pdf · view
Law, Randy · HPE Cray EX225a (MI300a) Blade Power Capping and HBM Page Retirement · pdf · view
Lazzaro, Alfio · Using Different MPI Implementations on HPE Cray EX Supercomputers for Native and Containerized Applications Execution · pdf · view
Lee, Gwangmu · Evolving Sarus to augment Podman for HPC on Cray EX · pdf · view
Lee, Sekwon · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Lenard, Ben · Addressing Resource Constraints on Aurora with Admin Access Nodes · pdf, pdf · view
Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Liang, Zhen · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Littlewood, Ian · How Best to Leverage Cloud for (Big) HPC Sites · pdf · view
Loewe, William · E2000 Performance From Microbenchmarks to Applications · pdf, pdf · view
Lombardi, Johann · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Longley, Harold · Building non-standard images for CSM systems · pdf, pdf · view
CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Monitoring HPE Cray HPC systems · pdf, gz, gz · view
Lopatina, Lena · CUG SIG System Monitoring Working Group BoF · pdf, pdf · view
Lovell-Troy, Alex · From Weeks to Hours: Harnessing Configuration Management and Deployment Pipelines · pdf, pdf · view
Lueninghoener, Cory · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view

M
M, Ashalatha A. · CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Maccarthy, Elijah · Evaluating the Performance of Containerized ML and LLM Applications on the Frontier and Odo Supercomputers · pdf, pdf · view
Deploying and Tracking Software with NCCS Software Provisioning · pdf, pdf · view
Madonna, Alberto · Evolving Sarus to augment Podman for HPC on Cray EX · pdf · view
Mahadevan, Nilakantan · Scaling MPI Applications on Aurora · pdf, pdf · view
Maiterth, Matthias · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Malaboeuf, Etienne · Optimizing GPU Frequency for Sustainable HPC: Lessons Learned from a Year of Production on Adastra, an AMD GPU Supercomputer · pdf, pdf · view
Malaya, Nicholas · AMD: The Unreasonable Effectiveness of FP64 Precision Arithmetic · view
Mallick, Tanwi · Analyzing a Lifetime of Failures on a Cray XC40 Supercomputer · pdf, pdf · view
Markomanolis, George · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Performance Analysis on AMD GPUs · pdf · view
Marsella, Luca · Divide and Rule: Automated Workload Distribution for Efficient User Support Services · pdf · view
Martin, Joshua · Supernovae in HPC: Benchmarking FLASH Across Advanced Computing Clusters · pdf · view
Martin, Steven · HPE Cray EX225a (MI300a) Blade Power Capping and HBM Page Retirement · pdf · view
Martinasso, Maxime · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Alps, a versatile research infrastructure · pdf, pdf · view
Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Rethinking Interactive HPC Resource Access: Enhancing Security and Flexibility · pdf, pdf · view
AlpsB – a Geographically Distributed Infrastructure to Facilitate Large-Scale Training of Weather and Climate AI Models · pdf · view
McIntosh-Smith, Simon · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Mehta, Abhishek · Hardware Triage Tool: Enhancements and Extensions · pdf · view
Mehta, Neil · MPI implementation optimization for Slingshot network · pdf, pdf · view
Mehta, Sanyam · EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs · pdf · view
Mendonca, Henrique · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Miller, Sue · Managing System Reliability: From system acceptance through production · view
A Brief Summary of the HPCM (HPE Performance Cluster Manager) Evolution Over Recent Releases · view
Monitoring HPE Cray HPC systems · pdf, gz, gz · view
Milojicic, Dejan · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Mishra, Tulsi · BoF on Transforming Hybrid Workflows: The Role of HPE Cray Supercomputing User Services Software in Bridging HPC and AI · pdf · view
Mohamed, Fawzi · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Mohseni, Masoud · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Moore, Dave · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Moore, Michael · E2000 Performance From Microbenchmarks to Applications · pdf, pdf · view
Morecroft, Lee · A Brief Summary of the HPCM (HPE Performance Cluster Manager) Evolution Over Recent Releases · view
Morozov, Vitali · Scaling MPI Applications on Aurora · pdf, pdf · view
Mujkanovic, Nina · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Mukundan, Nikhil · Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Müller, Markus Michael · New Member Site: Introducing LRZ · pdf · view
Muralidharan, Servesh · Scaling MPI Applications on Aurora · pdf, pdf · view

N
Nalla, Ravikanth · CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Neth, Brandon · MARBLChapel: Fortran-Chapel Interoperability in an Ocean Simulation · pdf · view
Nguyen, Anthony-Trung · Scaling MPI Applications on Aurora · pdf, pdf · view
Nishtala, Aditya · Scaling MPI Applications on Aurora · pdf, pdf · view
Nitzberg, Bill · Altair: AI/ML Intelligent Scheduling for HPC with Altair® · pdf · view
How Best to Leverage Cloud for (Big) HPC Sites · pdf · view

O
Ockerman, Seth · Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Offenhäuser, Philipp · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Oganezov, Alexander · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Okuno, William · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Over, Jan · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view

P
Pachchigar, Shubh · HPC workload characterization using eBPF · pdf, pdf · view
Pagnamenta, Francesco · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Palme, Elia · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Parker, Scott · Scaling MPI Applications on Aurora · pdf, pdf · view
Passerini, Marco · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Patel, Sahil · HPE Slingshot Monitoring Software: Actionable Insights for HPC and AI Systems · pdf · view
Pawlik, Maciej · Using Different MPI Implementations on HPE Cray EX Supercomputers for Native and Containerized Applications Execution · pdf · view
Peirce, Scott · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Pershey, Eric · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Phadke, Vinanti · Hardware Triage Tool: Enhancements and Extensions · pdf · view
Pintarelli, Simon · Modern Software Deployment on a Multi-Tenant Cray-EX System · pdf · view
Pizzi, Giovanni · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Podstata, Martin · Co-design, deployment and operation of a Modular Data Centre (MDC) with air and direct-liquid cooled supercomputers · pdf, pdf · view
Poole, Stephen · Quantifying Message Aggregation Optimisations for Energy Savings in PGAS Models · pdf, pdf · view
Poole, Wendy · Quantifying Message Aggregation Optimisations for Energy Savings in PGAS Models · pdf, pdf · view
Posada Correa, Edwin F. · Deploying and Tracking Software with NCCS Software Provisioning · pdf, pdf · view
Pozsa, Krisztian · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Prakash, Pavana · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Price, Daniel · Porting Radio Astronomy Correlation to Setonix, a HPE Cray EX system powered by AMD GPUs · pdf, pdf · view

R
Rahman, Md · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Rajak, Rishi Kesh Kumar · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Rajesh, Bhuvan Meda · Hardware Triage Tool: Enhancements and Extensions · pdf · view
Ravichandrasekaran, Naveen Namashivayam · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Ravishankar, Sriram · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Rentschler, Asa · Deploying and Tracking Software with NCCS Software Provisioning · pdf, pdf · view
Rickett, Christopher · Search and Query Framework for Workflows with HPC and AI Models · pdf, pdf · view
Rigazzi, Alessandro · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Robinson, Tim · Sharing is Caring: Tackling Node-Sharing Challenges at CUG Sites · pdf · view
Hands on with uenv and CPE in a container with Grace Hopper on Alps · pdf · view
Roe, Dean · Detecting operating system noise with detect-detour · pdf, pdf · view
BoF on Transforming Hybrid Workflows: The Role of HPE Cray Supercomputing User Services Software in Bridging HPC and AI · pdf · view
Ronaghan, Elliot Joseph · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Ross, Robert · Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Roweth, Duncan · The HPE Slingshot 400 Expedition · pdf, pdf · view
Math in Your Network: Slingshot Hardware Accelerated Reductions · pdf · view
Slingshot Host Software Ethernet Tuning · pdf · view

S
Saini, Martin Shivraj · New Member Site: Introducing GeoSphere · pdf · view
Sakarda, Premanand · Scaling MPI Applications on Aurora · pdf, pdf · view
Samar, Sakib · E2000 Performance From Microbenchmarks to Applications · pdf, pdf · view
Sammuli, Brian · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Sarmiento, Rafael · FirecREST v2: Lessons Learned from Redesigning an API for Scalable HPC Resource Access · pdf, pdf · view
Saxena, Rishabh · Evaluating AMD MI300A APU: Performance Insights on LLM Training via Knowledge Distillation · pdf, pdf · view
Scantlin, Aaron · Security BoF · pdf · view
Schmit, Michael · Proactive Health Monitoring and Maintenance of High-Speed Slingshot Fabrics in HPC Environments · pdf, pdf · view
Schulthess, Thomas · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Alps, a versatile research infrastructure · pdf, pdf · view
Redefining Weather Forecasting Systems: The Transition to ICON and Alps · pdf, pdf · view
Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
A journey to provide GH200 · pdf · view
Schuppli, Stefano · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Schwaller, Ben · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Selwood, Paul · Rev Up Compute Node Reboots: 2x to 5x Faster · pdf, pdf · view
Shand, Rudy · Linaro: Unlocking Exascale Debugging and Performance Engineering with Linaro Forge · pdf · view
Shao, Andrew · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Sharma, Rishabh · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Shome, Porno · Global Distributed Client-side Cache for DAOS · pdf, pdf · view
Shyamshankar, Panchapakesan Chitra · CPE in a Container · view
Siegmann, Eva · Supernovae in HPC: Benchmarking FLASH Across Advanced Computing Clusters · pdf · view
Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Sikich, Danielle · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Simakov, Nikolay A. · Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Sinha, Urjoshi · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Snyder, Clark · Detecting operating system noise with detect-detour · pdf, pdf · view
Snyder, Shane · Expanding Community Access to Real-World HPC Application I/O Characterization Data Using Darshan · pdf, pdf · view
Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
Sokolowski, Marcin · Porting Radio Astronomy Correlation to Setonix, a HPE Cray EX system powered by AMD GPUs · pdf, pdf · view
Sopena Ballesteros, Manuel · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view
Soumagne, Jerome · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Towards Empirical Roofline Modeling of Distributed Data Services: Mapping the Boundaries of RPC Throughput · pdf, pdf · view
DAOS - New Horizons for High Performance Storage · pdf · view
Stradling, Alden · Pragmatic Security Audits: Fortifying HPC Environments at a Consumable Pace · pdf, pdf · view
Strout, Michelle Mills · MARBLChapel: Fortran-Chapel Interoperability in an Ocean Simulation · pdf · view
Sukumar, Sreenivas · Search and Query Framework for Workflows with HPC and AI Models · pdf, pdf · view
Sun, Chun · Python Management · view
CPE Futures · view
Surjadidjaja, Vanessa · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Szpindler, Maciej · Using Different MPI Implementations on HPE Cray EX Supercomputers for Native and Containerized Applications Execution · pdf · view

T
Tacchella, Davide · Building non-standard images for CSM systems · pdf, pdf · view
Separating concerns: Decoupling the Slingshot Fabric Manager from Cray System Management · pdf · view
Taheri, Ebad · Causality inference for Digital Twins in GPU Data Centers and Smart Grids. · pdf, pdf · view
Timalsina, Madan · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Tissieres, Jerome · AlpsB – a Geographically Distributed Infrastructure to Facilitate Large-Scale Training of Weather and Climate AI Models · pdf · view
Toonen, Brian · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Treger, Jesse · Introduction To HPE Slingshot NIC Libfabric Environment Variables · pdf · view
Tripathy, Aalap · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view
Tyler, Nicholas · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view

U
Upton, Alex · AlpsB – a Geographically Distributed Infrastructure to Facilitate Large-Scale Training of Weather and Climate AI Models · pdf · view
Upton, Peter · Addressing Resource Constraints on Aurora with Admin Access Nodes · pdf, pdf · view
Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Urwin, Ron · HPE Cray EX225a (MI300a) Blade Power Capping and HBM Page Retirement · pdf · view

V
Vanderwende, Brian · CPE Testing · view
VandeVondele, Joost · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Vasudevan, Raghul · Monitoring HPE Cray HPC systems · pdf, gz, gz · view
Viessmann, Hans-Nikolai · Infrastructure as a Service with Strong Tenant Separation on a Supercomputer · pdf, pdf · view

W
Waldron, Doug · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Walker, Chris · E2000 Performance From Microbenchmarks to Applications · pdf, pdf · view
Walker, Dennis · From Weeks to Hours: Harnessing Configuration Management and Deployment Pipelines · pdf, pdf · view
Rev Up Compute Node Reboots: 2x to 5x Faster · pdf, pdf · view
Dynamic Network Perimeterization: Isolating Tenant Workloads With VLANs, VNIs, & ACLs · pdf, pdf · view
Pragmatic Security Audits: Fortifying HPC Environments at a Consumable Pace · pdf, pdf · view
Building non-standard images for CSM systems · pdf, pdf · view
CSM updates, iSCSI boot content projection, and other CSM topics · pdf · view
Walton, Sara · LDMS New Features for Deployment in Advanced Environments and Feedback for Operations · pdf · view
Warner, Andy · Building non-standard images for CSM systems · pdf, pdf · view
Wayth, Randal · Porting Radio Astronomy Correlation to Setonix, a HPE Cray EX system powered by AMD GPUs · pdf, pdf · view
Wazirzada, Isa · Building non-standard images for CSM systems · pdf, pdf · view
Rethinking Interactive HPC Resource Access: Enhancing Security and Flexibility · pdf, pdf · view
Separating concerns: Decoupling the Slingshot Fabric Manager from Cray System Management · pdf · view
Hardware Triage Tool: Enhancements and Extensions · pdf · view
Welch, Aaron · Quantifying Message Aggregation Optimisations for Energy Savings in PGAS Models · pdf, pdf · view
Welch, Steve · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
West, Karlon · Search and Query Framework for Workflows with HPC and AI Models · pdf, pdf · view
White, Joseph P. · Utilization and Performance Monitoring of Ookami, an ARM Fujitsu A64FX Testbed Cluster with XDMoD · pdf · view
Wichmann, Nathan · Designing GPU-aware OpenSHMEM for HPE Cray EX and XD Systems · pdf, pdf · view
Wickberg, Tim · Sharing is Caring: Tackling Node-Sharing Challenges at CUG Sites · pdf · view
Slinky: The Missing Link Between Slurm and Kubernetes · pdf · view
Wilde, Torsten · EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs · pdf · view
Wilkinson, Callum · Accelerating LArTPC Simulations: Enhancing larnd-sim with GPU Optimization Techniques · pdf · view
Witlox, Pim · Evolving HPC services to enable ML workloads on HPE Cray EX · pdf, pdf · view
Woodacre, Michael · Scaling MPI Applications on Aurora · pdf, pdf · view

X
Xu, Cong · Framework for tracking metadata, lineage and model provenance in hybrid simulation-AI HPC exascale workflows · pdf, pdf · view

Y
Yue, Anna · EVeREST: An Effective and Versatile Runtime Energy Saving Tool for GPUs · pdf · view

Z
Zambrino, Fabio · Experimenting with Security Compliance Checking using ReFrame · pdf, pdf · view
Zandstein, Becca · NVIDIA HPC Software - Expanding HPC with Python & AI · pdf · view
Zhan, Xin · A Full Stack Framework for High Performance Quantum-Classical Computing · pdf, pdf · view
Zhang, Micheal · Harvesting, Storing and Processing Data from our HPCM Systems · pdf, pdf · view
Ziemba, Ian · Enhancing RPC on Slingshot for Aurora’s DAOS Storage System · pdf, pdf · view
Introduction To HPE Slingshot NIC Libfabric Environment Variables · pdf · view
Slingshot Host Software Ethernet Tuning · pdf · view
Zingale, Michael · Keynote: What I’ve Learned About Supercomputing from Blowing Up Stars, Michael Zingale (Stony Brook University) · view

Created 2025-5-15 2:51