2024 InPEx Workshop

June 17-19, 2024 – Sitges (Spain)

Workshop Presentation

Following the previous first InPEx meeting and the InPEx workshop held in Reims in 2023, the InPEx community will meet in June 2024 in Sitges (Spain). The workshop will gather around 100 experts in the HPC fields from the European Union, Japan and the United States.

Workshop topics

The 2024 InPEx workshop builds on the results of the InPEx working groups and will address important topics of the current development of the Post-Exascale era:

AI for Scientific Computing
Software production and management
Co-design, benchmarks, mini-apps, proxy and evaluation
Digital continuum and data management

Date and place

The workshop will start at noon on Monday 17th of June and will and at noon on Wednesday 19th of June.

The workshop will be held in Sitges (Spain) at the Calipolis Hotel – Av. Sofía 2-6 08870 Sitges.

A detailed agenda of the event can be read below.

All practical details may be found on a dedicated website.

This workshop is hosted by the Barcelona Supercomputing Center-Centro Nacional de Supercomputación (BSC-CNS), with the support of the French program NumPEx (Numeric for Exascale).

Agenda

Timeline

12:00 - 13:00 - Welcome & Lunch (on-site)

13:00-15:30 - Introduction and panel presentations

Context and Objectives of the meeting

Pete BECKMAN (ANL)

Jean-Yves BERTHOU (NumPEx)

Sergi GIRONA (BSC/CNS)

Welcome, context and objectives, agenda

Presentations and exchanges on the state of the art and general roadmaps for Exascale and post-Exascale projects/programs:

USA (40min)

Dilma DA SILVA (NSF)

Ceren SUSUT-BENNETT (DoE)

Balancing priorities in a very fast-moving field – Ceren SUSUT-BENNETT

LCCF and NAIRR – Dilma DA SILVA

Europe (40 min)

Panagiotis TSARCHOPOULOS (European Commission)

Torsten HOEFLER (ETH Zürich)

Mark PARSONS (EPCC)

European Commission – Panagiotis TSARCHOPOULOS

Switzerland’s software, application and infrastructure (post-)Exascale roadmap in 10 minutes – Torsten HOEFLER

UK Large-scale compute strategy – Mark PARSONS

Japan (40 min)

Masaaki KONDO (Riken)

Satoshi MATSUOKA (Riken)

Exascale and Post Exascale Computing in Japan – Satoshi MATSUOKA

Part 1

Part 2

Part 3

15:30 - 16:00 - Coffee break and group photo

16:00-19:00 - Presentation and discussion on the InPEx working groups results and achievements since the Reims (France) InPEx 2023 workshop (45min each)

#1 – Software production and management

Introduction – Bruno RAFFIN

Feasibility study on system software for FugakuNext – Kento SATO

Software production and management – US Update – Todd GAMBLIN

#2 – Co-design, benchmarks/mini-Apps/Proxy and evaluation (HW & SW & Applications)

Co-design/Co-development of community-driven set of Motifs-based software components and proxy/mini-apps streamlined with performance analysis tools and methodologies – Jean-Pierre VILOTTE

Co-Design in ECP – Anshu DUBEY

Scientific Benchmarking: An update from Japan / R-CCS – Jens DOMKE

Software co-design & co-development with industry – Michele WEILAND

#3 – Digital Continuum and Data management

Digital continuum and data management – François BODIN

#4 – AI for Scientific Computing (ML, LLM for science, open models and datasets for AI training)

AI Sessions – Jérôme BOBIN

AI for Science: DOE Has Been Gathering Wide Community Input – Pete BECKMAN

European AI Innovation Package – Cécile HUET

Briefing on AI for Science Plans in RIKEN – Mohamed WAHIB

9:00-10:30 - Breakout session A – State of the art, breakthrough identification and action plan

Subgroup session A:

A1 – AI for Scientific Computing (ML, LLM for science, open models and datasets for AI training)

Introduction – Jérôme BOBIN, Mohamed WAHIB, Pete BECKMAN

AI4Science Benchmark – Franck CAPELLO

Use of local LLMs for research code development – Naohito NAKASATO

Efficient training: memory management and perspectives – Olivier BEAUMONT

A2 – Software production and management

Dependable software deployment with Guix – Ludovic COURTES

Spack overview and update – Todd GAMBLIN

Introduction to EESSI – Alan O’CAIS

10:30 - 11:00 - Coffee break

11:00-12:30 - Breakout session A – State of the art, breakthrough identification and action plan

Subgroup session A:

A3 – Digital Continuum and Data management

Introduction

Digital Continuum initiatives – ETP4HPC – Sai NARASIMHAMURTHY

Edge to Cloud: a view from the trenches – Kate KEAHEY

High-precision, high-performance real-time digital twin for power networks – Francesc LORDAN

SKA: data intensive science across the continuum – Damien GRATADOUR

Edge computing with custom hardware platform in RIKEN – Kentaro SANO and Kento SATO

A4 – Co-design, benchmarks/mini-Apps/Proxy and evaluation (HW & SW & Applications)

SeisSol-Proxy: a mini-app for performance portability of SeisSol – Michael BADER

12:30 - 13:30 - Lunch

13:45-16:00 - Breakout session B – Follow-up sessions

Subgroup session B:

B1 – AI for Scientific Computing (ML, LLM for science, open models and datasets for AI training)

An example of AI4Science: ML4CFD – Guillaume CHARPIAT

AI for climate data generation, assimilation and modeling – Torsten HOEFLER

B2 – Software production and management

CASTIEL 2 – EuroHPC Pilot CI/CD platform – Dennis HOPPE

Software management in Fugaku – Hitoshi MURAI

16:00 - 16:30 - Coffee break

16:30-18:00 - Breakout session B – Follow-up sessions

Subgroup session B:

B3 – Digital Continuum and Data management

Destination Earth – A systems of systems perspective – Thomas GEENEN

Unified data abstractions for scientific workflow composition in the computing continuum – Silvina CAINO-LORES

Harnessing the HPC-Cloud-Edge continuum for urgent science: application drivers and research challenges – Daniel BALOUEK and Manish PARASHAR

Potential AI@Edge science applications – Rajesh SANKARAN

Enabling the Edge-Cloud-HPC data continuum – Gabriel ANTONIU

B4 – Co-design, benchmarks/mini-Apps/Proxy and evaluation (HW & SW & Applications)

NumPEx/Exa-MA mini-apps/proxy-apps – Christophe PRUD’HOMME

19:00-23:00 - Social event & Dinner

9:00-11:00 - Presentation of Subgroups wrap-up and discussion: (30 min each)

Software production and management – Bruno RAFFIN

Co-design, benchmarks, mini-apps, proxy and evaluation – Jean-Pierre VILOTTE

Digital continuum and data management – François BODIN

AI for Scientific Computing – Pete BECKMAN

11:00-11:45 - Exchange and discussion on future international/regional collaborations

Future international/regional collaborations and wrap-up of the workshop – Jean-Yves BERTHOU and Satoshi MATSUOKA

11:45-12:00 - Wrap-up

Preparing the InPEx workshop series: expected outcomes and organisation

12:00 - 13:00 - Lunch

AI for Scientific Computing

Work group co-leaders

Pete BECKMAN (US)
Jérôme BOBIN (EU-FR)
Mohamed WAHIB (JP)

Work group description

The goal of the AI for Scientific Computing group is to advance the field of Scientific Artificial Intelligence by fostering an environment of collaboration and innovation among core AI teams at international level. The group aims to share practical use-cases and benchmarks, including datasets and metrics, that underline the critical aspects of AI in scientific domains.

The group will focus on key strategies for advancing AI in scientific fields. An emphasis will be put in the formation of collaborative teams to develop trustworthy, scientifically validated AI models with clear, reproducible results. The approach of the group includes advocating for common practices and frameworks essential for creating foundation models, alongside defining and sharing use-cases with access to crucial datasets and metrics. The initiatives also extend to creating challenges to foster new community involvement and developing AI models for handling complex scientific data. Efforts will also be done to integrate AI with HPC to improve scalability and extend the scope of AI solutions to include both large-scale systems and smaller, hybrid models.

Co-Design, benchmarks and mini-apps

Work group co-leaders

Anshu DUBEY (US)

Masaaki KONDO (JP)

Jean-Pierre VILOTTE (EU-FR)

Work group description

The upcoming conference session is designed to enhance the exchange of innovative software components and development methods, particularly within computational mathematics and software engineering. It aims to promote international collaboration on software development kits (SDKs) geared towards application-driven purposes, emphasizing collaborative continuous integration and performance analysis. The session will also focus on efficient discretization of Partial Differential Equations (PDEs) and Block-structured Adaptive Mesh Refinement (AMR) at the exascale level.

Our goals include optimizing and expanding the use of existing software tools found in ECP SDKs like CEED and AMReX. We plan to form teams and communication networks to enhance software interoperability and introduce new frameworks. The session will feature the development of standardized proxy-apps and mini-apps within these SDKs to facilitate shared methodologies for performance assessment and the dissemination of insights. Additionally, it will underscore the significance of refining performance portable programming models such as Kokkos to meet application-specific needs and support sustainable software initiatives like the High-performance Software Foundation (HPSF). Ultimately, we seek to advance exascale application methodologies through the use of superior software components and cutting-edge programming models.

Digital Continuum and Data Management

Work group co-leaders
Gabriel ANTONIU (EU-FR)
François BODIN (EU-FR)
Kate KEAHEY (US)
Kentaro SANO (JP)

Work group description

The “Digital Continuum and Data Management” subgroup held its first reflection session in the Reims meeting, 2023. This subgroup focuses on the continuum of technologies and practices related to managing data from collection to processing and storage, including edge computing and cyber-physical systems.

Currently, the Digital Continuum is primarily being driven by large cloud providers, while high-performance computing (HPC) centers are competing with these providers to offer alternative solutions for managing and processing large datasets. One of the challenges facing HPC centers is demonstrating their value proposition compared to cloud providers, especially when it comes to real-time data processing requirements.

Another challenge for the Digital Continuum is that it is a multi-tenant environment where collected data is used for multiple purposes and computing infrastructure is shared. As such, building trust in this continuum of entities is crucial and the related issue of cybersecurity becomes central.

To address these challenges, the subgroup Digital Continuum proposes for the June 2024 INPEX meeting to focus on building a continuum of trusted entities as a proof-of-concept, using multiple infrastructures. The key to achieving this goal is finding good candidate applications that can demonstrate the value and capabilities of the Digital Continuum.

To this end, the subgroup proposes to discuss potential approaches using real use-cases during the meeting.

Software production

Work group co-leaders
Bruno RAFFIN (EU)
Other work group co-leaders will be displayed soon

Work group description

Exascale and Post-Exascale applications are becoming increasingly difficult to build, deploy and maintain under the double pressure of the growing machine complexity and the application needs to combine multiple compute and data processing paradigms (HPC+HPDA+AI). To address these challenges, our community needs to foster HPC dev-ops methodologies and tools to enhance productivity and enforce interoperability, functional and performance portability as well as reproducibility. Topics of disucssion include (but not limited to):

Package Managers and Containers – Empowering users beyond Modules
Shared Repos, Build Farms and Caches for Modules/Packages/Containers – Frictionless Access – Security
CI/CD at Exascale – Testing – Storage and Compute Costs
Automation – HPC dev-ops – Everything-as-Code – Everything-as-a-Service
New software stacks (AI)
Legacy code – Fortran compilers – LLVM
AI-assistants for development, debugging, documentation,…
Catalogs and Metadata
Debugging at Exascale
Componentisation – Interoperability – Standardization – Common software stacks
Software Sustainability