Parsl at SC24
14 Nov 2024 - Sophie Bui
SC24, the international conference for high-performance computing, networking, storage, and analysis, will be held November 17–22 in Atlanta, GA. Several Parsl users and community members will present their work at SC24. See below for information about Parsl-related SC24 activities (all times shown in Eastern Standard Time). Check out the full conference program here.
Monday, November 18
Parsl+CWL: Towards Combining the Python and CWL Ecosystems
Time: 11:21 – 11:44 a.m. | Event Type: Workshop | Location: B302 | More details
Description: Common Workflow Language (CWL) is a widely adopted language for defining and sharing computational workflows. It is designed to be independent of the execution engine on which workflows are executed. Here, we describe our experiences integrating CWL with Parsl, a Python-based parallel programming library designed to manage execution of workflows across diverse computing environments. We propose a new method that converts CWL CommandLineTool definitions into Parsl apps, enabling Parsl scripts to easily import and use tools represented in CWL. We describe a Parsl runner capable of executing a CWL CommandLineTool directly. We also describe a proof-of-concept extension to support inline Python in a CWL workflow definition, enabling seamless use in Parsl's Python ecosystem. We demonstrate benefits of this integration by presenting example CWL CommandLineTool definitions that show how they can be used in Parsl, and comparing performance of executing an image processing workflow using Parsl-CWL and other CWL runners.
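As a rough illustration of the idea of turning a CommandLineTool definition into something a Python script can call, the sketch below builds a shell command from a heavily simplified tool description. It is not the authors' converter: the `TOOL` dictionary, the `make_command_builder` helper, and the `convert` example are hypothetical, and only `baseCommand` plus the `position` and `prefix` fields of `inputBinding` are handled.

```python
import shlex

# Hypothetical, heavily simplified CommandLineTool description, written as a
# Python dict instead of a parsed CWL file. Only baseCommand and the
# position/prefix fields of inputBinding are interpreted.
TOOL = {
    "class": "CommandLineTool",
    "baseCommand": ["convert"],
    "inputs": {
        "src": {"type": "File", "inputBinding": {"position": 1}},
        "size": {"type": "string", "inputBinding": {"position": 2, "prefix": "-resize"}},
        "dst": {"type": "string", "inputBinding": {"position": 3}},
    },
}


def make_command_builder(tool):
    """Return a function that maps keyword arguments to a shell command string."""
    base = tool["baseCommand"]
    if isinstance(base, str):
        base = [base]
    # Collect (position, name, prefix) for every input that has a binding.
    bound = [
        (spec["inputBinding"].get("position", 0), name, spec["inputBinding"].get("prefix"))
        for name, spec in tool["inputs"].items()
        if "inputBinding" in spec
    ]
    bound.sort(key=lambda item: (item[0], item[1]))

    def build(**kwargs):
        parts = list(base)
        for _, name, prefix in bound:
            if name not in kwargs:
                continue  # inputs that are not supplied are skipped
            if prefix is not None:
                parts.append(prefix)
            parts.append(str(kwargs[name]))
        # Quote each part so the result is safe to hand to a shell.
        return " ".join(shlex.quote(p) for p in parts)

    return build


resize = make_command_builder(TOOL)
command = resize(src="in.png", size="50%", dst="out.png")
print(command)
```

A function that returns a command string like this has the shape Parsl's `bash_app` decorator expects, which is one plausible way such a builder could be exposed as a Parsl app.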
Tuesday, November 19
Seesaw: Elastic Scaling for Task-Based Distributed Programs
Time: Noon – 5 p.m. | Event Type: Poster | Location: B302-B305 | Author: Matthew Chung, UC Riverside
Description: Modern batch schedulers in HPC environments enable the shared use of available computational resources via provisioning discrete sets of resources matching user requirements. The lack of elasticity in such scenarios is often addressed using a Pilot job model where multiple separate requests are pooled. In this work, we explore computational elasticity in a popular Python-based workflow system: Parsl. We identify limitations in existing scaling logic and propose a new resource-aware scheduler. We show a significant improvement in the efficiency of compute resources consumed with minimal loss in time to solution.
Making Sense of the Chaos: Best Practices for HPC Software Sustainability Strategies and Metrics
Time: 5:15 – 6:45 p.m. | Event Type: BOF | Location: B210 | More details
Description: This BoF brings together participants to build a community and share information and ideas around exploring the challenges in sustainability of HPC software, focusing on 1) defining and measuring research software sustainability metrics and 2) enhancing research software stewardship practices. Participants will discuss software project community health, engineering practices, and funding stability. The session aims to foster collaboration, share insights, and develop actionable strategies for the long-term viability of HPC software projects.
Workflows Community: Collaborative Pathways for Designing an Integrated Infrastructure for Research Excellence
Time: 5:15 – 6:45 p.m. | Event Type: BOF | Location: B212 | More details
Description: This session will focus on the integration and scalability of AI-driven scientific workflows across facilities. Building on vibrant discussions from our previous SC BoF sessions, this session will address the challenges and opportunities inherent in multi-facility workflows. Key themes will include the coordination among various computing and experimental facilities, near real-time data processing, and enhancing infrastructure resilience. Participants will engage in collaborative brainstorming sessions to identify innovative solutions for data representation and storage challenges. This session aims to foster an environment of collaboration, driving the development of efficient and scalable workflows that support modern scientific research's growing complexity and scale.
Thursday, November 21
Serverless HPC: Challenges, Opportunities, and Future Prospects for Accelerated Cloud Computing
Time: 10:30 a.m. – Noon | Event Type: Panel | Location: B313B-B314 | More details
Description: With the ongoing convergence of high-performance computing and cloud, HPC gains a chance to transform and improve its runtimes and programming models. HPC systems can increase their efficiency and accessibility by adopting elastic cloud paradigms, with the prime example being serverless computing. Serverless abstracts away resource management and introduces fine-grained allocations, allowing system operators to improve their efficiency with elastic containers (CaaS), functions (FaaS), and acceleration (XaaS). However, adopting serverless technologies brings challenges that have not been treated adequately in HPC: security of multi-tenant deployments, portability, and performance isolation in shared resources. In this interactive panel, experts from academia and national labs will debate how serverless can support the rigorous demands of HPC applications. They will share experiences of introducing elastic programming models into the rigid world of high-performance systems and outline predictions for the future: Will serverless schedulers become first-class citizens on HPC systems?
IRI: What Novel Interfaces Will HPC Expose for Cross-Facility Workflows?
Time: 12:15 – 1:15 p.m. | Event Type: BOF | Location: B311 | More details
Description: DOE has recently launched the Integrated Research Infrastructure (IRI) program, which is designed to enable new modes of integrated science across DOE user facilities. Common or unified interfaces are needed for these workflows to seamlessly orchestrate resources across high-performance computing, data, and network providers. These interfaces could be REST APIs for programmable workflows, expansive UIs like JupyterLab, or deep integration with external workflow orchestrators. This BoF will summarize the current efforts of the IRI interfaces working group and individual ASCR user facilities to develop new interfaces. We invite the community to provide feedback to help guide these IRI efforts.
Friday, November 22
Establishing a High-Performance and Productive Ecosystem for Distributed Execution of Python Functions Using Globus Compute
Time: 8:40 – 9:10 a.m. | Event Type: Workshop | Location: B311 | More details
Description: Democratizing access to the research computing ecosystem is critical for accelerating research progress. However, the gap between a high-level workload, such as Python in a Jupyter notebook, and the resources exposed by HPC systems is significant. Users must securely authenticate, manage network connections, deploy and manage software, provision and configure nodes, and manage workload execution. Globus Compute reduces these barriers by providing a managed, fire-and-forget model that enables execution of Python functions across any resource to which a user has access. In this paper we describe enhancements to Globus Compute that further reduce barriers to use of the research computing ecosystem: an asynchronous, future-based executor interface for submitting and monitoring tasks, shell and MPI-based function types, and a multi-user endpoint that can be deployed by administrators and used by authorized users.
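For readers unfamiliar with the future-based pattern mentioned in the abstract, here is a minimal sketch using only Python's standard library. `ThreadPoolExecutor` is a local stand-in and an assumption of this example, not Globus Compute itself; the `simulate` function is hypothetical. The point is the shape of the interaction: `submit()` returns a future immediately, and results are collected as tasks finish.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(n):
    """Stand-in workload: sum of squares of the integers below n."""
    return sum(i * i for i in range(n))


# ThreadPoolExecutor runs tasks locally in threads. A remote executor that
# offers the same submit()/Future interface would be swapped in here.
with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() does not block; it hands back a Future for each task.
    futures = {executor.submit(simulate, n): n for n in (10, 100, 1000)}
    results = {}
    # as_completed() yields futures in the order they finish.
    for future in as_completed(futures):
        results[futures[future]] = future.result()

for n in sorted(results):
    print(n, results[n])
```

Because the calling code only depends on `submit()` and the returned futures, the same script structure can work whether tasks run in local threads or on a remote endpoint.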