Intro to PGAS (UPC and CAF) and Hybrid for Multicore Programming

Alice Koniges, Berkeley Lab, NERSC

Katherine Yelick, UC Berkeley and Berkeley Lab, NERSC

Rolf Rabenseifner, High Performance Computing Center Stuttgart (HLRS)

Reinhold Bader, Leibniz Supercomputing Center Munich (LRZ)

David Eder, Lawrence Livermore National Laboratory (LLNL)

A full-day tutorial at SC10

Abstract

PGAS (Partitioned Global Address Space) languages offer both an alternative to traditional parallelization approaches (MPI and OpenMP) and the possibility of being combined with MPI in a multicore hybrid programming model. In this tutorial we cover PGAS concepts and two commonly used PGAS languages: Coarray Fortran (CAF, as specified in the Fortran standard) and Unified Parallel C (UPC), a parallel extension of standard C. Hands-on exercises illustrating important concepts are interspersed with the lectures. Attendees will be paired in groups of two to accommodate attendees without laptops. Basic PGAS features, syntax for data distribution, intrinsic functions, and synchronization primitives are discussed. Additional topics include parallel programming patterns, future extensions of both CAF and UPC, and hybrid programming. In the hybrid programming section we show how to combine PGAS languages with MPI, and contrast this approach with combining OpenMP with MPI. Details: https://fs.hlrs.de/projects/rabenseifner/publ/SC2010-PGAS.html

 

Detailed Description

Tutorial goals

This tutorial represents a unique collaboration between the Berkeley PGAS/UPC group and experienced hands-on PGAS and hybrid instructors. Participants will be provided with the technical foundations necessary to write library or application codes using CAF or UPC, and an introduction to experimental techniques for combining MPI with PGAS languages.

The tutorial will stress some of the advantages of PGAS programming models including

·         potentially easier programmability, and therefore higher productivity, than with purely MPI-based programming, due to one-sided communication semantics and the integration of the parallel facilities with the type system and other language features

·         optimization potential for the language processor (compiler + runtime system)

·         improved scalability compared to OpenMP at the same level of usage complexity due to better locality control

·         flexibility with respect to architectures – PGAS may be deployed on shared-memory multicore systems as well as (with some care required) on large-scale MPP architectures
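To make the one-sided semantics behind the first advantage concrete, here is a minimal UPC sketch (illustrative only, not taken from the tutorial materials): any thread may read another thread's element of a shared array directly, with no matching receive call on the remote side.

```
#include <upc.h>
#include <stdio.h>

shared int data[THREADS];   /* one element per thread, distributed round-robin */

int main(void) {
    data[MYTHREAD] = MYTHREAD * 10;   /* each thread writes its own element */
    upc_barrier;                      /* make all writes visible before reading */

    /* one-sided read of a neighbor's element -- no send/recv pair needed */
    int right = data[(MYTHREAD + 1) % THREADS];
    printf("Thread %d read %d\n", MYTHREAD, right);
    return 0;
}
```

The compiler and runtime translate the remote array reference into communication, which is exactly the optimization potential the second bullet refers to.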

The tutorial's strategy of providing an integrated view of both CAF and UPC will give the audience a clear picture of the similarities and differences between these two approaches to PGAS programming. Hybrid programming using both OpenMP and PGAS will be illustrated and compared.

Targeted Audiences and Relevance

The PGAS user base is growing and spans a wide range of SC attendees. Application programmers, vendors, and library designers from both C and Fortran backgrounds will attend this tutorial. Multicore architectures are now the norm, from high-end systems to desktops. This tutorial therefore addresses computer professionals with access to a very wide variety of programming platforms.

Content level

30% introductory, 40% intermediate, 30% advanced

Audience prerequisites

Participants should have knowledge of at least one of the Fortran 95 and C programming languages, ideally both, and be comfortable running example programs in a Linux environment. Technical assistants and other personnel will be available to help with the exercises. In addition, a basic knowledge of traditional parallel programming models (MPI and OpenMP) is useful for the more advanced parts of the tutorial. Attendees will be paired in groups of two to accommodate attendees without laptops. If you bring a laptop, a secure shell client should be installed (e.g. OpenSSH or PuTTY) so that you can log in to the parallel compute server provided for the exercises; see also http://www.nersc.gov/nusers/help/access/ssh_apps.php .

General Description

After an introduction to general PGAS concepts and the status of the standardization efforts, the basic syntax for declaring and using shared data is presented, and the requirements and rules for synchronizing accesses to shared data are explained (the PGAS memory model). This is followed by dynamic memory management for shared entities. Advanced synchronization mechanisms such as locks, atomic procedures, and collective procedures are then discussed, along with their usefulness for implementing certain parallel programming patterns. The section on hybrid programming explains the allowances MPI makes for hybrid models and how these can be matched with PGAS-based implementations. Finally, remaining deficiencies in the present language definitions of CAF and UPC are indicated, and an outlook is given on possible future extensions, which are still under discussion among the language developers and should overcome most of these deficiencies.
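As a minimal sketch of the kind of shared-data declaration and synchronization covered in the opening sections (illustrative only, not taken from the course materials), a CAF coarray can be declared with one copy per image, written locally, and then read remotely once a barrier has made the writes safe to observe:

```
program caf_demo
  implicit none
  integer :: me, np
  integer :: val[*]       ! scalar coarray: one copy on every image

  me = this_image()
  np = num_images()

  val = me                ! each image writes only its own copy
  sync all                ! barrier: remote copies are now safe to read

  if (me == 1) then
     ! one-sided remote read from the last image, using coindexing
     print *, 'image 1 sees value on image', np, ':', val[np]
  end if
end program caf_demo
```

The `sync all` statement is CAF's basic image-control construct, corresponding to `upc_barrier` in UPC; the PGAS memory model section explains why the read of `val[np]` is only well-defined after it.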

Description of Exercises for hands-on sessions

The hands-on sessions are interspersed with the presentations such that approximately one hour of presentation is followed by 30 minutes of exercises. The exercises will come from a pool of exercises that have been tested in courses given throughout Europe, supplemented by additional exercises covering the newest material.

Presently planned examples include

·         basic exercises to understand the principles of UPC and CAF

·         parallelization of a matrix-vector multiplication

·         parallelization of a simple 2-dimensional Jacobi code

·         parallelization of a ray tracing code
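To indicate the scope of these exercises, the matrix-vector multiplication could take roughly the following UPC form (a sketch under the assumption of a row-wise distribution of the matrix; the actual exercise materials may differ):

```
#include <upc.h>
#define N 64

shared [N] double A[N][N];   /* block size N: whole rows, dealt round-robin */
shared double x[N], y[N];

void matvec(void) {
    int i, j;
    /* the affinity expression &A[i][0] makes each thread compute
       exactly the rows it owns, avoiding remote reads of A        */
    upc_forall (i = 0; i < N; i++; &A[i][0]) {
        double sum = 0.0;
        for (j = 0; j < N; j++)
            sum += A[i][j] * x[j];   /* x[j] may be a remote read */
        y[i] = sum;
    }
    upc_barrier;   /* all of y is valid past this point */
}
```

The exercise illustrates the data-distribution and work-distribution syntax from the lectures; a CAF variant of the same kernel would use a row-partitioned array with coindexed access to the vector.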

 

Detailed outline of the tutorial