Presentations and hands-on sessions are planned on the following topics:
Setting up, welcome and introduction
TAU performance system
TAU
TAU is an integrated parallel performance framework for the instrumentation, measurement, analysis, and visualization of large-scale parallel computer systems and applications. It provides a flexible, robust, and portable tools platform that supports profiling and tracing for parallel performance evaluation across all leading programming models and environments.
Programming models
C, C++, Fortran, Java, Python, MPI, OpenMP
License
Open source: New BSD
Organizations
University of Oregon
Homepage
http://tau.uoregon.edu
MAQAO performance analysis & optimisation
MAQAO
MAQAO (Modular Assembly Quality Analyzer and Optimizer) is a performance analysis and optimisation framework operating at binary level, with a focus on core performance. Its main goal is to guide application developers along the optimization process through synthetic reports and hints. The tool mixes both dynamic and static analyses based on its ability to reconstruct high level structures such as functions and loops from an application binary. Since MAQAO operates at binary level, it does not require recompiling the application to perform analyses.
MAQAO assesses the code quality of the most time-consuming loops and provides a best-case estimate of the performance that can be reached, along with hints on how to achieve it through source code transformations, compiler flags, pragmas, etc.
Programming models
Agnostic to programming models (works at the binary level). Mostly useful for single-node performance, but also works with PThreads/OpenMP and MPI.
License
Open source: LGPLv3 (planned)
Organizations
LRC ITACA / Université de Versailles St-Quentin-en-Yvelines
Homepage
http://maqao.org/
Score-P instrumentation and measurement
Score-P
The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event trace recording, and online analysis of HPC applications. Score-P offers the user a maximum of convenience by supporting a number of analysis tools. Currently, it works with Periscope, Scalasca, Vampir, and TAU and is open to other tools. Score-P comes together with the new Open Trace Format Version 2, the CUBE4 profiling format and the Opari2 instrumenter.
Programming models
Serial, OpenMP, MPI, and hybrid (MPI+OpenMP)
License
Open source: BSD
Organizations
SILC Partners
Homepage
http://www.score-p.org
Scalasca automated trace analysis
Scalasca
Scalasca is an open-source toolset that can be used to analyze the performance behavior of parallel applications and to identify opportunities for optimization. It has been specifically designed for use on large-scale systems including IBM Blue Gene and Cray XT, but is also well-suited for small- and medium-scale HPC platforms. Scalasca integrates runtime summaries with in-depth studies of concurrent behavior via event tracing. A distinctive feature is the ability to identify wait states that occur, for example, as a result of unevenly distributed workloads.
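To make the notion of a wait state concrete, here is a minimal sketch of how unevenly distributed work turns into waiting time at a synchronization point. It is a toy model, not Scalasca's analysis: Scalasca derives wait states from recorded event traces, whereas this example assumes we already know how long each rank computed before a barrier. The function name and the timings are hypothetical.

```python
def barrier_wait_times(compute_times):
    """Return the time each rank idles at a barrier.

    Toy model: every rank reaches the barrier after its own compute
    time, and the barrier releases when the slowest rank arrives.
    """
    latest_arrival = max(compute_times)
    return [latest_arrival - t for t in compute_times]


# Hypothetical per-rank compute times in seconds (rank 2 is overloaded).
compute_times = [4.0, 4.5, 9.0, 4.25]

waits = barrier_wait_times(compute_times)
total_wait = sum(waits)

for rank, wait in enumerate(waits):
    print(f"rank {rank}: waits {wait:.2f} s")
print(f"total wait time: {total_wait:.2f} s")
```

In this model the overloaded rank never waits, while every other rank loses the difference between its own compute time and that of the slowest rank.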
Programming models
MPI and OpenMP
License
Open source: New BSD
Organizations
Forschungszentrum Jülich and German Research School for Simulation Sciences
Homepage
http://www.scalasca.org
VAMPIR interactive trace analysis
VAMPIR
The VAMPIR software tool provides an easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.
Programming models
MPI, OpenMP, Pthreads, CUDA, Java Threads
License
Commercial
Organizations
Technische Universität Dresden
Homepage
http://www.vampir.eu
LIKWID performance tool suite
LIKWID
LIKWID is a tool suite for performance-oriented programmers offering command line tools for system topology, CPU/task affinity, hardware performance monitoring, micro-benchmarking and more. Besides the command line tools, almost all functionality is provided as a C library that can be integrated into other tools. LIKWID is widely used internationally, by programmers for code modernization and by computing centers for performance engineering and system monitoring. The easy-to-use interface with validated event sets plus derived metrics offers valuable input for users.
Programming Models
C/C++, Fortran, Lua, Python (pylikwid), Java (http://tiny.cc/p7pdez)
License
Open source: GPLv3
Organizations
Friedrich-Alexander-Universität Erlangen-Nürnberg
Homepage
https://hpc.fau.de/research/tools/likwid/
https://github.com/RRZE-HPC/likwid
PAPI hardware performance counters
PAPI
PAPI is a cross-platform interface to the hardware performance counters available on most modern microprocessors. In addition to defining a standard set of routines for configuring and accessing the counters, PAPI defines a common set of performance events considered most useful for application performance tuning. These events include operation and cycle counts, cache and memory access events, and branch behavior events. Most recently, PAPI has been extended to PAPI-C (component PAPI), which provides simultaneous access to multiple counter domains, including the previous on-processor counters as well as off-processor counters and sensors such as network counters and temperature sensors.
Programming models
Fortran and C calling interfaces
License
Open source: New BSD
Organizations
University of Tennessee
Homepage
http://icl.cs.utk.edu/papi/
Extra-P automated performance modeling
Extra-P
Extra-P is an automatic performance modeling tool that supports the user in the identification of performance bugs. A performance bug is a part of the program whose behavior is unintentionally poor, that is, much worse than expected, with respect to an increase in processor count. Extra-P uses measurements of different performance metrics as an input to define the performance of code regions as a function of the number of processes (or another parameter). All it takes to search for scalability issues even in full-blown codes is to run a manageable number of small-scale performance experiments, launch Extra-P, and compare the extrapolated performance of the worst instances to expectations.
Extra-P generates not only a list of potential bugs but also human-readable models for all available performance metrics, such as floating-point operation counts or bytes sent by MPI calls, which can be further analyzed and compared to identify the root causes of performance issues.
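The workflow described above (a handful of small-scale runs, a fitted model, an extrapolation to larger scale) can be illustrated with a minimal sketch. This is not Extra-P's algorithm or API: it fits only a single power-law term, runtime = c * p^a, by least squares in log-log space, while Extra-P searches a richer family of model functions. The function names and the measurements are hypothetical.

```python
import math


def fit_power_law(process_counts, runtimes):
    """Fit runtime = c * p**a by linear least squares on (log p, log t)."""
    xs = [math.log(p) for p in process_counts]
    ys = [math.log(t) for t in runtimes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                       # exponent: slope in log-log space
    c = math.exp(mean_y - a * mean_x)   # coefficient: exp of the intercept
    return c, a


def predict(c, a, p):
    """Evaluate the fitted model at process count p."""
    return c * p ** a


# Hypothetical small-scale measurements of one code region (seconds).
# They are generated from 0.5 * sqrt(p), so the fit should recover that.
process_counts = [2, 4, 8, 16, 32]
runtimes = [0.5 * math.sqrt(p) for p in process_counts]

c, a = fit_power_law(process_counts, runtimes)
extrapolated = predict(c, a, 1024)

print(f"model: t(p) = {c:.3f} * p^{a:.3f}")
print(f"extrapolated runtime at p=1024: {extrapolated:.2f} s")
```

A region whose fitted exponent is clearly larger than expected (for example, growing with p where constant time was intended) is the kind of candidate the description above calls a performance bug.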
Programming models
MPI and OpenMP
License
Open source: New BSD
Organizations
TU Darmstadt, Lawrence Livermore National Laboratory, and Forschungszentrum Jülich
Homepage
http://www.scalasca.org/software/extra-p
… and potentially others to be added
A brief overview of the capabilities of these and associated tools is provided in the VI-HPS Tools Guide.
Day 1 - Monday, December 7th
9:00 - 10:30 : Welcome and Introduction
Welcome [Anja Gerbes, CSC]
Introduction to Zoom
The Goethe-HLR system [Anja Gerbes, CSC]
Introduction [Cedric Valensi, UVSQ]
Introduction to VI-HPS & overview of tools
Introduction to parallel performance engineering
Building and running NPB/BT-MZ on Goethe-HLR
10:30 – 11:00 Break
11:00 – 12:30 : MAQAO
MAQAO performance analysis tools [Jäsper Ibnamar & Emmanuel Oseret, UVSQ]
MAQAO hands-on exercises (MAQAO quick reference)
12:30 – 14:00 Lunch Break
14:00 - 15:30 : MAQAO
15:30 - 16:00 Break
16:00 - 17:30 : TAU
17:30 - 18:00 : Schedule for remainder of workshop
Day 2 - Tuesday, December 8th
9:00 - 10:30 : PAPI
10:30 – 11:00 Break
11:00 – 12:30 : LIKWID
12:30 – 14:00 Lunch Break
14:00 - 15:30 : LIKWID
15:30 - 16:00 Break
16:00 - 17:30 : TAU
17:30 - 18:00 : Schedule for remainder of workshop
Day 3 - Wednesday, December 9th
9:00 - 10:30 : Score-P/CUBE
Score-P instrumentation & measurement toolset [Brian Wylie, JSC]
Score-P analysis scoring & measurement filtering
Score-P specialized instrumentation and measurement
Score-P hands-on exercises
CUBE profile explorer hands-on exercises [Anke Visser, JSC]
10:30 – 11:00 Break
11:00 – 12:30 : Score-P/CUBE
12:30 – 14:00 Lunch Break
14:00 - 15:30 : Score-P/CUBE
15:30 - 16:00 Break
16:00 - 17:30 : TAU
17:30 - 18:00 : Review of day and schedule for remainder of workshop
Day 4 - Thursday, December 10th
9:00 - 10:30 : Scalasca/Vampir
Scalasca automated trace analysis [Markus Geimer, JSC]
Scalasca hands-on exercises
Vampir interactive trace analysis [William Williams, TU Dresden]
Vampir hands-on exercises
10:30 – 11:00 Break
11:00 – 12:30 : Scalasca/Vampir
12:30 – 14:00 Lunch Break
14:00 - 15:30 : Scalasca/Vampir
15:30 - 16:00 Break
16:00 - 17:30 : all tools
17:30 - 18:00 : Review of day and schedule for remainder of workshop
Day 5 - Friday, December 11th
9:00 - 10:30 : Extra-P
10:30 – 11:00 Break
11:00 – 12:30 : all tools
12:30 – 14:00 Lunch Break
14:00 - 16:00 : all tools