====== Newsletter ======

== Announcements of HPC events for users in Frankfurt ==

[[https://www.hkhlr.de/|The HKHLR]] regularly organizes High Performance Computing (HPC) events aimed at researchers and students in the region of Hessen. If you are interested in receiving information about HPC events, such as courses, tutorials, or workshops, please [[https://dlist.server.uni-frankfurt.de/mailman/listinfo/hpc-frankfurt-events|subscribe to the HPC-frankfurt-events newsletter]].

====== Events 2020 ======

==== VI-HPS Tuning Workshop: December 2020 ====

**Monday-Friday, Dec 07th - Dec 11th 2020**

The workshop will be held in English and will run from 09:00 to no later than 18:00 each day, with breaks. All times are given in CET (UTC+1).

**Location:** Virtual, using the Zoom platform.

== Goal ==

This workshop, organised by [[https://www.vi-hps.org/training/tws/tw37.html|VI-HPS]] and CSC/HKHLR, will:

  * give an overview of the VI-HPS programming tools suite
  * explain the functionality of individual tools and how to use them effectively
  * offer hands-on experience and expert assistance in using the tools

On completion, participants should be familiar with common performance analysis and diagnosis techniques and how they can be employed in practice (on a range of HPC systems). Those who prepared their own application test cases will have been coached in the tuning of their measurement and analysis, and provided with optimization suggestions.

== Program ==

Presentations and hands-on sessions are planned on the following topics:

Setting up, welcome and introduction

++++ TAU performance system |
**TAU**

TAU is an integrated parallel performance framework for the instrumentation, measurement, analysis, and visualization of large-scale parallel computer systems and applications. It provides a flexible, robust, and portable tools platform that supports profiling and tracing for parallel performance evaluation across all leading programming models and environments.
**Programming models** C, %%C++%%, Fortran, Java, Python, MPI, OpenMP

**License** Open source: New BSD

**Organizations** University of Oregon

**Homepage** http://tau.uoregon.edu
++++

++++ MAQAO performance analysis & optimisation |
**MAQAO**

MAQAO (Modular Assembly Quality Analyzer and Optimizer) is a performance analysis and optimisation framework operating at the binary level, with a focus on core performance. Its main goal is to guide application developers through the optimization process by means of synthetic reports and hints. The tool mixes dynamic and static analyses, based on its ability to reconstruct high-level structures such as functions and loops from an application binary. Since MAQAO operates at the binary level, it does not require recompiling the application to perform analyses. MAQAO assesses the code quality of the most time-consuming loops and provides a best-case estimate of the performance that can be reached, along with hints on how to achieve it in terms of source code transformations, compiler flags, pragmas, etc.

**Programming models** Agnostic to programming models (works at the binary level). Mostly useful for single-node performance, but also works with Pthreads/OpenMP and MPI.

**License** Open source: LGPLv3 (planned)

**Organizations** LRC ITACA / Université de Versailles St-Quentin-en-Yvelines

**Homepage** http://maqao.org/
++++

++++ Score-P instrumentation and measurement |
**Score-P**

The Score-P measurement infrastructure is a highly scalable and easy-to-use tool suite for profiling, event trace recording, and online analysis of HPC applications. Score-P offers the user maximum convenience by supporting a number of analysis tools. Currently, it works with Periscope, Scalasca, Vampir, and TAU, and is open to other tools. Score-P comes together with the Open Trace Format Version 2, the CUBE4 profiling format, and the OPARI2 instrumenter.
**Programming models** Serial, OpenMP, MPI, and hybrid (MPI+OpenMP)

**License** Open source: BSD

**Organizations** SILC Partners

**Homepage** http://www.score-p.org
++++

++++ Scalasca automated trace analysis |
**Scalasca**

Scalasca is an open-source toolset that can be used to analyze the performance behavior of parallel applications and to identify opportunities for optimization. It has been specifically designed for use on large-scale systems, including IBM Blue Gene and Cray XT, but is also well suited for small- and medium-scale HPC platforms. Scalasca integrates runtime summaries with in-depth studies of concurrent behavior via event tracing. A distinctive feature is its ability to identify wait states that occur, for example, as a result of unevenly distributed workloads.

**Programming models** MPI and OpenMP

**License** Open source: New BSD

**Organizations** Forschungszentrum Jülich and German Research School for Simulation Sciences

**Homepage** http://www.scalasca.org
++++

++++ VAMPIR interactive trace analysis |
**VAMPIR**

The VAMPIR software tool provides an easy-to-use framework that enables developers to quickly display and analyze arbitrary program behavior at any level of detail. The tool suite implements optimized event analysis algorithms and customizable displays that enable fast and interactive rendering of very complex performance monitoring data.

**Programming models** MPI, OpenMP, Pthreads, CUDA, Java Threads

**License** Commercial

**Organizations** Technische Universität Dresden

**Homepage** http://www.vampir.eu
++++

++++ LIKWID performance tool suite |
**LIKWID**

LIKWID is a tool suite for performance-oriented programmers, offering command line tools for system topology, CPU/task affinity, hardware performance monitoring, micro-benchmarking, and more. Besides the command line tools, almost all functionality is also provided as a C library so that it can be integrated into other tools.
LIKWID is widely used internationally, by programmers for code modernization and by computing centers for performance engineering and system monitoring. Its easy-to-use interface with validated event sets and derived metrics offers valuable input for users.

**Programming models** C/%%C++%%, Fortran, Lua, Python (pylikwid), Java (http://tiny.cc/p7pdez)

**License** Open source: GPLv3

**Organizations** Friedrich-Alexander-Universität Erlangen-Nürnberg

**Homepage** https://hpc.fau.de/research/tools/likwid/ https://github.com/RRZE-HPC/likwid
++++

++++ PAPI hardware performance counters |
**PAPI**

PAPI is a cross-platform interface to the hardware performance counters available on most modern microprocessors. In addition to defining a standard set of routines for configuring and accessing the counters, PAPI defines a common set of performance events considered most useful for application performance tuning. These events include operation and cycle counts, cache and memory access events, and branch behavior events. More recently, PAPI has been extended to PAPI-C (component PAPI), which provides simultaneous access to multiple counter domains, including the previous on-processor counters as well as off-processor counters and sensors, such as network counters and temperature sensors.

**Programming models** Fortran and C calling interfaces

**License** Open source: New BSD

**Organizations** University of Tennessee

**Homepage** http://icl.cs.utk.edu/papi/
++++

++++ Extra-P automated performance modeling |
**Extra-P**

Extra-P is an automatic performance modeling tool that supports the user in identifying performance bugs. A performance bug is a part of the program whose behavior is unintentionally poor, that is, much worse than expected, with respect to an increase in processor count. Extra-P uses measurements of different performance metrics as input to model the performance of code regions as a function of the number of processes (or another parameter).
All it takes to search for scalability issues, even in full-blown codes, is to run a manageable number of small-scale performance experiments, launch Extra-P, and compare the extrapolated performance of the worst instances to expectations. Extra-P generates not only a list of potential bugs but also human-readable models for all available performance metrics, such as floating-point operation counts or bytes sent by MPI calls, which can be further analyzed and compared to identify the root causes of performance issues.

**Programming models** MPI and OpenMP

**License** Open source: New BSD

**Organizations** TU Darmstadt, Lawrence Livermore National Laboratory, and Forschungszentrum Jülich

**Homepage** http://www.scalasca.org/software/extra-p
++++

... and potentially others to be added

A brief overview of the capabilities of these and associated tools is provided in the [[https://www.vi-hps.org/cms/upload/material/general/ToolsGuide.pdf|VI-HPS Tools Guide]].

== Agenda ==

**Day 1** - Monday, December 7th

9:00 - 10:30 : //Welcome and Introduction//
  * Welcome [Anja Gerbes, CSC]
  * Introduction to Zoom
  * The Goethe-HLR system [Anja Gerbes, CSC]
  * Introduction [Cedric Valensi, UVSQ]
  * Introduction to VI-HPS & overview of tools
  * Introduction to parallel performance engineering
  * Building and running NPB/BT-MZ on Goethe-HLR []
10:30 - 11:00 Break
11:00 - 12:30 : //MAQAO//
  * //MAQAO// performance analysis tools [Jäsper Ibnamar & Emmanuel Oseret, UVSQ]
  * //MAQAO// hands-on exercises (//MAQAO// quick reference)
12:30 - 14:00 Lunch Break
14:00 - 15:30 : //MAQAO//
  * Hands-on coaching to apply //MAQAO// to analyze participants' own code(s)
15:30 - 16:00 Break
16:00 - 17:30 : //TAU//
  * //TAU// performance system [Sameer Shende, UOregon]
  * //TAU// hands-on exercises
17:30 - 18:00 : Schedule for remainder of workshop

**Day 2** - Tuesday, December 8th

9:00 - 10:30 : //PAPI//
  * //PAPI// hardware performance counters [Frank Winkler, UTK]
  * //PAPI// hands-on exercises
10:30 - 11:00 Break
11:00 - 12:30 : //LIKWID//
  * //LIKWID// performance tool suite [Thomas Gruber, FAU]
  * //LIKWID// hands-on exercises
12:30 - 14:00 Lunch Break
14:00 - 15:30 : //LIKWID//
  * Hands-on coaching to apply //LIKWID// to analyze participants' own code(s)
15:30 - 16:00 Break
16:00 - 17:30 : //TAU//
  * Hands-on coaching to apply //TAU// to analyze participants' own code(s)
17:30 - 18:00 : Schedule for remainder of workshop

**Day 3** - Wednesday, December 9th

9:00 - 10:30 : //Score-P/CUBE//
  * //Score-P// instrumentation & measurement toolset [Brian Wylie, JSC]
  * //Score-P// analysis scoring & measurement filtering
  * //Score-P// specialized instrumentation and measurement
  * //Score-P// hands-on exercises
  * CUBE profile explorer hands-on exercises [Anke Visser, JSC]
10:30 - 11:00 Break
11:00 - 12:30 : //Score-P/CUBE//
  * Hands-on coaching to apply //Score-P/CUBE// to analyze participants' own code(s)
12:30 - 14:00 Lunch Break
14:00 - 15:30 : //Score-P/CUBE//
  * Hands-on coaching to apply //Score-P/CUBE// to analyze participants' own code(s)
15:30 - 16:00 Break
16:00 - 17:30 : //TAU//
  * Hands-on coaching to apply //TAU// to analyze participants' own code(s)
17:30 - 18:00 : Review of day and schedule for remainder of workshop

**Day 4** - Thursday, December 10th

9:00 - 10:30 : //Scalasca/Vampir//
  * //Scalasca// automated trace analysis [Markus Geimer, JSC]
  * //Scalasca// hands-on exercises
  * //Vampir// interactive trace analysis [William Williams, TU Dresden]
  * //Vampir// hands-on exercises
10:30 - 11:00 Break
11:00 - 12:30 : //Scalasca/Vampir//
  * Hands-on coaching to apply //Scalasca/Vampir// to analyze participants' own code(s)
12:30 - 14:00 Lunch Break
14:00 - 15:30 : //Scalasca/Vampir//
  * Hands-on coaching to apply //Scalasca/Vampir// to analyze participants' own code(s)
15:30 - 16:00 Break
16:00 - 17:30 : //all tools//
  * Hands-on coaching to apply tools to analyze participants' own code(s)
17:30 - 18:00 : Review of day and schedule for remainder of workshop

**Day 5** - Friday, December 11th

9:00 - 10:30 : //Extra-P//
  * //Extra-P// automated performance modeling [Frank Ritter, TU Darmstadt]
  * //Extra-P// hands-on exercises
10:30 - 11:00 Break
11:00 - 12:30 : //all tools//
  * Hands-on coaching to apply tools to analyze participants' own code(s)
12:30 - 14:00 Lunch Break
14:00 - 16:00 : //all tools//
  * Hands-on coaching to apply tools to analyze participants' own code(s)

== Tune your own code ==

Participants from academia
  * are encouraged to prepare their own MPI, OpenMP, and hybrid MPI+OpenMP parallel application codes for analysis.
  * who have a piece of their own code that they would like to adapt and speed up using VI-HPS tools can send it to the organizers, to be included in the course as a case study. Please make sure that it already runs on the Goethe-HLR cluster. If you are interested in tuning your code, please send us a short code description no later than Monday, November 23rd ().
  * who want their code to be analyzed have to install their code under the workshop accounts on the Goethe-HLR. Installation will be possible starting on Monday, November 23rd.

== Registration ==

To register, please use [[https://training.csc.uni-frankfurt.de/e/vihps|this webpage]]. For more information, please write to .

Registrations from Hesse are given preference until 01.11.2020. During this time, applicants from other regions may be placed on a waiting list. After 02.11.2020, all applications will be treated equally.

----

== Contact ==

Anja Gerbes, +49 (0)69 798-47356, gerbes[at]csc.uni-frankfurt.de

This course is organized by CSC, Goethe University Frankfurt, in cooperation with VI-HPS, HKHLR & UVSQ.

[[https://csc.uni-frankfurt.de/|{{public:csc_logo.png?140}}]] [[https://www.vi-hps.org/|{{public:logo_vihps-blue.png?140}}]] [[https://www.hkhlr.de/en|{{public:hkhlr-logo-cmyk.jpg?140}}]] [[https://www.uvsq.fr/|{{public:logo-uvsq-2020-rvb.jpg?110}}]]