Numerical Linear Algebra for Scientific High Performance Computing (Winter Semester 2020/21)

In this course, we will cover different aspects of scientific high performance computing:

  • Methods and algorithms for scientific HPC
  • Programming models and languages for HPC (CUDA, HIP, OpenMP, DPC++, MPI)
  • Understanding and optimizing performance on HPC systems
  • Sustainable Software Development

The course will be virtual: https://kit-lecture.zoom.us/j/67845776476

Lecture slides will be collected in the git repo: https://git.scc.kit.edu/nla4hpc_winter20-21/lecture-slides.git

If you want to take this course, please send an email to hartwig.anzt@kit.edu

Lecture: Monday 10:00-11:30 Online
Lecturer: Dr. Hartwig Anzt
Office hours:
Room 3.017 Kollegiengebäude Mathematik (20.30)
Email: hartwig.anzt@kit.edu

Shared Memory/Distributed Memory, Bulk-Synchronous Programming Model, Synchronization, Mutex, One-sided Communication, OpenMP, Fork-Join Model, Private/Public Variables, Map-Reduce, Scheduling, MPI, CUDA (GPU Programming)
GFLOPs, Moore’s Law, Amdahl’s Law, Performance tools, Performance Modeling, Roofline Model
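As a small illustration of two of the performance topics listed above, the following sketch evaluates Amdahl's Law and the Roofline Model bound. It is not course material: the function names and parameters (amdahl_speedup, roofline_gflops, parallel_fraction, flops_per_byte, and so on) are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>

// Amdahl's Law: upper bound on the speedup when a fraction
// `parallel_fraction` of the runtime scales perfectly over `num_workers`
// workers and the remainder stays serial.
// (Hypothetical helper, for illustration only.)
double amdahl_speedup(double parallel_fraction, int num_workers)
{
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(num_workers >= 1);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / num_workers);
}

// Roofline Model: attainable performance in GFLOP/s is limited either by
// the peak compute rate or by memory bandwidth (GB/s) times the arithmetic
// intensity of the kernel (FLOP per byte moved), whichever is smaller.
// (Hypothetical helper, for illustration only.)
double roofline_gflops(double peak_gflops, double bandwidth_gbs,
                       double flops_per_byte)
{
    assert(peak_gflops > 0.0 && bandwidth_gbs > 0.0 && flops_per_byte > 0.0);
    return std::min(peak_gflops, bandwidth_gbs * flops_per_byte);
}
```

For instance, amdahl_speedup(0.9, 10) is 1 / 0.19, or roughly 5.26: a code that is 90% parallel gains only about a factor of five on ten workers. Similarly, a kernel with 0.25 FLOP per byte on a machine with 100 GB/s of bandwidth is bandwidth-bound at 25 GFLOP/s, regardless of the peak compute rate.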

Dense NLA:
BLAS operations, LAPACK, ScaLAPACK, LU/QR/Cholesky decomposition, Singular Value Decomposition (SVD)

Iterative Sparse Linear Algebra:
Fixed-Point Iteration, Krylov subspace methods, ILU preconditioning, Jacobi/block-Jacobi preconditioning, Finite differences (Laplace), Domain Decomposition Methods (Additive/Multiplicative Schwarz)
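To give a flavor of how two of the topics above fit together, here is a minimal sketch of a Jacobi fixed-point iteration applied to the finite-difference discretization of the 1D Laplace problem -u'' = f on (0, 1) with u(0) = u(1) = 0. It is an illustration under stated assumptions, not the course's implementation: the function name jacobi_laplace_1d, its parameters, and the stopping criterion (largest update below a tolerance) are all choices made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Jacobi iteration for (2*u[i] - u[i-1] - u[i+1]) / h^2 = f[i] on n interior
// grid points with zero Dirichlet boundary values and h = 1 / (n + 1).
// `u` holds the initial guess on entry and the approximation on exit.
// Returns the number of sweeps performed; a return value equal to
// `max_sweeps` may mean the tolerance was not reached.
// (Hypothetical helper, for illustration only.)
int jacobi_laplace_1d(const std::vector<double>& f, std::vector<double>& u,
                      double tol, int max_sweeps)
{
    const std::size_t n = f.size();
    assert(n >= 1 && u.size() == n);
    const double h = 1.0 / static_cast<double>(n + 1);
    std::vector<double> u_new(n);

    for (int sweep = 1; sweep <= max_sweeps; ++sweep) {
        double max_change = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            // Neighbors outside the grid are the zero boundary values.
            const double left = (i > 0) ? u[i - 1] : 0.0;
            const double right = (i + 1 < n) ? u[i + 1] : 0.0;
            // Solve row i for u[i] using only values from the old iterate.
            u_new[i] = 0.5 * (h * h * f[i] + left + right);
            max_change = std::max(max_change, std::fabs(u_new[i] - u[i]));
        }
        std::swap(u, u_new);
        if (max_change < tol) {
            return sweep;
        }
    }
    return max_sweeps;
}
```

With f = 1 the exact solution is u(x) = x(1 - x)/2, which the second-order stencil reproduces at the grid points, so the iterate can be checked against it directly. Every entry of u_new depends only on the previous iterate, which is why the inner loop is straightforward to parallelize; the price is slow convergence, since the number of sweeps grows quickly as the grid is refined.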



  • Focus on sustainable software development.
  • Homework will be peer-reviewed by other students; both the homework and the quality of the peer review will be graded.
  • Homework usually includes programming, performance experiments, and a report.
  • Programming is in C/C++ (we will also learn CUDA, OpenMP, and HIP).


  • The project topic should be of general interest to the course.
  • The idea is to read three or four papers from the literature (references will be provided).
  • Implement the problem/application.
  • New ideas and extensions are welcome, as well as optimized implementations of existing algorithms.
  • For example, the class project can be an HPC implementation of another project you are working on (Master’s thesis, seminar, etc.).
  • I will also distribute a list with project ideas.
  • Synthesize the papers and your results in a report (~10-15 pages).
  • Present your report to the class (~30 minutes).


The Sourcebook of Parallel Computing, Edited by Jack Dongarra, Ian Foster, Geoffrey Fox, William Gropp, Ken Kennedy, Linda Torczon, Andy White, 2002, 760 pages, ISBN 1-55860-871-0, Morgan Kaufmann Publishers.

Introduction to High-Performance Scientific Computing, by Victor Eijkhout with Edmond Chow and Robert van de Geijn, February 2010.

Introduction to High Performance Computing for Scientists and Engineers, by Georg Hager, Gerhard Wellein, CRC Press, 2010.