Numerical Linear Algebra for Scientific High Performance Computing (Summer Semester 2022)
- Lecturer: Dr. Hartwig Anzt
- Classes: Lecture (0110650), Problem class (0110660)
- Weekly hours: 2+2
Given the current situation, we will hold this lecture as a hybrid event.
The lecture will take place in 20.30 SR 2.58, the exercise session in 20.30 -01.09.
Both events will be streamed on
https://kit-lecture.zoom.us/j/64458880199
For further information, please contact hartwig.anzt@kit.edu
We will use C/C++ for programming. You are free to use Java or FORTRAN if you prefer (I recommend C/C++, though!).
The lecture slides will be collected in the git repo
https://gitlab.com/nla4hpc/spring-2022/lecture-slides
The homework assignments will be available in the git repo
https://gitlab.com/nla4hpc/spring-2022/exercises.git
(You need to request access to the repository)
Schedule
Event | Day and time | Room
---|---|---
Lecture | Thursday 8:00-9:30 | 20.30 SR 2.58
Problem class | Tuesday 11:30-13:00 | 20.30 -01.09
Lecturers
Role | Name | Contact
---|---|---
Lecturer, problem classes | Dr. Hartwig Anzt | Office hours: Room 3.017, Kollegiengebäude Mathematik (20.30). Email: hartwig.anzt@kit.edu
Content we will cover:
- Fundamentals of Parallel Processing
- Parallel Architectures (SIMD/SIMT/MIMD)
- Roofline Model
- Arithmetic Intensity, Machine Balance
- Amdahl’s Law
- Data dependency/Flow dependency/Resource dependency
- Fork-Join, Bulk-Synchronous Programming Model (BSP), Task-based Model
- BLAS routines
- LAPACK
- LU Decomposition
- Cholesky Decomposition
- QR Decomposition
- Fixed-Point Iterations
- Krylov Methods
- ILU Preconditioning
- Finite Differences
- Domain Decomposition Methods (Additive/Multiplicative Schwarz)
- Shared Memory / Distributed Memory
- Synchronization, Mutex, One-Sided Communication
- MPI, OpenMP, GPU programming (CUDA)
- Precision Formats and Mixed Precision Numerics
Examination
The assessment consists of a project presentation with an oral exam of at least 30 minutes, the evaluation of the written project report, and the grading of the performance in the exercises.
Composition of the module grade:
The overall grade for this examination of another type is formed as follows:
A total of 200 points can be achieved, composed of
• a maximum of 60 points for the exercise sheets (10 per exercise sheet),
• a maximum of 60 points for the final presentation including an oral examination,
• a maximum of 80 points for the project implementation and project report.
To pass the assessment, at least 140 of the 200 points must be achieved.
References
The Sourcebook of Parallel Computing, edited by Jack Dongarra, Ian Foster, Geoffrey Fox, William Gropp, Ken Kennedy, Linda Torczon, Andy White, Morgan Kaufmann Publishers, 2002, 760 pages, ISBN 1-55860-871-0.
Introduction to High-Performance Computing for Scientists and Engineers, Georg Hager, Gerhard Wellein, CRC Press, 2010.
Introduction to High-Performance Scientific Computing, Victor Eijkhout, http://theartofhpc.com/