Operating Systems · Institute of Systems Architecture · Faculty of Computer Science · TU Dresden

28. 11. 2014

Dynamic Load Balancing of High Performance Computing Applications

Matthias Lieber

TU Dresden

Load balance is crucial for the efficient use of high performance computing (HPC) systems. An unbalanced workload among parallel processes leads to waiting time at synchronization points and thus wastes computing resources. Many HPC simulation applications are based on partial differential equations that are discretized in space and time to allow an approximate numerical solution of the problem (e.g. weather forecasting, crash test simulations, computational fluid dynamics). For parallelization, the problem is partitioned along the spatial dimensions and integrated forward in time, with periodic communication between the processes. The partitioning influences both the load balance among the processes and the communication costs. If the workload varies in space and time, a balanced static partitioning is not possible and dynamic load balancing methods need to be applied. Finding the optimal partitioning is an NP-complete problem that additionally needs to be solved as fast as possible, so heuristics with different trade-offs are used in practice. In this talk, I will give an overview of the dynamic load balancing problem in general and discuss various methods that have been developed, such as graph-based methods and space-filling curves. Additionally, I will talk about my own experience with dynamic load balancing of a complex weather model and the challenges of reaching scalability up to 256Ki cores.
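To make the space-filling curve idea from the abstract concrete, here is a minimal sketch in Python. It is not the method presented in the talk: it assumes a 2D grid of cells with positive per-cell work estimates, uses a Morton (Z-order) curve because it is the simplest to compute (a Hilbert curve would typically give better locality), and cuts the curve into contiguous segments of roughly equal weight. The names `morton_index`, `sfc_partition` and `imbalance` are hypothetical and chosen for illustration only.

```python
from itertools import accumulate


def morton_index(x, y, bits=16):
    """Interleave the bits of x and y to get the Z-order (Morton) index."""
    idx = 0
    for b in range(bits):
        idx |= ((x >> b) & 1) << (2 * b)
        idx |= ((y >> b) & 1) << (2 * b + 1)
    return idx


def sfc_partition(weights, num_parts):
    """Assign each grid cell to a part.

    weights: dict mapping (x, y) -> positive work estimate (assumed input).
    Returns a dict mapping (x, y) -> part number in [0, num_parts).
    """
    # Step 1: order the cells along the space-filling curve (2D -> 1D).
    cells = sorted(weights, key=lambda c: morton_index(*c))
    # Step 2: prefix sums of the work along the curve.
    prefix = list(accumulate(weights[c] for c in cells))
    total = prefix[-1]
    # Step 3: a cell goes to the part whose share of the total work
    # contains the midpoint of the cell's interval on the curve.
    assignment = {}
    for cell, end in zip(cells, prefix):
        mid = end - weights[cell] / 2
        part = min(int(mid * num_parts / total), num_parts - 1)
        assignment[cell] = part
    return assignment


def imbalance(weights, assignment, num_parts):
    """Maximum part load divided by average part load (1.0 is perfect)."""
    loads = [0.0] * num_parts
    for cell, part in assignment.items():
        loads[part] += weights[cell]
    return max(loads) / (sum(loads) / num_parts)
```

Because the part number never decreases along the curve, every part is one contiguous curve segment, which tends to keep neighbouring cells together and limits communication. When the workload changes, calling `sfc_partition` again with updated weights only shifts the cut points along the curve. This 1D heuristic is fast but in general not optimal, in line with the trade-offs mentioned in the abstract.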