Modern molecular dynamics simulations steadily shift the scaling
bottleneck from computation towards communication.
The main bottleneck of such simulations is the long-range interactions,
which cannot be computed from local information only.
Any algorithm computing the equivalent of such long-range pairwise
interactions needs global information by design, which leads to global
communication.
Especially for strong scaling with only a few particles per core, such
Coulomb solvers are already communication/synchronization-bound.
To tackle these issues, we are developing a parallel Fast Multipole
Method (FMM) in modern C++.
To reduce parallelization overhead, e.g. synchronization points or load
imbalance, algorithm-aware strategies have to be applied.
Such measures will improve performance, especially for a tasking
approach with dependency resolution and work scheduling.
Implementing those specific strategies in the scheduler and dependency
resolver of a third-party library could be quite challenging.
Also, relying solely on universal dynamic scheduling implementations
could affect performance unfavorably.
The current C++ language standard (C++11) offers several robust features
for parallel intranode programming.
With the help of those standardized C++ features, we added a tasking
layer to our FMM library.
In this talk, we want to present which C++11 features are best suited
for tasking and how we apply and tailor such schemes for our purposes.
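
To make the idea of a tasking layer built only from standardized C++11
features more concrete, the following is a minimal sketch, not the
library's actual implementation. It assumes a hypothetical Task type with
a dependency counter and a hypothetical TaskPool that uses std::thread,
std::mutex, std::condition_variable and std::atomic; the task names only
loosely mimic FMM passes and are illustrative assumptions.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical task: becomes ready once all of its dependencies finished.
struct Task {
    std::function<void()> work;
    std::atomic<int> pending;       // number of unfinished dependencies
    std::vector<Task*> successors;  // tasks that wait for this one
    Task(std::function<void()> w, int deps)
        : work(std::move(w)), pending(deps) {}
};

// Hypothetical pool: a shared ready queue served by C++11 threads.
class TaskPool {
public:
    // Put a task without open dependencies into the ready queue.
    void submit(Task* t) {
        {
            std::lock_guard<std::mutex> lock(m_);
            ready_.push(t);
        }
        cv_.notify_one();
    }

    // Run until total_tasks tasks have been executed.
    void run(std::size_t num_threads, std::size_t total_tasks) {
        {
            std::lock_guard<std::mutex> lock(m_);
            remaining_ = total_tasks;
        }
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < num_threads; ++i)
            workers.emplace_back([this] { worker(); });
        for (auto& w : workers) w.join();
    }

private:
    void worker() {
        for (;;) {
            Task* t = nullptr;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] {
                    return !ready_.empty() || remaining_ == 0;
                });
                if (ready_.empty()) return;  // every task has finished
                t = ready_.front();
                ready_.pop();
            }
            t->work();
            // Dependency resolution: release successors whose counter hits 0.
            for (Task* s : t->successors)
                if (s->pending.fetch_sub(1) == 1) submit(s);
            bool done = false;
            {
                std::lock_guard<std::mutex> lock(m_);
                done = (--remaining_ == 0);
            }
            if (done) cv_.notify_all();  // wake idle workers so they can exit
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Task*> ready_;
    std::size_t remaining_ = 0;
};

// Illustrative task graph: upward -> translate -> downward, plus an
// independent near-field task that may run concurrently with the chain.
int run_example(std::size_t num_threads) {
    int a = 0, b = 0, c = 0, d = 0;
    Task upward([&] { a = 1; }, 0);
    Task translate([&] { b = a + 1; }, 1);
    Task downward([&] { c = b * 2; }, 1);
    Task near_field([&] { d = 10; }, 0);
    upward.successors.push_back(&translate);
    translate.successors.push_back(&downward);

    TaskPool pool;
    pool.submit(&upward);
    pool.submit(&near_field);
    pool.run(num_threads, 4);
    return c + d;
}
```

Calling run_example(4) executes the dependent chain in order while the
independent task may overlap with it. A general-purpose scheduler would
treat all ready tasks alike; an algorithm-aware variant could, for
example, replace the plain FIFO queue with a priority order derived from
the FMM pass a task belongs to.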
A C++-based MPI-enabled Tasking Framework to Efficiently Parallelize Fast Multipole Methods for Molecular Dynamics