Debugging MPI library bug with minimized bug-triggering MPI trace
Naiji Ma
TU Dresden
Master's thesis defense
Library bugs can severely affect the effectiveness, correctness, and
productivity of large-scale computing systems. Debugging them is exhausting
because a large amount of superfluous event data has to be examined manually.
Delta Debugging is a general technique for minimizing failure-inducing changes
that can also be applied to trace simplification. In this talk I will introduce
trace simplification for MPI library bugs. The proposed technique takes an MPI
application and runs it on a buggy MPI library that contains certain types of
library bugs. The application's trace file contains only the executed MPI
function calls and is pruned with Delta Debugging. The result is a minimized
trace file that still triggers the original library bug. The empirical
evaluation shows that the technique can successfully isolate some simple bugs.
I will also briefly discuss how to extend it to other types of bugs.
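
To illustrate the pruning step, here is a minimal sketch of the classic ddmin
loop applied to a list of MPI call names. It is not the implementation from the
thesis. The names ddmin, triggers_bug and toy_oracle are hypothetical, and the
oracle is a stand-in. A real oracle would replay the candidate trace against the
buggy MPI library and report whether the original failure reappears.

```python
from typing import Callable, List, Sequence


def ddmin(trace: Sequence[str],
          triggers_bug: Callable[[List[str]], bool]) -> List[str]:
    """Shrink `trace` to a 1-minimal sub-trace that still triggers the bug."""
    events = list(trace)
    if not triggers_bug(events):
        raise ValueError("the full trace must trigger the bug")
    n = 2  # current granularity: number of chunks the trace is split into
    while len(events) >= 2:
        # Split the trace into n non-empty, roughly equal chunks.
        bounds = [len(events) * i // n for i in range(n + 1)]
        chunks = [events[bounds[i]:bounds[i + 1]] for i in range(n)]
        reduced = False
        # Step 1: does a single chunk already trigger the bug?
        for chunk in chunks:
            if triggers_bug(chunk):
                events, n, reduced = chunk, 2, True
                break
        # Step 2: otherwise, can one chunk be dropped (test the complement)?
        if not reduced:
            for i in range(n):
                rest = [e for j, c in enumerate(chunks) if j != i for e in c]
                if triggers_bug(rest):
                    events, n, reduced = rest, max(n - 1, 2), True
                    break
        # Step 3: otherwise refine the granularity, or stop at single events.
        if not reduced:
            if n >= len(events):
                break
            n = min(2 * n, len(events))
    return events


def toy_oracle(events: List[str]) -> bool:
    """Assumed toy bug: fires when MPI_Isend is later followed by MPI_Cancel."""
    if "MPI_Isend" not in events:
        return False
    first_send = events.index("MPI_Isend")
    return "MPI_Cancel" in events[first_send + 1:]


trace = ["MPI_Init", "MPI_Comm_rank", "MPI_Isend", "MPI_Barrier",
         "MPI_Cancel", "MPI_Wait", "MPI_Finalize"]
minimal = ddmin(trace, toy_oracle)
print(minimal)  # ['MPI_Isend', 'MPI_Cancel']
```

The sketch treats every candidate that does not reproduce the failure as
passing. A replay against a real MPI library would also have to handle
candidates that are not valid MPI programs at all, for example a trace without
MPI_Init, which Delta Debugging classifies as unresolved outcomes.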