Bit errors in memory and caches occur eventually and will become more common in
future systems. Through IPC messages, even single-bit errors can compromise
Error Detection in IPC Messages through Checksums
In other environments such as networked systems, checksums are successfully used
to detect transmission failures. We adopted this approach and treated bit
errors in IPC messages like transmission failures. Different checksums and
implementations were tested for their error detection rate and for the slowdown
they impose on IPC messaging.
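The idea of guarding a message with a checksum can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the checksums that were tested, so CRC-32 from Python's standard zlib module stands in for them, and the helper names (attach_checksum, verify, flip_bit) as well as the 4-byte trailer layout are hypothetical, not part of any L4 IPC interface.

```python
import zlib

# Assumption: CRC-32 is used purely as an example checksum; the thesis
# may have evaluated different algorithms and message layouts.

def attach_checksum(payload: bytes) -> bytes:
    """Return the payload followed by its 4-byte big-endian CRC-32."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def verify(message: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the payload."""
    if len(message) < 4:
        return False
    payload, stored = message[:-4], message[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored

def flip_bit(message: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit memory error at the given bit position."""
    corrupted = bytearray(message)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)

if __name__ == "__main__":
    message = attach_checksum(b"example IPC payload")
    # Flip every bit in turn (payload and checksum) and count detections.
    detected = sum(
        not verify(flip_bit(message, i)) for i in range(len(message) * 8)
    )
    print(f"detected {detected} of {len(message) * 8} single-bit errors")
```

The receiver would call verify before acting on a message and reject it on a mismatch. The cost of computing the checksum on both sides is what shows up as the IPC slowdown mentioned above.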
Solving problems that require a large compute capacity is usually done by
means of clusters consisting of many independent nodes. Writing programs for
clusters is more complicated than writing parallel applications for one node
due to the lack of shared memory. To remove this development barrier,
Distributed Shared Memory (DSM) systems have been developed in the past, but
they turned out to be quite inefficient in practice. Nowadays, new paradigms in
parallel programming such as asynchronous lambdas are emerging. A popular
example of this is Apple's Grand Central Dispatch.
Implementation of Distributed Shared Memory for Modern Paradigms of Parallel Programming in L4
To test the feasibility of DSM using relaxed consistency models, a DSM
infrastructure is being built on top of the Fiasco.OC / L4Re environment.
Different strategies to transfer memory content between the participating nodes