Hello!

Sat Feb 15 06:18:13 CET 2014

On Fri Feb 14, 2014 at 10:39:49 -0800, Blaine Garst wrote:
> At first glance I suspect that my architectural work will improve L4
> IPC times.
> 
> The premise is/was that threads don’t belong to address spaces but
> instead wander with the IPC from one address domain to another
> carrying their arguments in registers.  

You’re talking about a migrating-threads model. Bryan Ford implemented that in Mach in the ’90s [1]. It improved Mach IPC (from a very low baseline), but the result was still not even close to L4’s. (And note that they don’t compare to L4, which is a bit of a benchmarking crime…) Pebble [2] was a from-scratch kernel using a migrating-threads model; it got within 10% of L4 IPC performance, but not better. More recently, Gabe Parmer’s and Rich West’s Composite OS [3] tried the same, and their IPC costs are also higher than L4’s.

> IPC is a trap, adjust mmu,
> proceed.  If the IPC is carrying an IPC end-point, e.g. a capability,
> its a different trap and some bookkeeping must be done, but it can
> also be blindingly fast.  The hard question is and was, well, if you
> don’t have a blocking thread waiting for the IPC, how do you manage
> all these spontaneous “up-calls”.

You’ll find that it ain’t that easy. On the one hand, L4 IPC is designed to be little more than a context switch, so, as Adam says, there isn’t much to shave off. (In fact, about 10–15 years ago, when we were building Mungi on L4, some of my students argued that we should be moving to a kernel with a migrating threads model as this would map more efficiently onto Mungi’s migrating threads model. But when going through the operations that needed to be performed, no-one could show me how it would end up faster than using L4.) 

On the other hand, you have to do considerably more than switch page tables. In particular, while logically the thread continues executing on its old stack, in reality that doesn’t work: the thread switches protection domains, and its old stack is no longer accessible. While logically the whole stack moves between protection domains, in practice this means that you need to provide a new stack on the fly. Obviously, the stack will be cached, so it can be re-used on a repeat call, but it isn’t as easy as only changing the page table.
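To make the bookkeeping concrete, here is a toy Python model (not kernel code) of a per-domain stack cache along the lines described above. The names `StackCache` and `enter_domain` and the stack size are made up for illustration; a real kernel would be mapping pages rather than allocating bytearrays.

```python
# Toy model of a per-(thread, domain) stack cache for migrating-thread IPC.
# All names here are hypothetical; this only illustrates the bookkeeping.

STACK_SIZE = 4096  # assumed stack size, for illustration only


class StackCache:
    def __init__(self):
        self._stacks = {}     # (thread_id, domain_id) -> bytearray
        self.allocations = 0  # counts slow-path allocations

    def enter_domain(self, thread_id, domain_id):
        """Return the stack the thread runs on inside the target domain."""
        key = (thread_id, domain_id)
        stack = self._stacks.get(key)
        if stack is None:
            # Slow path: the first call into this domain needs a fresh stack.
            stack = bytearray(STACK_SIZE)
            self._stacks[key] = stack
            self.allocations += 1
        return stack


cache = StackCache()
first = cache.enter_domain(thread_id=1, domain_id="server")
again = cache.enter_domain(thread_id=1, domain_id="server")  # cache hit
other = cache.enter_domain(thread_id=1, domain_id="driver")  # new domain
```

The repeat call hits the cache, but even then the lookup is extra work on top of switching the page table, and the first call into each domain pays for an allocation.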

And, there is no guarantee (except if you’re in a single-address-space OS like Mungi) that you actually *can* allocate a new stack where you need it: as you’re switching to a new AS, the address range used by the original stack might be in use by something else, which means you’re hosed.
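A rough sketch of that conflict, again as a toy Python model rather than anything a kernel would run: address ranges are half-open `(start, end)` tuples, and `overlaps`, `can_place_stack` and the example addresses are all invented for illustration.

```python
# Toy check: can the caller's stack range be mapped in the target address space?
# Ranges are half-open (start, end) tuples; all names are hypothetical.

def overlaps(a, b):
    """True if half-open ranges a and b share at least one address."""
    return a[0] < b[1] and b[0] < a[1]


def can_place_stack(stack_range, mapped_ranges):
    """The stack fits only if it collides with nothing already mapped."""
    return not any(overlaps(stack_range, r) for r in mapped_ranges)


caller_stack = (0x7000_0000, 0x7000_4000)

# Target AS that happens to have something mapped where the caller's stack lives.
busy_as = [(0x1000_0000, 0x1001_0000), (0x7000_2000, 0x7001_0000)]
# Target AS where that range is free.
free_as = [(0x1000_0000, 0x1001_0000)]

print(can_place_stack(caller_stack, busy_as))
print(can_place_stack(caller_stack, free_as))
```

In a single-address-space system every range is globally unique, so the check can never fail; with separate address spaces the kernel has no control over what the target has already mapped.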

Plus, maintaining a cache of stacks introduces resource-management policies into the kernel, in violation of microkernel principles.

Gernot

[1] Bryan Ford and Jay Lepreau, Evolving Mach 3.0 to a Migrating Thread Model, USENIX Winter, 1994

[2] Eran Gabber, Christopher Small, John Bruno, José Brustoloni and Avi Silberschatz, The Pebble Component-Based Operating System, USENIX Annual Technical Conference, 1999

[3] Gabriel Parmer, Composite: A component-based operating system for predictable and dependable computing, PhD thesis, Boston University, 2009
