[Jonathan S Shapiro]
Based on your description, I am now reasonably convinced that the L4 operations are individually faster, but that the collective end to end protocol needed to resolve page faults when data spaces are involved may be significantly more complicated in L4 than it is in EROS. I suspect that the aggregate end to end costs in L4 are likely to be *slower* than EROS, but at best they are going to be very similar.
Let's try to summarize what needs to be done for resolving a page fault using the data space model.
1. A page fault is raised and execution traps into the kernel.
2. The kernel translates the page fault into a page fault IPC.
3. The kernel switches to the pager, i.e., the region mapper. The region mapper resides in the same address space, so no address space switch is needed.
4. The region mapper translates the page fault into a region map access.
5. The region mapper sends a request to the corresponding data space manager. Note that the request is sent "deceiving" or "propagating", meaning that the data space manager can reply directly to the faulting thread.
6. The data space manager checks whether the request is valid and translates it into a map operation (this translation can be implemented very efficiently).
7. The data space manager replies with a mapping to the faulting thread.
8. The faulting thread resumes execution.
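To make the sequence concrete, here is a minimal Python sketch of the eight steps above. It is a toy model, not L4 code: the names `Thread`, `RegionMapper`, `DataSpaceManager`, `attach` and `page_fault`, the 4 KiB page size, and the page-aligned regions are all assumptions made for illustration, and ordinary method calls stand in for IPC.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for this toy model


@dataclass
class Thread:
    """Faulting thread; `mappings` stands in for its page table entries."""
    name: str
    mappings: dict = field(default_factory=dict)  # page base -> (data space, offset)
    trace: list = field(default_factory=list)     # log of the protocol steps taken


@dataclass
class DataSpaceManager:
    """Hypothetical data space manager backing one data space of `size` bytes."""
    name: str
    size: int

    def request(self, offset, fault_page, faulter):
        # Step 6: validate the request and translate it into a map operation.
        if not 0 <= offset < self.size:
            raise LookupError(f"offset {offset:#x} outside data space {self.name}")
        faulter.trace.append("6: data space manager validates the request")
        # Step 7: reply with the mapping directly to the faulting thread,
        # bypassing the region mapper (the "propagated" request).
        faulter.mappings[fault_page] = (self.name, offset)
        faulter.trace.append("7: mapping sent to the faulting thread")


@dataclass
class RegionMapper:
    """Hypothetical region mapper living in the faulting address space."""
    regions: list = field(default_factory=list)  # (start, end, manager, ds_offset)

    def attach(self, start, length, manager, ds_offset=0):
        # Regions are assumed to be page aligned.
        self.regions.append((start, start + length, manager, ds_offset))

    def page_fault_ipc(self, addr, faulter):
        fault_page = addr - addr % PAGE_SIZE
        for start, end, manager, ds_offset in self.regions:
            if start <= addr < end:
                # Step 4: translate the fault into a region map access.
                faulter.trace.append("4: region mapper looks up the region")
                # Step 5: forward the request on behalf of the faulter.
                faulter.trace.append("5: request propagated to data space manager")
                manager.request(ds_offset + (fault_page - start), fault_page, faulter)
                return
        raise LookupError(f"no region covers address {addr:#x}")


def page_fault(addr, faulter, region_mapper):
    faulter.trace.append("1: page fault traps into the kernel")
    faulter.trace.append("2: kernel builds a page fault IPC")
    faulter.trace.append("3: kernel switches to the region mapper")
    region_mapper.page_fault_ipc(addr, faulter)
    faulter.trace.append("8: faulting thread resumes")
```

In this model the only step that a per-region pager would remove is step 3; steps 5 to 7 remain either way.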
As you point out below, it would be possible to associate a separate pager with different regions of virtual memory, but for reasons I argue below, this reduces flexibility.
Anyhow, by associating a pager with separate memory regions we can only avoid one (intra address space) IPC operation (step 3). I'm not convinced that this matters much in practice, since page faults are generally treated by the hardware as exceptions and incur a substantial overhead in the first place (pipeline flushing, various synchronization when updating page tables, change of cache working sets, etc.). The performance numbers of the "data spaces" paper I cited in the last mail substantiate these claims.
Or perhaps the L4 design embeds a philosophical argument that resolving these things at user level is (a) feasible and (b) likely as efficient as any kernel implementation, and therefore should not be done in the kernel? If so, I understand the philosophical point, and I am not sure that I agree. In my mind, the answer depends on what gets the job done best on an end to end basis.
Yes, this philosophical argument does apply to the design decisions of L4.
Please note that I'm not advocating placing policy in the kernel here. I'm wondering if there might be a better *mechanism* by which to express the user-desired policy.
We've found that the only mechanism that allows us to express the user-desired policy is to perform all these policy decisions at user level. For instance, the region mapper might suddenly choose to delay all write accesses to a particular region, e.g., to capture a consistent snapshot of the region. It can do this by revoking write access to the region; if anyone then tries to write, the region mapper delays that write until the snapshot has been taken.
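As a rough illustration of the snapshot example, the following Python sketch models a region mapper that revokes write access, queues any writes that fault in the meantime, and replays them once the snapshot is complete. Everything here is hypothetical (`Region`, `SnapshotRegionMapper`, `begin_snapshot`, `finish_snapshot`); a real region mapper would unmap write permissions and hold the page fault IPC of the writer rather than queue values.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Toy region: page number -> contents, plus a write-permission flag."""
    pages: dict = field(default_factory=dict)
    writable: bool = True


class SnapshotRegionMapper:
    """Hypothetical user-level region mapper implementing a snapshot policy."""

    def __init__(self, region):
        self.region = region
        self.delayed = []  # writes that faulted while write access was revoked

    def write(self, page, value):
        if self.region.writable:
            self.region.pages[page] = value
        else:
            # Models a write fault: the writer is held until the snapshot is done.
            self.delayed.append((page, value))

    def begin_snapshot(self):
        # Revoke write access so the region contents stay stable.
        self.region.writable = False

    def finish_snapshot(self):
        snapshot = dict(self.region.pages)  # consistent copy of the region
        self.region.writable = True         # restore write access
        for page, value in self.delayed:    # resume the delayed writers in order
            self.region.pages[page] = value
        self.delayed.clear()
        return snapshot
```

The point of the sketch is that the delay policy lives entirely in the region mapper; the kernel only needs to deliver the write fault to it.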
On a more general level, the argument seems to boil down to whether the kernel transparently handles exceptions or whether exceptions are exposed to the application and handled in an application-specific way. Clearly, the L4 design favours the second approach. I realize that this approach may be unsatisfactory in certain situations, e.g., if one for some reason needs to make the application unaware of external communication. This is one of the gripes I have with the current versions of L4.
One way to make exceptions transparent to applications, while still allowing policies to be defined at user level, could be to disallow an application from changing its pager/exception handler, thereby making sure that the exception IPCs cannot be seen by the application. Such a scheme would not really work with the current L4 specification, because an application is always allowed to change its pager/exception handler. Even if we could impose this restriction of non-changeable pagers/exception handlers, the application could still intercept the exception IPCs by probing the state of other threads in the address space at the right moment and aborting ongoing exception IPC operations.
eSk