On Fri, 2003-12-05 at 10:32, Marcus Völp wrote:
I think the EROS capability space and the L4 address spaces are the same: the difference is that the address space (or, better, name space) in L4 is partitioned per type, i.e. a name space exists for memory (the virtual address space), and a second name space for threads is under discussion.
They are not quite the same for reasons I will explain shortly, but I agree with your point. L4 address spaces are typed.
In EROS, once we move to capability address spaces, we will end up with a rather odd situation. The *data* address space will be typed (leaves must be page capabilities), but the capability address space is not (any capability may appear).
Our experience has been that generalizing the invocation mechanism to all types of objects has been tremendously useful.
EROS has the take and grant_e functions (_e to distinguish it from the mapping grant, grant_m); L4 maps and grants capabilities and is able to revoke directly through unmap.
These operations have similar names, but very different functions. grant_e is a capability transmission. l4_map establishes a shared region mapping.
However, I believe that these can be unified -- more on this shortly.
First, switching to descriptors should not significantly change IPC performance. A descriptor is probably a larger data structure than a thread id, but in both a thread-id design and a descriptor design there is a load instruction at the end to place the target thread address into a register. The difference would be only that the thread structure pointer might live at a different offset.
Global IDs have the benefit of translating directly to the TCB address (plus or minus some masking). Local IDs bear the potential for a TLB miss and a cache miss; the smaller the descriptor, the more descriptors fit into a cache line, so performance might be better with smaller descriptors. Note that this is just a guess, and we have to evaluate it, in particular for SMP systems, where false sharing might affect performance negatively.
It is certainly true that local ids will impose a small performance penalty. The size of the leaf, in our experience, doesn't matter -- that cache line won't stay resident anyway. Speaking for myself, I would worry primarily that the descriptor should be cache aligned and that it should not exceed the cache line length.
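As a rough illustration of the two lookup paths being compared, here is a minimal C sketch. Everything in it is an assumption made for the example: the constants TCB_AREA_BASE, TCB_SIZE_LOG2 and VERSION_BITS, the id layout, and the descriptor table structure do not come from L4 or EROS. The global-id path is pure arithmetic on the id; the local-id path adds a bounds check and one dependent load from a per-address-space table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All constants and layouts below are hypothetical, chosen for illustration. */
#define TCB_AREA_BASE  ((uintptr_t)0xE0000000u)
#define TCB_SIZE_LOG2  9u        /* assume 512-byte TCBs */
#define VERSION_BITS   14u       /* low bits of a global id hold a version */
#define THREAD_NO_MASK 0x3FFFFu  /* 18-bit thread number in the high bits */

/* Global id: the TCB address is computed by shifting and masking only. */
static uintptr_t tcb_from_global_id(uint32_t gid)
{
    uintptr_t thread_no = (gid >> VERSION_BITS) & THREAD_NO_MASK;
    return TCB_AREA_BASE + (thread_no << TCB_SIZE_LOG2);
}

/* One descriptor per local id; filled in by the kernel, 0 means empty. */
struct descriptor {
    uintptr_t tcb;
};

struct desc_table {
    size_t n_slots;
    const struct descriptor *slots;
};

/* Local id: bounds check plus one extra load (a possible TLB/cache miss). */
static uintptr_t tcb_from_local_id(const struct desc_table *t, size_t lid)
{
    assert(t != NULL);
    if (lid >= t->n_slots)
        return 0;               /* no such descriptor */
    return t->slots[lid].tcb;
}
```

The local-id path is the only one that can refuse a lookup, which is where the protection comes from; the global-id path will produce an address for any bit pattern the sender supplies.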
I'm not sure the local id penalty is as high as you think -- we have some tricks in the EROS implementation that might apply.
However, global ids fail to provide protection in the absence of chiefs. If protection is important, the fair comparison is between chiefs and local ids, not between global ids and local ids. I believe that on balance, the indirection overheads are a reasonable price to pay to avoid the problems of the "receiver checks validity" approach. The problems I am thinking about are:
1. A hostile sender can flood the receiver.
2. Receiver permission checking in user mode is more expensive than one indirection.
I am not aware of any protected id design that avoids indirection entirely.
Ultimately, I believe that the right metric for IPC is not the trap-to-trap latency, but the latency between the client-side call and the server-side call (the serving procedure). Programs aren't trying to do IPC. They are trying to invoke services. It is the end-to-end cost that should be minimized.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
The question here is whether we have to support unforgeable object IDs in the kernel or whether unforgeable sender IDs as we have now are sufficient. Given an unforgeable sender ID, the invoked object ID can be validated to be accessible by the sender.
So what you are saying is:
By eliminating a one word copy in the kernel, we can make it possible for the server to do a multi-instruction, multi-TLB-miss, multi-cache-line, multi-branch-stall lookup and validation at user level.
Whether this is a good answer will depend very much on the typical usage pattern. If most servers serve (a) multiple objects or (b) multiple interfaces, then the extra word in the kernel is justified. In our experience, almost *every* server serves multiple interfaces, and would therefore be forced to do validation. It is difficult to know whether this is a case of something that is fundamentally common, or whether our system design is simply optimized for our own unusual situation.
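To make the user-level alternative concrete, here is a minimal C sketch of what a server would have to do on every request if it receives only an unforgeable sender id plus a client-supplied object id. The acl_entry structure, the function names and the linear scan are all hypothetical; a real server would presumably use a hash table or similar, which is still more work than a one-word copy in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access record kept by the server. */
struct acl_entry {
    uint32_t sender_id;  /* unforgeable: supplied by the kernel */
    uint32_t object_id;  /* forgeable: supplied by the client, must be checked */
    bool     writable;
};

/* Linear scan for clarity only; returns NULL if no entry matches. */
static const struct acl_entry *
acl_lookup(const struct acl_entry *acl, size_t n,
           uint32_t sender_id, uint32_t object_id)
{
    assert(acl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (acl[i].sender_id == sender_id && acl[i].object_id == object_id)
            return &acl[i];
    }
    return NULL;
}

/* Returns true if this sender may perform this kind of access on the object. */
static bool validate_request(const struct acl_entry *acl, size_t n,
                             uint32_t sender_id, uint32_t object_id,
                             bool wants_write)
{
    const struct acl_entry *e = acl_lookup(acl, n, sender_id, object_id);
    return e != NULL && (e->writable || !wants_write);
}
```

Note that this table is keyed on (sender, object), so it records at most one permission level per pair.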
Transmitting sender identity is, in my view, potentially problematic. I need to think more about sender identities, but I believe that interface identities are tremendously important and are absolutely necessary for some of the higher-level systems that I want to build -- even if sender identities are transmitted.
How is the object invoker identified in EROS? If I understand EROS right, you send a resume capability with the object invocation and have a compare and get_type operation on the capabilities. Can you make use of this capability to reliably distinguish calling clients?
In EROS, we have never attempted (and never needed) to distinguish calling clients. Instead, we use capabilities to accomplish the same thing. A client receives a capability that is a
(thread-id, server-defined-field)
pair. The "server-defined-field" is unforgeable because it is kernel protected, and it is included in the outbound message. The field can be used for an object id, for an interface id (e.g. read-only vs. read-write), or some mixture.
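The following C sketch models the invocation path just described. It is not the EROS implementation: the structure layouts, the slot-indexed capability table and the invoke function are assumptions for illustration. The point it shows is that the client names only a slot and a payload, while the kernel copies the server-defined field out of the capability, so the client has no way to forge it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel-protected capability: a (thread-id, field) pair. */
struct capability {
    uint32_t thread_id;     /* server thread implementing the object */
    uint32_t server_field;  /* chosen by the server; client cannot alter it */
    int      valid;         /* 0 marks an empty slot */
};

/* Message as seen by the server. */
struct message {
    uint32_t server_field;  /* stamped by the kernel from the capability */
    uint32_t payload;       /* client-controlled data */
};

/*
 * Models the kernel side of an invocation. The client supplies only the
 * slot index and the payload. Returns 0 on success, -1 for a bad slot.
 */
static int invoke(const struct capability *cspace, size_t n_slots,
                  size_t slot, uint32_t payload,
                  uint32_t *dest_thread, struct message *out)
{
    assert(dest_thread != NULL && out != NULL);
    if (slot >= n_slots || !cspace[slot].valid)
        return -1;
    *dest_thread      = cspace[slot].thread_id;
    out->server_field = cspace[slot].server_field;
    out->payload      = payload;
    return 0;
}
```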
It may be necessary to dispatch on sender ID, but in a capability system this would violate sender encapsulation.
Whether or not you are able to dispatch on sender id, doing so is insufficient to deal adequately with permissions. We (the EROS project) frequently need to deal with a situation where the same client holds simultaneously a read-write and a read-only capability to an object implemented by some server. Permissions are determined by which descriptor is invoked. I don't believe that dispatching on sender ID has sufficient power to discriminate this case.
Note that UNIX behaves similarly: a process can simultaneously hold a R/O and a R/W file descriptor to the same file.
Resume capabilities have nothing to do with any of this. They merely provide the server with transient authority to reply. The transience, by the way, is actually important from a security perspective.
shap