This is a response to several messages (from Volkmar, Rudy, Hermann) at once. The delay has partly been due to other demands on my time, and partly because I wanted to consider how to answer.
First, let me make sure that we are debating the same issue by giving it a precise description.
Currently, an L4 invocation names its target directly as:
thread-id
There is a proposal for thread address spaces. Under this proposal, the invocation argument becomes an *index* (equivalently: an address) for a thread-id. I will write this as:
[thread-id]
Note that once the indexing mechanism is in place, the client no longer has access to the thread-ids per se. Thus, semantically, the ID bits no longer name a thread from the application perspective -- this is strictly a detail of implementation. From a semantics perspective, it is clearer to rewrite this as:
[server-id]
This leaves us the freedom to change later how "server-id" is demultiplexed, e.g. in order to have a default demultiplexing policy for multithreaded services if one were ever desired.
Today, when an L4 client wishes to invoke an object, it performs an IPC of the form:
IPC : [server-id], object-id, { args ...} => [caller-id], principal-id, object-id, { args... }
Our debate is whether we should consider adding a server-controlled ID field to the descriptor. To avoid confusion, I will call this new ID the "if-id" (for "interface-id"). This would revise the invocation above into:
IPC : [server-id, if-id], object-id, {args...} => [caller-id], principal-id, if-id, object-id, {args ...}
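To make the proposal concrete, here is a minimal sketch in C of the invocation path as I understand it. All names (descriptor_t, message_t, invoke) are invented for illustration; the point is only that if-id lives in the kernel-held descriptor entry and is copied by the kernel, never supplied by the client, while the principal-id is set in software as proposed above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kernel-held descriptor. The client holds only an index
 * into a table of these; server-id and if-id live inside the kernel
 * and cannot be tampered with by the client. */
typedef struct {
    uint32_t server_id;  /* demultiplexed by the kernel */
    uint32_t if_id;      /* set by the server, opaque to the kernel */
} descriptor_t;

/* What the server sees on receive: if-id is supplied by the kernel
 * from the descriptor; object-id and args come from the client. */
typedef struct {
    uint32_t caller_id;    /* stand-in for [caller-id] */
    uint32_t principal_id; /* set in software per the proposal */
    uint32_t if_id;        /* trustworthy: kernel-copied */
    uint32_t object_id;    /* untrusted: client-supplied */
} message_t;

/* Simulated kernel invocation path: copies if-id from the descriptor
 * table entry, ignoring anything the client might claim about it. */
message_t invoke(const descriptor_t *table, uint32_t index,
                 uint32_t principal_id, uint32_t object_id)
{
    message_t m;
    m.caller_id    = index;
    m.principal_id = principal_id;
    m.if_id        = table[index].if_id;
    m.object_id    = object_id;
    return m;
}
```

Note that the client never touches the if-id field at all; it names only the index and the object-id.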
If this characterization does NOT capture the discussion, please read no further and let us first agree on what the question is. The balance of this note ASSUMES that this is a correct characterization of the question.
Separately, I am proposing that the revealed principal-id should be set in software by the thread manager, and should NOT be simply the sender thread-id. Current behavior can be maintained by setting principal-id=thread-id. EROS behavior requires setting principal-id to some fixed value shared by all threads.
I should emphasize that the term "interface-id" is quite misleading. Just as the interpretation of the object-id bits lies completely in the discretion of the server, so does the interpretation of the interface-id bits.
The critical difference is that the interface-id bits are guarded by the kernel on behalf of the service. The service therefore can rely on the fact that these bits have not been tampered with by the client, and can (depending on the interpretation assigned to these bits) omit any check of their security.
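As a sketch of what "omitting the check" buys the server, suppose (purely as an assumed encoding, not anything mandated by the proposal) that the server packs an interface number into the if-id when it fabricates a descriptor. Because the kernel guards those bits, the dispatch loop can branch on them directly, with no validation of client input:

```c
#include <stdint.h>

/* Assumed, server-private encoding of if-id. The kernel never
 * interprets these values; only this server does. */
enum { IF_CONSOLE = 1, IF_FILE = 2 };

static int handle_console(uint32_t object_id) { return 100 + (int)object_id; }
static int handle_file(uint32_t object_id)    { return 200 + (int)object_id; }

/* Because if_id arrives kernel-guarded, no sanity check of it against
 * client-supplied data is needed on the critical path. */
int server_dispatch(uint32_t if_id, uint32_t object_id)
{
    switch (if_id) {
    case IF_CONSOLE: return handle_console(object_id);
    case IF_FILE:    return handle_file(object_id);
    default:         return -1; /* the server never fabricates other values */
    }
}
```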
Volkmar has replied:
Allowing extension doesn't bring you any benefit. Transparency can be implemented in a user-level library (without any overhead). How I understood your description is that you cache information about what user object types you have in the kernel.
There *is* a clear advantage: these bits are guarded by the kernel, which eliminates the need for extra checks or awkward transfer protocols.
That costs you another check on the critical path.
From the description above, it should be clear that there is NO
additional check on the critical path. I suspect Volkmar is thinking of the capability type field, which is a completely separate issue, and one that I agree we should try to avoid.
Hermann has replied:
L4 does *not* (today) provide means to allow a server to extend the object name space.
But it allows servers to build arbitrary name spaces on top of L4. It is not kernel business to provide a name space for user-land objects. Name spaces are often defined by user-level standards (e.g., file ids).
I think that there is a second misunderstanding here. Nothing in my proposal alters this at all. The bits stored in the kernel are not interpreted by the kernel in any way. Therefore, the name space that they represent remains a user-land name space. They are merely *carried* by the kernel, protected on behalf of the server. You can think of them as a small piece of secure storage.
The problem is that in the *absence* of this secure storage, it is necessary to introduce complex multi-party protocols at user level in order to support descriptors correctly. L4 has embedded a policy that descriptor architectures should be penalized. Given the presence of this policy, no claim can be sustained that L4 is policy-neutral.
MOTIVATION:
The motivation for this feature is the need to be able to implement an access control model that is decidable and potentially correct. L4 today fundamentally does not support this efficiently. What Volkmar may not know is that this is also the ONLY reason that EROS was not built on L4 years ago.
Some time around 1995 or 1996, Jochen came to visit me at the University of Pennsylvania to explore several topics, among them moving EROS to L4. At the time, there seemed to be many impediments, but Leendert van Doorn and I would later resolve most of them in the paper design for the Obsidian kernel. The one matter that Leendert and I could NOT resolve was the absence of the interface-id bits and the (then) need to transition from "thread-id" to "[thread-id]". At the time, Trent had not yet started his work on IPC indirection.
As Volkmar says, UNIX fork() performance sucks, and the interface-id issue may not help -- there are already many IPCs that need to be done for UNIX fork(), and the extra ones needed to validate/cache the user-supplied object-id are not significant from a performance perspective.
However, Jeff is also right that I am describing lambda binding. This is fundamentally powerful, and Jeff is right that it is very useful in eliminating some important programming errors. The interface-id additionally improves end to end performance in a number of significant situations -- most notably checking of descriptor protection bits (e.g. read-only).
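To illustrate the protection-bits case, here is a minimal sketch under an assumed encoding (the bit layout and the names IFID_READ/IFID_WRITE are invented): two if-id bits carry the descriptor's permissions. The server can trust them because only the kernel-held copy is ever transmitted; a read-only descriptor is fabricated by clearing the write bit, and the check becomes a single mask:

```c
#include <stdint.h>

/* Assumed layout: low bits of if-id carry descriptor protections. */
#define IFID_READ  0x1u
#define IFID_WRITE 0x2u

/* Fabricate a weakened (read-only) variant of a descriptor's if-id. */
uint32_t make_readonly(uint32_t if_id) { return if_id & ~IFID_WRITE; }

/* One kernel-guarded bit test replaces any ACL lookup keyed on the
 * sender-id. */
int may_write(uint32_t if_id) { return (if_id & IFID_WRITE) != 0; }
```

The design point is that the weakening happens once, at descriptor fabrication time, rather than being re-validated on every invocation.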
The EROS problem in particular is that descriptor copy is not an occasional thing. It is *ubiquitous*. Every CALL/RETURN pair that we do transfers at least one descriptor, and our entire design rests on being able to examine the interface-id. We absolutely CANNOT replace this with a multi-IPC sequence that relies on some third party to validate a user-supplied argument.
Further, it is UNACCEPTABLE in the EROS design to perform ANY checking based on the sender-id. Indeed, if we were to re-implement EROS on top of L4, we would be forced to set the revealed sender-id to zero in all cases.
Ultimately, the L4 design has a deeply embedded assumption about access control: that access control should be performed based on subject ID. That is, it is an ACL design. ACLs have been formally proven to be a broken model for access control. I am advocating that L4 needs to adopt a change that will admit the possibility of implementing at least one access control model that is formally decidable and correct: capabilities.
I am trying to be very careful NOT to propose a change that will violate any of the current L4 programming model (at least, no more than a recompilation).
OTHER
[Volkmar:]
So you provide an in-kernel cache for some identifiers (call it bits) which are unforgeable. How much of your register real estate do you give up for that? What when the size is exceeded?
Register real-estate: I believe none. It is simply an additional word to be copied within the descriptor map/grant path.
When the size is exceeded, EROS falls back to a nasty hack that lets us extend this field to 48 bits. We have never found an application where 48 bits was insufficient. Beyond that, we would start using multiple, distinguished threads so that we could leverage the thread-id for additional bits.
If I had it to do again I would probably simply define this part of our descriptor to be 48 or 64 bits long. The need for the nasty trick is truly ugly.
[Volkmar:]
Because these are architecturally insufficient to implement an efficient, secure, object-based operating system.
Hmm, actually that is a question of what you try to implement. What if you don't want an object-based OS? Do you incur a significant overhead with your model? I'm curious how a Linux kernel would perform on top of EROS--I could imagine that your security model has a measurable overhead.
Now that the proposal has been articulated more clearly, are you still concerned about this? It is very difficult for me to imagine that adding 64 bits (max) to the IPC protocol payload would actually matter.
It certainly creates register pressure on the x86, but you might wish to have a look at:
http://www.eros-os.org/pipermail/eros-arch/2003-December/004249.html
We have decided that register-optimized transfer is probably a bad idea. Moving to a mapped page scheme essentially eliminates the register pressure, and probably simplifies the IDL code enough that it improves end to end invocation time.
[Volkmar:]
Our experience has been that relying on such clients to specify the intended operation is not robust. The flow of permissions in complex programs is not well localized, and it is very easy to write a subroutine designed for one purpose that does some mildly dangerous thing and then call it (by programmer error) in the middle of some sequence of code where care is required.
Tying permissions to the object descriptor does not prevent the programmer from passing the wrong descriptor, but it does help a great deal in localizing the scope of programmer attention that is required to resolve these problems.
This sounds like you are suggesting kernel design based on bad programming habits. Are you willing to pay the overhead? We don't.
One of Jochen's beliefs was that performance is more important than any other consideration. He passed this strong belief on to his students. In my opinion he was deeply wrong about this.
There are many kinds of overhead:
1. The difficulty of writing good programs using bad APIs is an overhead.
2. The fact that the resulting systems are demonstrably unsecurable, and that many of the most common problems can be traced to (1) is an overhead.
3. Performance cost is certainly an overhead.
.. and of course, lots of others
I believe that the correct overhead to optimize is the end to end runtime cost of a system measured in dollars, not cycles.
With that as preamble, let me answer your question:
If, at the performance cost of one or two additionally transferred words, we provide a foundation that can eliminate millions of dollars of daily security flaws, then I submit that this was a very good engineering decision, and yes, I think the "overhead" is justified.
If, at the performance cost of one or two additionally transferred words in the kernel we can eliminate complex validation code at user level in a significant number of cases, then yes, I believe that the "overhead" is good engineering -- in this case, even if it merely "breaks even".
From a research perspective, if at the performance cost of one or two
additionally transferred words in the kernel we create a platform that facilitates a much broader space of research operating systems, then UNQUESTIONABLY I believe that the "overhead" is justified.
And realistically, taking into account the cache line effects that will arise, we are probably not talking about more than one or two cycles. Given superscalar execution and the nature of the copy control loop, we may be talking about ZERO.
And then two answers that are much more subjective:
When one discards 30 years of experience with insecure code without serious consideration, one is engaging in ideology rather than engineering, and our proper business is engineering. Let us try to avoid ideology on all sides of this discussion.
It is not bad programming practice to follow the most natural path that is dictated by a given interface. It is *inevitable* programming practice, and the fault, if any, must rest entirely with the designer of the interface. Your value judgment is that it is good engineering to require millions of programmers to write complex code so that ten system architects can save a small number of cycles. This is absurd, and it ignores every piece of empirical evidence about human behavior that we have. As a group, humans will seek to behave in the way that gives the greatest short-term benefit for the least energy. Any other expectation is wishful thinking. Therefore, the behavior that you label "bad programming practice" intrinsically justifies labeling the interface a "bad interface design".
Most system designers lack the capacity to engineer in a way that accounts for this, but it is one of the marks of a good system designer that they do so successfully more often than not.
CLOSING
If the L4 community eventually feels that this is not a reasonable change, that is okay. However, it is absolutely impossible for a system like EROS to be efficiently implemented on L4 without it. This means that if the decision is negative, we are also deciding not to merge the communities.
This is also okay, but we should clearly understand what is at stake in the discussion.
shap