This is a response to several messages (from Volkmar, Rudy, Hermann) at once. The delay has partly been due to other demands on my time, and partly because I wanted to consider how to answer.
First, let me make sure that we are debating the same issue by giving it a precise description.
Currently, an L4 invocation names its target directly as:
thread-id
There is a proposal for thread address spaces. Under this proposal, the invocation argument becomes an *index* (equivalently: an address) for a thread-id. I will write this as:
[thread-id]
Note that once the indexing mechanism is in place, the client no longer has access to the thread-ids per se. Thus, semantically, the ID bits no longer name a thread from the application perspective -- this is strictly a detail of implementation. From a semantics perspective, it is clearer to rewrite this as:
[server-id]
This leaves us the freedom to change later how "server-id" is demultiplexed, e.g. in order to have a default demultiplexing policy for multithreaded services if one were ever desired.
Today, when an L4 client wishes to invoke an object, it performs an IPC of the form:
IPC : [server-id], object-id, { args ...} => [caller-id], principal-id, object-id, { args... }
Our debate is whether we should consider adding a server-controlled ID field to the descriptor. To avoid confusion, I will call this new ID the "if-id" (for "interface-id"). This would revise the invocation above into:
IPC : [server-id, if-id], object-id, {args...} => [caller-id], principal-id, if-id, object-id, {args ...}
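To make the proposal concrete, here is a minimal sketch in C of the invocation path as I understand it. All names (descriptor_t, message_t, invoke) are invented for illustration; the point is only that if-id lives in the kernel-held descriptor entry and is copied by the kernel, never supplied by the client, while the principal-id is set in software as proposed above:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kernel-held descriptor. The client holds only an index
 * into a table of these; server-id and if-id live inside the kernel
 * and cannot be tampered with by the client. */
typedef struct {
    uint32_t server_id;  /* demultiplexed by the kernel */
    uint32_t if_id;      /* set by the server, opaque to the kernel */
} descriptor_t;

/* What the server sees on receive: if-id is supplied by the kernel
 * from the descriptor; object-id and args come from the client. */
typedef struct {
    uint32_t caller_id;    /* stand-in for [caller-id] */
    uint32_t principal_id; /* set in software per the proposal */
    uint32_t if_id;        /* trustworthy: kernel-copied */
    uint32_t object_id;    /* untrusted: client-supplied */
} message_t;

/* Simulated kernel invocation path: copies if-id from the descriptor
 * table entry, ignoring anything the client might claim about it. */
message_t invoke(const descriptor_t *table, uint32_t index,
                 uint32_t principal_id, uint32_t object_id)
{
    message_t m;
    m.caller_id    = index;
    m.principal_id = principal_id;
    m.if_id        = table[index].if_id;
    m.object_id    = object_id;
    return m;
}
```

Note that the client never touches the if-id field at all; it names only the index and the object-id.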
If this characterization does NOT capture the discussion, please read no further and let us first agree on what the question is. The balance of this note ASSUMES that this is a correct characterization of the question.
Separately, I am proposing that the revealed principal-id should be set in software by the thread manager, and should NOT be simply the sender thread-id. Current behavior can be maintained by setting principal-id=thread-id. EROS behavior requires setting principal-id to some fixed value shared by all threads.
I should emphasize that the term "interface-id" is quite misleading. Just as the interpretation of the object-id bits lies completely in the discretion of the server, so does the interpretation of the interface-id bits.
The critical difference is that the interface-id bits are guarded by the kernel on behalf of the service. The service therefore can rely on the fact that these bits have not been tampered with by the client, and can (depending on the interpretation assigned to these bits) omit any check of their security.
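As a sketch of what "omitting the check" buys the server, suppose (purely as an assumed encoding, not anything mandated by the proposal) that the server packs an interface number into the if-id when it fabricates a descriptor. Because the kernel guards those bits, the dispatch loop can branch on them directly, with no validation of client input:

```c
#include <stdint.h>

/* Assumed, server-private encoding of if-id. The kernel never
 * interprets these values; only this server does. */
enum { IF_CONSOLE = 1, IF_FILE = 2 };

static int handle_console(uint32_t object_id) { return 100 + (int)object_id; }
static int handle_file(uint32_t object_id)    { return 200 + (int)object_id; }

/* Because if_id arrives kernel-guarded, no sanity check of it against
 * client-supplied data is needed on the critical path. */
int server_dispatch(uint32_t if_id, uint32_t object_id)
{
    switch (if_id) {
    case IF_CONSOLE: return handle_console(object_id);
    case IF_FILE:    return handle_file(object_id);
    default:         return -1; /* the server never fabricates other values */
    }
}
```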
Volkmar has replied:
Allowing extension doesn't bring you any benefit. Transparency can be implemented in a user-level library (without any overhead). How I understood your description is that you cache information about what user object types you have in the kernel.
There *is* a clear advantage: these bits are guarded by the kernel, which eliminates the need for extra checks or awkward transfer protocols.
That costs you another check on the critical path.
From the description above, it should be clear that there is NO
additional check on the critical path. I suspect Volkmar is thinking of the capability type field, which is a completely separate issue, and one that I agree we should try to avoid.
Hermann has replied:
L4 does *not* (today) provide means to allow a server to extend the object name space.
But it allows servers to build arbitrary name spaces on top of L4. It is not kernel business to provide a name space for user-land objects. Name spaces are often defined by user-level standards (e.g., file ids).
I think that there is a second misunderstanding here. Nothing in my proposal alters this at all. The bits stored in the kernel are not interpreted by the kernel in any way. Therefore, the name space that they represent remains a user-land name space. They are merely *carried* by the kernel, protected on behalf of the server. You can think of them as a small piece of secure storage.
The problem is that in the *absence* of this secure storage, it is necessary to introduce complex multi-party protocols at user level in order to support descriptors correctly. L4 has embedded a policy that descriptor architectures should be penalized. Given the presence of this policy, no claim can be sustained that L4 is policy-neutral.
MOTIVATION:
The motivation for this feature is the need to be able to implement an access control model that is decidable and potentially correct. L4 today fundamentally does not support this efficiently. What Volkmar may not know is that this is also the ONLY reason that EROS was not built on L4 years ago.
Some time around 1995 or 1996, Jochen came to visit me at the University of Pennsylvania to explore several topics, among them moving EROS to L4. At the time, there seemed to be many impediments, but Leendert van Doorn and I would later resolve most of them in the paper design for the Obsidian kernel. The one matter that Leendert and I could NOT resolve was the absence of the interface-id bits and the (then) need to transition from "thread-id" to "[thread-id]". At the time, Trent had not yet started his work on IPC indirection.
As Volkmar says, UNIX fork() performance sucks, and the interface-id issue may not help -- there are already many IPCs that need to be done for UNIX fork(), and the extra ones needed to validate/cache the user-supplied object-id are not significant from a performance perspective.
However, Jeff is also right that I am describing lambda binding. This is fundamentally powerful, and Jeff is right that it is very useful in eliminating some important programming errors. The interface-id additionally improves end to end performance in a number of significant situations -- most notably checking of descriptor protection bits (e.g. read-only).
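To illustrate the protection-bits case, here is a minimal sketch under an assumed encoding (the bit layout and the names IFID_READ/IFID_WRITE are invented): two if-id bits carry the descriptor's permissions. The server can trust them because only the kernel-held copy is ever transmitted; a read-only descriptor is fabricated by clearing the write bit, and the check becomes a single mask:

```c
#include <stdint.h>

/* Assumed layout: low bits of if-id carry descriptor protections. */
#define IFID_READ  0x1u
#define IFID_WRITE 0x2u

/* Fabricate a weakened (read-only) variant of a descriptor's if-id. */
uint32_t make_readonly(uint32_t if_id) { return if_id & ~IFID_WRITE; }

/* One kernel-guarded bit test replaces any ACL lookup keyed on the
 * sender-id. */
int may_write(uint32_t if_id) { return (if_id & IFID_WRITE) != 0; }
```

The design point is that the weakening happens once, at descriptor fabrication time, rather than being re-validated on every invocation.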
The EROS problem in particular is that descriptor copy is not an occasional thing. It is *ubiquitous*. Every CALL/RETURN pair that we do transfers at least one descriptor, and our entire design rests on being able to examine the interface-id. We absolutely CANNOT replace this with a multi-IPC sequence that relies on some third party to validate a user-supplied argument.
Further, it is UNACCEPTABLE in the EROS design to perform ANY checking based on the sender-id. Indeed, if we were to re-implement EROS on top of L4, we would be forced to set the revealed sender-id to zero in all cases.
Ultimately, the L4 design has a deeply embedded assumption about access control: that access control should be performed based on subject ID. That is, it is an ACL design. ACLs have been formally proven to be a broken model for access control. I am advocating that L4 needs to adopt a change that will admit the possibility of implementing at least one access control model that is formally decidable and correct: capabilities.
I am trying to be very careful NOT to propose a change that will violate any of the current L4 programming model (at least, no more than a recompilation).
OTHER
[Volkmar:]
So you provide an in-kernel cache for some identifiers (call it bits) which are unforgeable. How much of your register real estate do you give up for that? What when the size is exceeded?
Register real-estate: I believe none. It is simply an additional word to be copied within the descriptor map/grant path.
When the size is exceeded, EROS falls back to a nasty hack that lets us extend this field to 48 bits. We have never found an application where 48 bits was insufficient. Beyond that, we would start using multiple, distinguished threads so that we could leverage the thread-id for additional bits.
If I had it to do again I would probably simply define this part of our descriptor to be 48 or 64 bits long. The need for the nasty trick is truly ugly.
[Volkmar:]
Because these are architecturally insufficient to implement an efficient, secure, object-based operating system.
Hmm, actually that is a question of what you try to implement. What if you don't want an object-based OS? Do you incur a significant overhead with your model? I'm curious how a Linux kernel would perform on top of EROS--I could imagine that your security model has a measurable overhead.
Now that the proposal has been articulated more clearly, are you still concerned about this? It is very difficult for me to imagine that adding 64 bits (max) to the IPC protocol payload would actually matter.
It certainly creates register pressure on the x86, but you might wish to have a look at:
http://www.eros-os.org/pipermail/eros-arch/2003-December/004249.html
We have decided that register-optimized transfer is probably a bad idea. Moving to a mapped page scheme essentially eliminates the register pressure, and probably simplifies the IDL code enough that it improves end to end invocation time.
[Volkmar:]
Our experience has been that relying on such clients to specify the intended operation is not robust. The flow of permissions in complex programs is not well localized, and it is very easy to write a subroutine designed for one purpose that does some mildly dangerous thing and then call it (by programmer error) in the middle of some sequence of code where care is required.
Tying permissions to the object descriptor does not prevent the programmer from passing the wrong descriptor, but it does help a great deal in localizing the scope of programmer attention that is required to resolve these problems.
This sounds like you are suggesting kernel design based on bad programming habits. Are you willing to pay the overhead? We don't.
One of Jochen's beliefs was that performance is more important than any other consideration. He passed this strong belief on to his students. In my opinion he was deeply wrong about this.
There are many kinds of overhead:
1. The difficulty of writing good programs using bad APIs is an overhead.
2. The fact that the resulting systems are demonstrably unsecurable, and that many of the most common problems can be traced to (1) is an overhead.
3. Performance cost is certainly an overhead.
.. and of course, lots of others
I believe that the correct overhead to optimize is the end to end runtime cost of a system measured in dollars, not cycles.
With that as preamble, let me answer your question:
If, at the performance cost of one or two additionally transferred words, we provide a foundation that can eliminate millions of dollars of daily security flaws, then I submit that this was a very good engineering decision, and yes, I think the "overhead" is justified.
If, at the performance cost of one or two additionally transferred words in the kernel we can eliminate complex validation code at user level in a significant number of cases, then yes, I believe that the "overhead" is good engineering -- in this case, even if it merely "breaks even".
From a research perspective, if at the performance cost of one or two
additionally transferred words in the kernel we create a platform that facilitates a much broader space of research operating systems, then UNQUESTIONABLY I believe that the "overhead" is justified.
And realistically, taking into account the cache line effects that will arise, we are probably not talking about more than one or two cycles. Given superscalar execution and the nature of the copy control loop, we may be talking about ZERO.
And then two answers that are much more subjective:
When one discards 30 years of experience with insecure code without serious consideration, one is engaging in ideology rather than engineering, and our proper business is engineering. Let us try to avoid ideology on all sides of this discussion.
It is not bad programming practice to follow the most natural path that is dictated by a given interface. It is *inevitable* programming practice, and the fault, if any, must rest entirely with the designer of the interface. Your value judgment is that it is good engineering to require millions of programmers to write complex code so that ten system architects can save a small number of cycles. This is absurd, and it ignores every piece of empirical evidence about human behavior that we have. As a group, humans will seek to behave in the way that gives the greatest short-term benefit for the least energy. Any other expectation is wishful thinking. Therefore, the behavior that you label "bad programming practice" intrinsically justifies labeling the interface a "bad interface design".
Most system designers lack the capacity to engineer in a way that accounts for this, but it is one of the marks of a good system designer that they do so successfully more often than not.
CLOSING
If the L4 community eventually feels that this is not a reasonable change, that is okay. However, it is absolutely impossible for a system like EROS to be efficiently implemented on L4 without it. This means that if the decision is negative, we are also deciding not to merge the communities.
This is also okay, but we should clearly understand what is at stake in the discussion.
shap