Capability Authentication

Marcus Brinkmann marcus.brinkmann at
Mon Oct 17 22:09:29 CEST 2005

At Mon, 17 Oct 2005 12:55:09 +0200,
Marcus Völp <voelp at> wrote:
> As I said, the fact that we did not include a cmp operation yet is 
> simply because we were not sure to need it. Once you convince us cmp is 
> in, provided it opens no further security leaks.

Unfortunately by now I am almost convinced that even introducing cmp()
is insufficient, mostly because of performance and code complexity
issues.  So I may not be the best person to argue in favor of cmp().

> >BTW, the protocol you describe doesn't even work, as in L4.sec there
> >is no way for "C" to name "S" in its request to "D".  Introducing such
> >names, which would have to be authenticated, is impossible if C and S
> >have a certain asymmetric trust relationship.  However, there _are_
> >protocols which work, and I have worked them out (you can ask me for
> >more details).  They all require RPCs in which capabilities are passed
> >as arguments and can be identified by the receiver, for example using
> >a cmp operation.
> >  
> >
> It would be great if you could provide us these protocols. Actually that 
> is what we are currently searching for to determine whether the proposed 
> architecture will work out.

You can start with my presentation at the LSM in Dijon.  I make no
apologies for its deficiencies: I am sure there is a lot in there that
can embarrass me, so please apply good will liberally :)

You can find slides and an audio recording here:

Porting the GNU/Hurd to the L4 microkernel, by Marcus Brinkmann
Abstract [en] - Abstract [fr] - Slides [pdf] - Audio recording [ogg]

I am making a lot of assumptions in this work, many of which may not
apply to yours.  I am happy to further elaborate on it.  This is
probably best done in answering specific questions.

> >Also, this proposal does not address the general case where an
> >operation must be performed on more than one capability, like server
> >side copy operations.
> >  
> >
> I know this proposal is not complete and we can find various scenarios 
> where it seems not to work. With regard to server side copy we are not 
> quite sure yet whether we need such an operation. But this again assumes 
> some system structure which performs the copy in a server that can 
> safely invoke capabilities.

Certainly we are making a lot of assumptions.  After all, we only want
to implement one system, and not ten.  So we make assumptions that
seem appropriate to us for the type of system we want to build.  I am
happy to explain, and to some extent defend, the assumptions we make,
should you find them suspicious.

Note that the copy operation was only an example.  We use such an
operation in our physical memory server design that implements trusted
buffer objects, so we can make logical copies of page frames.  I
really can't imagine any other way to do this securely but by invoking
one operation on two objects implemented by the same server.
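
To illustrate, here is a minimal sketch of such a two-object operation in
Python.  It is a toy model, not our physical memory server: the names
Capability, MemoryServer, create_buffer and copy are invented for this
example, and the kernel-protected capability is simulated by a frozen
record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Stand-in for a kernel-protected capability (hypothetical model)."""
    server_id: int  # which server implements the object
    badge: int      # server-chosen object identifier


class MemoryServer:
    """Toy memory server handing out buffer objects (hypothetical)."""

    def __init__(self, server_id):
        self.server_id = server_id
        self._buffers = {}  # badge -> bytes, private to the server

    def create_buffer(self, data=b""):
        badge = len(self._buffers) + 1
        self._buffers[badge] = bytes(data)
        return Capability(self.server_id, badge)

    def _resolve(self, cap):
        # Accept only capabilities for objects this server implements.
        if cap.server_id != self.server_id or cap.badge not in self._buffers:
            raise PermissionError("not one of my buffer objects")
        return cap.badge

    def copy(self, src, dst):
        """One operation invoked on two capabilities at once."""
        src_badge, dst_badge = self._resolve(src), self._resolve(dst)
        # bytes are immutable, so sharing them acts as a logical copy.
        self._buffers[dst_badge] = self._buffers[src_badge]

    def read(self, cap):
        return self._buffers[self._resolve(cap)]


server = MemoryServer(server_id=1)
src = server.create_buffer(b"page frame contents")
dst = server.create_buffer()
server.copy(src, dst)
```

The point of the sketch is that copy() can only be safe if the server
recognizes both arguments as its own objects before touching either.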

The talk above contains another example where an operation needs to be
invoked on two objects: The authentication server in our system
implements ACL based authentication.  The client provides an
identifying capability to the server, and the server makes a call to
its trusted authentication server handle, providing the client's
capability and a capability that should be returned to the client
(again, an operation that needs to be invoked on two capabilities at
the same time).  Note that the server can pose as the client to
another server, but this is not a problem as the object is not
directly returned, but given to the trusted auth server, where (only)
the client can retrieve it.
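
A rough sketch of this exchange in Python follows.  It rests on
assumptions that are not spelled out above: the identity is split into a
private handle kept by the client and an identifying handle passed to
servers, and deposit() returns a user id for the ACL check.  The names
AuthServer, new_identity, deposit and retrieve are made up for the
example.

```python
import secrets


class AuthServer:
    """Toy trusted authentication server (hypothetical interface)."""

    def __init__(self):
        self._owner_to_public = {}  # private handle -> identifying handle
        self._public_to_uid = {}    # identifying handle -> user id
        self._mailbox = {}          # identifying handle -> deposited caps

    def new_identity(self, uid):
        owner, public = secrets.token_hex(8), secrets.token_hex(8)
        self._owner_to_public[owner] = public
        self._public_to_uid[public] = uid
        self._mailbox[public] = []
        return owner, public

    def deposit(self, public, obj_cap):
        """Called by a server: one operation on two capabilities."""
        uid = self._public_to_uid[public]  # KeyError if identity unknown
        self._mailbox[public].append(obj_cap)
        return uid  # lets the server apply its ACL (assumed detail)

    def retrieve(self, owner):
        """Called by the client with its private handle only."""
        public = self._owner_to_public[owner]  # KeyError for other handles
        caps, self._mailbox[public] = self._mailbox[public], []
        return caps


auth = AuthServer()
owner, public = auth.new_identity("alice")

# The server receives `public` from the client and deposits the object.
uid_seen_by_server = auth.deposit(public, "file-cap")

# A server replaying `public` elsewhere cannot pick up the result itself.
try:
    auth.retrieve(public)
    stolen = True
except KeyError:
    stolen = False

received = auth.retrieve(owner)
```

Under these assumptions a server that poses as the client gains nothing,
because whatever is deposited can only be collected with the private
handle.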

> >>ad 2:
> >>Alternatively the server can prepare to defend against misbehavior of D. 
> >>In L4.Sec the receiver of an IPC controls the location where an incoming 
> >>message is placed. Thus it can select an area of its address space so 
> >>that even if D replies with bogus content, S is not harmed. It remains, 
> >>however, the problem of blocking S. An easy way to defend against 
> >>blocking attacks is to fork off a thread for this particular client's
> >>request and let it invoke the potentially untrusted capability on behalf
> >>of the client. Other threads invoking the same server are not affected by this
> >>blocking.
> >>    
> >>
> >
> >This proposal achieves nothing.  First of all, there is no benchmark
> >that tells you when you have waited long enough and the capability
> >should be rejected, for example because the destination blocked too
> >long.  But even more seriously, there is no benchmark to decide when
> >the capability is good and its implementation be trusted.  Thus, this
> >proposal does not achieve the original goal at all.
> >
> >It's nice that I can create a new thread to restrict the damage caused
> >by a blocking RPC partner.  But it has nothing to do with what you
> >want to achieve in Jonathan's example.
> >  
> >
> I don't agree here. The proposal is to be able to decide on the trust of 
> a capability after invoking it and avoiding the damage when it turned 
> out not to be trusted. This has nothing to do with blocking time but 
> more with a server structure that acts for and on behalf of a client.
> Jonathan's example (I assume you refer to the example where the server 
> has to decide whether or not to leak secure data) works by requesting 
> the not yet trusted target of the capability to authenticate itself 
> (e.g., via the kernel protected badge sent with the reply IPC) and
> to leak information only after knowing that the server is trusted.

I am not sure this is about leaking secure data, and I can't really
follow your description of what the server does.  Maybe this is just
because I don't understand the terms you used in the way you use them.
Maybe it helps if I say in my words what I think Jonathan's example
achieves and how it does it.

The "sender" sends a capability to the "receiver".  Now the objective
for the receiver is to establish if it can "trust" this capability.
Here this means: The receiver wants to be able to trust the
_implementation_ of the capability.  I.e., it wants to make sure that
this capability is really implemented by the exact trusted system
service it wants it to be implemented by.

For this purpose, the receiver can invoke a capability to the trusted
system server which it already holds (for example, which it was given
at process creation time), passing the "unknown" capability as an
argument.  _If_ the trusted system server has created the capability,
it can now identify it via a kernel protected badge, and send a
positive or negative reply in the response message.

Note that _not_ the "not yet trusted target" is asked to authenticate
itself.  Rather, an already trusted target is asked to identify the
not yet trusted target by means of the identify operation, which is
invoked on the server side.
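
As a small illustration, the identify step could look like this in
Python.  This is only a model: Capability, TrustedServer, identify and
receiver_trusts are hypothetical names, and the kernel-protected badge is
simulated by a frozen record that the server compares against its own
table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Stand-in for a kernel-protected capability (hypothetical model)."""
    endpoint: int  # server that receives invocations
    badge: int     # object identifier chosen by that server


class TrustedServer:
    """Toy trusted system server (hypothetical interface)."""

    def __init__(self, endpoint):
        self.endpoint = endpoint
        self._badges = set()  # badges of objects this server created

    def create_object(self):
        badge = len(self._badges) + 1
        self._badges.add(badge)
        return Capability(self.endpoint, badge)

    def identify(self, unknown):
        """Positive reply only if this server created the capability."""
        return unknown.endpoint == self.endpoint and unknown.badge in self._badges


def receiver_trusts(trusted_server, unknown):
    # The receiver never invokes `unknown`; it only asks the server it
    # already trusts to identify the capability.
    return trusted_server.identify(unknown)


trusted = TrustedServer(endpoint=1)
genuine = trusted.create_object()
lookalike = Capability(endpoint=2, badge=genuine.badge)  # other implementor
```

Note that the not yet trusted capability is only ever passed as an
argument; it is never the target of an invocation.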

Such an identify operation is also mentioned as an example in my talk.
In fact, it is the basis for the reference monitor as well as the
authentication server.

Note however that in my talk I often shortcut the process: The
following two are analogous:

1. You first identify the untrusted capability as a trusted capability
   using a known trusted capability, then you perform an operation on
   it.

   This is how it is done in EROS.

2. You combine both operations into a single one that is invoked on
   the known trusted capability and takes the untrusted capability as
   an argument.

   This is how it is done in the Hurd (so far).
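
The two variants can be put side by side in a short Python sketch.  It is
a simplified model and not code from either system: a capability is
simulated as a (server, badge) pair, and Server, read_checked,
identify_then_invoke and combined_call are names invented for the
example.

```python
class Server:
    """Toy object server; a capability is modelled as (server, badge)."""

    def __init__(self):
        self._objects = {}  # badge -> stored value

    def create(self, value):
        badge = len(self._objects) + 1
        self._objects[badge] = value
        return (self, badge)

    def identify(self, cap):
        server, badge = cap
        return server is self and badge in self._objects

    def read(self, badge):
        return self._objects[badge]

    def read_checked(self, cap):
        # Combined operation: identify and act in a single call.
        if not self.identify(cap):
            raise PermissionError("capability not implemented here")
        return self._objects[cap[1]]


def invoke_read(cap):
    # Direct invocation: whoever implements `cap` gets control.
    server, badge = cap
    return server.read(badge)


def identify_then_invoke(trusted, cap):
    """Variant 1: identify first, then operate on the capability."""
    if not trusted.identify(cap):
        raise PermissionError("capability not implemented by trusted server")
    return invoke_read(cap)


def combined_call(trusted, cap):
    """Variant 2: invoke the trusted capability, pass `cap` as argument."""
    return trusted.read_checked(cap)


trusted = Server()
rogue = Server()
good = trusted.create("data")
bad = rogue.create("trap")
```

In the second variant the caller never invokes the passed capability at
all, which is why it serves purely as an authentication handle there.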

Here maybe you will find something reasonable: In the Hurd design we
did, we never actually "use" the previously untrusted capability
directly.  We only use it as an authentication handle.  This is merely
a syntactical difference, though.  The design pattern is exactly the
same.

> >1. Is the design pattern actually desirable?
> >
> I don't know yet. We did not yet design / build a system of a scale 
> which would allow us to answer this question. We are right now looking 
> into such a system and we would welcome any contributions and a detailed 
> understanding of your envisaged system.

Fair enough.  I developed the protocols from scratch (in cooperation
with Neal Walfield), based on what we wanted to do in the Hurd.  On
this basis alone I would have little confidence that we are on the
right track.  But in hindsight I found similar design patterns in
Shapiro's work (EROS), which has a much better analytical foundation
than our work.  I submit the emergence of the same design pattern in
two independent systems as one piece of evidence for its usefulness :)

> >2. If yes, can it be _efficiently_ supported by the L4 architecture
> >   with a reasonable effort?
> >
> Since the answer to the first question is a don't know we should not 
> talk about efficiency yet.

From that perspective, I agree.

> If a system structure cannot efficiently be 
> supported by L4 we need to understand why and if it turns out to be the 
> kernel which causes the inefficiency we will change it. Do you have any 
> numbers for your architecture where you find L4 to be the bottleneck 
> (assuming the functions you need were added).

No, because we halted the implementation for a reevaluation of the
design decisions we made.  The major reason we believe that
performance will be inadequate is at this point not primarily the
cmp operation, but the lack of a copy operation for mappings.
If you look at my protocols, this imposes additional IPCs and
system calls in the RPC path for every capability that should be
copied from one process to another.  As capability copy is expected to
be ubiquitous, this is a discouraging result.

If I assume that a copy operation is implemented, and a cmp
operation is available, then that might be an opportunity to have
another look.  However, from a purely aesthetic design-intuition
point of view, cmp() looks like a band-aid to me.  There are two
concerns and then a bit: To actually use cmp() efficiently, the client
needs to supply a guess as to which object it has supplied the
capability for.  Otherwise the server can't find it in O(1).  Not a big
deal, but it's cumbersome and increases the message size, and makes
multiplexing more complex.  Also, Espen argued convincingly that you
really want to have the translation happen on the IPC path, and not
after receiving the message, to reduce system call overhead.  Because
of this I found the ID objects Espen proposed a more convincing
approach.  However, the devil was in the detail, and there were a
couple of important optimizations and small feature extensions
required to make it all fit together (for example for efficient
"return capabilities" we would want a new map item which would
auto-destruct on the next mapping---things like that).
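
To make the point about the guess concrete, here is a small Python
sketch.  It is only a model under stated assumptions: cmp() is simulated
by an identity test on opaque objects, and ObjectTable, lookup_scan and
lookup_hint are hypothetical names, not part of any L4 interface.

```python
def cmp(a, b):
    """Stand-in for the proposed kernel cmp(): same object or not."""
    return a is b


class ObjectTable:
    """Toy server-side table of known capabilities (hypothetical)."""

    def __init__(self):
        self._slots = []    # slot index -> (capability, object)
        self.cmp_calls = 0  # counts simulated cmp() invocations

    def add(self, cap, obj):
        self._slots.append((cap, obj))
        return len(self._slots) - 1  # index the client may send as a hint

    def _cmp(self, a, b):
        self.cmp_calls += 1
        return cmp(a, b)

    def lookup_scan(self, cap):
        """Without a hint: compare against every known capability."""
        for known, obj in self._slots:
            if self._cmp(known, cap):
                return obj
        return None

    def lookup_hint(self, cap, hint):
        """With a client-supplied guess: a single cmp() confirms it."""
        if 0 <= hint < len(self._slots):
            known, obj = self._slots[hint]
            if self._cmp(known, cap):
                return obj
        return None  # wrong or forged hint: reject (or fall back to a scan)


table = ObjectTable()
caps = [object() for _ in range(100)]
hints = [table.add(cap, f"obj{i}") for i, cap in enumerate(caps)]
```

The hint does not need to be trusted, since cmp() still decides, but it
has to travel in every message, which is the extra size and multiplexing
burden mentioned above.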

To accept any of this however, you need to come a certain way along
with us, accepting some of our assumptions, at least provisionally.  In
return, I would eagerly listen to alternative proposals that don't
give up on the general goal (a protected capability system that allows
secure operating systems to be implemented).

It has been some months since I last looked at all this, but it's
all slowly coming back to me as if it was yesterday.  Now I'll shut up
and wait for your questions. :)

