Comparing IPC and capability invocation

Jonathan S. Shapiro shap at
Thu Dec 11 18:57:05 CET 2003

On Mon, 2003-12-08 at 12:31, Hermann Härtig wrote:
> Jonathan S. Shapiro wrote:

> >>(I see no benefit though of restricting communication to "calls on 
> >>objects". User-level SW of course can restrict communication to such 
> >>patterns, but why should a micro-kernel enforce such restriction? 
> > 
> > 
> > It is not a restriction. It is a generalization. The behavior of L4 with
> > local names is a strict subset of the behavior of EROS start
> > capabilities. See below in my comments about compatibility.
> This I do not understand (-> note for your tutorial in Dresden).

By all means we should discuss, but let me attempt to clarify.

ALL communications are invocations on objects. The only questions that
exist in principle are:

   1. What restrictions are imposed on the TYPE of object that can be
      invoked.

   2. Is the object namespace extensible by user-mode code. That is,
      can user-mode servers present objects or interfaces that appear
      to the invoker to be "first class" in the same sense that kernel
      supported objects are first class.

L4 imposes the restriction that the only invocable object type is
"process" (or in some cases thread).

L4 does *not* (today) provide means to allow a server to extend the
object name space.

My first point is probably self-explanatory, so I will expand only on
the second.

In L4, if a client wishes to perform an operation on a file, the "name"
of the file must be passed as an argument to an IPC. The invocation is
something like:

   file_server->invoke(file-id, operation-id, ... other args ...)

Because "file" is not a kernel-supported object, the protocol mandates
that the sender provide an additional argument in the IPC invocation. In
the EROS philosophy, we would argue that these objects are therefore
"second class" and that this is bad for several reasons:

   1. The invoker should not know the server identity. That should
      be known only to the file object.
   2. It is difficult to transparently virtualize objects when their
      invocation patterns are different.
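
A minimal sketch of the L4-style pattern described above, written in Python as a toy model. Every name in it (FileServer, ipc, the ids 42 and 7) is hypothetical and stands in for the real kernel path. It only shows that the client has to name both the server and the file:

```python
# Toy model of the L4-style pattern; all names here are hypothetical.

class FileServer:
    """User-level server that owns the files. "file" is not a kernel object."""

    def __init__(self):
        self.files = {7: b"hello"}  # file-id -> contents

    def invoke(self, sender_id, file_id, operation_id, *args):
        # sender_id models the unforgeable sender id supplied by the kernel.
        # file_id is plain message data chosen by the client.
        if operation_id == "read":
            return self.files[file_id]
        raise ValueError("unknown operation")


def ipc(sender_id, server, *message):
    """Stand-in for the kernel IPC path: delivers the message with the sender id."""
    return server.invoke(sender_id, *message)


file_server = FileServer()
# The client must know which server to address AND which file id to pass.
result = ipc(42, file_server, 7, "read")
```

Here the server identity and the file id both appear in client code, which is the "second class" property objected to above.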

Next problem:

The server must then run some function:

	get_permissions(sender-id, file-id) -> permissions

to determine what operations are permitted. Note that if this operation
is performed faithfully and correctly, it is impossible to emulate
correctly the behavior of the UNIX I_SENDFD socket operation without
many additional calls to a shared service -- the design of the operation
makes descriptor transfer an inherently expensive operation.
Descriptors, which *can* be used as a foundation for certain kinds of
security, suddenly become extremely inefficient because they cannot be
passed without consulting a third party.
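
To make the cost concrete, here is a small Python sketch under stated assumptions: the permission table, the grant() call, and the acl_calls counter are all invented for illustration and are not L4 interfaces. It models a server that checks (sender-id, file-id) on every call, so handing a descriptor to another process requires an extra call that updates the table:

```python
# Toy model; the table layout and grant() are assumptions for illustration.

class AclFileServer:
    def __init__(self):
        self.files = {7: b"hello"}
        self.acl = {(42, 7): {"read"}}  # (sender-id, file-id) -> permissions
        self.acl_calls = 0              # counts extra calls needed for transfers

    def get_permissions(self, sender_id, file_id):
        return self.acl.get((sender_id, file_id), set())

    def invoke(self, sender_id, file_id, operation_id):
        if operation_id not in self.get_permissions(sender_id, file_id):
            raise PermissionError("operation not permitted for this sender")
        if operation_id == "read":
            return self.files[file_id]
        raise ValueError("unknown operation")

    def grant(self, sender_id, file_id, recipient_id, permissions):
        # Passing a descriptor means asking the table owner to record it.
        self.acl_calls += 1
        if not permissions <= self.get_permissions(sender_id, file_id):
            raise PermissionError("cannot grant more than the sender holds")
        self.acl.setdefault((recipient_id, file_id), set()).update(permissions)


server = AclFileServer()


def try_read(sender_id):
    try:
        return server.invoke(sender_id, 7, "read")
    except PermissionError:
        return None


before = try_read(43)              # process 43 holds nothing yet
server.grant(42, 7, 43, {"read"})  # the transfer is an extra call
after = try_read(43)
```

In this model the sender cannot simply hand the descriptor to the recipient in a message; the party that owns the permission table must be involved in every transfer.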

I do not argue that descriptors are the only form of security. I argue
(from experience) that they are one important kind, and that a good
microkernel design should not intrinsically penalize them.

In EROS, the corresponding invocation would be:

   file_capability->invoke(operation-id, ... other args ...)

That is, the fact that "file" is not a kernel-supported object does not
change the invocation.
The "file_capability" contains within it a pair of the form:

	(process-id, server-defined-bits)

The server-defined-bits portion cannot be examined by the client. It is
provided to the server during invocation. The server can interpret these
bits in any way desired: as an object id, as a facet id, as permission
bits, as some mix of these.

The presence of these bits does not preclude invocation of the server
qua server; the server merely assigns to itself an arbitrarily chosen
value of "server-defined-bits" to name the server itself.
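
The following Python sketch illustrates the (process-id, server-defined-bits) idea. It is a toy model, not the EROS interface: the Kernel class, the integer handles, the bit layout (low bit for write permission, remaining bits for an object id), and the choice of object id 0 to name the server itself are all assumptions made for the example.

```python
# Toy model; names and bit layout are hypothetical, not EROS definitions.

class Kernel:
    """Holds capability contents; clients only ever see integer handles.
    (Python cannot enforce the opacity; a real kernel would.)"""

    def __init__(self):
        self._caps = []  # handle -> (server, server_defined_bits)

    def make_start_capability(self, server, server_defined_bits):
        self._caps.append((server, server_defined_bits))
        return len(self._caps) - 1

    def invoke(self, handle, operation_id, *args):
        server, bits = self._caps[handle]
        # The bits are delivered to the server, never to the invoker.
        return server.receive(bits, operation_id, *args)


SERVER_SELF = 0  # object id the server assigns to itself
WRITE_BIT = 1


class FileServer:
    def __init__(self):
        self.files = {7: b"hello"}

    def receive(self, bits, operation_id, *args):
        object_id, writable = bits >> 1, bool(bits & WRITE_BIT)
        if object_id == SERVER_SELF:
            if operation_id == "count":
                return len(self.files)
            raise ValueError("unknown server operation")
        if operation_id == "read":
            return self.files[object_id]
        if operation_id == "write":
            if not writable:
                raise PermissionError("capability is read-only")
            self.files[object_id] = args[0]
            return None
        raise ValueError("unknown file operation")


kernel = Kernel()
fs = FileServer()
# Setup code plays the role of the server minting capabilities to itself.
file_ro = kernel.make_start_capability(fs, (7 << 1) | 0)
file_rw = kernel.make_start_capability(fs, (7 << 1) | WRITE_BIT)
fs_itself = kernel.make_start_capability(fs, SERVER_SELF << 1)

kernel.invoke(file_rw, "write", b"updated")
contents = kernel.invoke(file_ro, "read")
file_count = kernel.invoke(fs_itself, "count")
```

The client code never mentions the server or a file id; it holds handles, and a single client can hold a read-only and a read-write handle to the same file at once.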

> >>Dresden's version of an IDL compiler is designed to support arbitrary 
> >>message patterns. User-land "objects" should remain a user-land issue.)
> > 
> > 
> > I agree that user-land objects should largely remain a user-land issue.
> > The issue is that if the user-land server has no efficient means to
> > implement a protected name for a user-land object, significant power and
> > performance are lost.
> In our thinking, the unit of protection is an address space. Why need 
> anything else but unforgable sender ids plus control of IPC via send 
> mappings?

Because these are architecturally insufficient to implement an
efficient, secure, object-based operating system.

> >                             You have not yet proposed a response to
> > my question about how a single sender can simultaneously hold both a
> > logically read-only descriptor and a logically read-write descriptor if
> > there is no way for the server to efficiently differentiate the type of
> > the descriptor that was invoked.
> Two answers:
> 1) This is simple if we - as we discuss in Dresden - associate ids 
> ("Principal Id") with a send mapping (what you call a descriptor). The 
> kernel enforces that a message contains the "Principal Id" as the first 
> part of a message (instead of a sender id based on threads ids).  Then 
> you can associate one id with the "logically read-only descriptor" and 
> another with the "logically read-write descriptor".

I look forward to hearing more about this.

> 2) The scenario is not interesting. If a single sender holds "both a 
> logically read-only descriptor and a logically read-write descriptor" 
> there is no way to stop the sender from using the descriptor granting more 
> rights. The descriptor cannot base anything on the knowledge which of 
> the two descriptors has been used.

This is correct, but it is an insufficient understanding. If we imagine
that the sender is compromised, you are correct that the sender is in a
position to use the most powerful descriptor that it holds.

However, there exist clients that must manage many sources of authority
at once. Such a client *must* have means to explicitly designate which
authority it is using at each instant. For example, the client may
possess read-write permission to some file, but it may be unwilling to
use this while executing its current operation.

Our experience has been that relying on such clients to specify the
intended operation is not robust. The flow of permissions in complex
programs is not well localized, and it is very easy to write a
subroutine designed for one purpose that does some mildly dangerous
thing and then call it (by programmer error) in the middle of some
sequence of code where care is required.

Tying permissions to the object descriptor does not prevent the
programmer from passing the wrong descriptor, but it does help a great
deal in localizing the scope of programmer attention that is required to
resolve these problems.
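
A short Python sketch of the failure mode and how a permission-carrying descriptor contains it. The File and Descriptor classes and the two helper functions are hypothetical; the point is only that the permission travels with the descriptor that was passed, not with the identity of the caller:

```python
# Toy model; all names are hypothetical.

class File:
    def __init__(self, data):
        self.data = data


class Descriptor:
    """A descriptor that carries its own permission."""

    def __init__(self, file, writable):
        self._file = file
        self._writable = writable

    def read(self):
        return self._file.data

    def write(self, data):
        if not self._writable:
            raise PermissionError("read-only descriptor")
        self._file.data = data


def normalize(desc):
    """Helper written for another purpose: it rewrites the file in place."""
    desc.write(desc.read().upper())


def report(desc):
    """Code path that should only read, but calls normalize() by mistake."""
    normalize(desc)
    return len(desc.read())


log = File(b"audit trail")
log_ro = Descriptor(log, writable=False)
log_rw = Descriptor(log, writable=True)  # the client also holds this one

try:
    report(log_ro)  # the careful code path is handed only the read-only descriptor
    outcome = "modified"
except PermissionError:
    outcome = "blocked"
```

The client holds both descriptors, yet the mistaken write fails because the call site chose which authority to pass. Passing log_rw by mistake would still succeed, which is the limitation acknowledged above.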

> > Thus: I would (provisionally) prefer a microkernel in which there is an
> > 'extra word' in the "recipient descriptor" (the replacement for
> > thread-id), and in which the unique sender identity is set by the thread
> > pager (which I assume is trusted).
> The "extra word" is the proposed "Principal Id" in the send mapping. The 
> pager is trusted with regard to the id space given to it. For example, 
> if a pager has send mappings with Principal Id "2" it can map only 
> forward mappings that begin with "2". Especially it can restrict the 
> name space of the next pager "25".

This is a good start, but our experience is that hierarchical
restrictions of this form do not map well to the real problem space.
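
For reference, the prefix rule in the quoted proposal can be sketched as follows. This is a Python toy model that assumes Principal Ids are digit strings and that "begins with" means a string prefix; the Pager class and map_forward() are hypothetical names, not part of any L4 specification.

```python
# Toy model of the hierarchical rule; names and id encoding are assumptions.

class Pager:
    def __init__(self, principal_id):
        self.principal_id = principal_id

    def map_forward(self, child_id):
        # A pager may only hand out ids inside its own id space.
        if not child_id.startswith(self.principal_id):
            raise PermissionError(
                f"{child_id!r} is outside id space {self.principal_id!r}"
            )
        return Pager(child_id)


def allowed(pager, new_id):
    try:
        pager.map_forward(new_id)
        return True
    except PermissionError:
        return False


root = Pager("2")
child = root.map_forward("25")         # "2" restricts the next pager to "25"
grandchild = child.map_forward("251")  # "25" can delegate only within "25..."
```

Under this rule authority can only narrow along the pager tree, which is the hierarchical shape referred to above.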

> Attention, Jonathan! You are currently exposed to several different 
> lines of ideas on how to overcome the current limitations of L4. This 
> must be very confusing ...

Thanks for the warning. Yes, it is sometimes confusing. I do understand
that many ideas are "in the air."

