Re: Comparing IPC and capability invocation

11 Dec 2003

      On Mon, 2003-12-08 at 12:31, Hermann Härtig wrote:
...
Jonathan S. Shapiro wrote:
...
...
...
(I see no benefit though of restricting communication to "calls on 
objects". User-level SW of course can restrict communication to such 
patterns, but why should a micro-kernel enforce such restriction?
It is not a restriction. It is a generalization. The behavior of L4 with
local names is a strict subset of the behavior of EROS start
capabilities. See below in my comments about compatibility.
This I do not understand (-> note for your tutorial in Dresden).
By all means we should discuss, but let me attempt to clarify.

ALL communications are invocations on objects. The only questions that
exist in principle are:

   1. What restrictions are imposed on the TYPE of object that can be
      invoked?

   2. Is the object namespace extensible by user-mode code? That is,
      can user-mode servers present objects or interfaces that appear
      to the invoker to be "first class" in the same sense that
      kernel-supported objects are first class?

L4 imposes the restriction that the only invocable object type is
"process" (or in some cases thread).

L4 does *not* (today) provide means to allow a server to extend the
object name space.

My first point is probably self-explanatory, so I will expand only on
the second.

In L4, if a client wishes to perform an operation on a file, the "name"
of the file must be passed as an argument to an IPC. The invocation is
something like:

   file_server->invoke(file-id, operation-id, ... other args ...)

Because "file" is not a kernel-supported object, the protocol mandates
that the sender provide an additional argument in the IPC invocation. In
the EROS philosophy, we would argue that these objects are therefore
"second class" and that this is bad for several reasons:

   1. The invoker should not know the server identity. That should
      be known only to the file object.
   2. It is difficult to transparently virtualize objects when their
      invocation patterns are different.

Next problem:

The server must then run some function:

	get_permissions(sender-id, file-id) -> permissions

to determine what operations are permitted. Note that if this operation
is performed faithfully and correctly, it is impossible to emulate
correctly the behavior of the UNIX I_SENDFD socket operation without
many additional calls to a shared service -- the design of the operation
makes descriptor transfer an inherently expensive operation.
Descriptors, which *can* be used as a foundation for certain kinds of
security, suddenly become extremely inefficient because they cannot be
passed without consulting a third party.

I do not argue that descriptors are the only form of security. I argue
(from experience) that they are one important kind, and that a good
microkernel design should not intrinsically penalize them.
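
As a rough illustration of the L4-style pattern described above, here
is a minimal Python sketch. The class name, the permission bits, and
the transfer() call are assumptions made for the example; none of them
is an actual L4 interface. It only models the point that the file name
travels as message data, and that passing a descriptor to another
process requires a call to the party that holds the permission table.

```python
from dataclasses import dataclass, field

READ, WRITE = 1, 2  # illustrative permission bits, also used as operation ids


@dataclass
class L4StyleFileServer:
    # (sender_id, file_id) -> permission bits; backs get_permissions()
    table: dict = field(default_factory=dict)

    def get_permissions(self, sender_id, file_id):
        return self.table.get((sender_id, file_id), 0)

    def invoke(self, sender_id, file_id, operation):
        # sender_id models the unforgeable id supplied by the kernel;
        # file_id is ordinary message data chosen by the client.
        if self.get_permissions(sender_id, file_id) & operation:
            return "ok"
        return "denied"

    def transfer(self, sender_id, file_id, recipient_id, perms):
        # Handing a descriptor to another process is itself a server call:
        # the recipient can do nothing until this table entry exists.
        if perms & ~self.get_permissions(sender_id, file_id):
            return "denied"  # cannot pass on more than the sender holds
        key = (recipient_id, file_id)
        self.table[key] = self.table.get(key, 0) | perms
        return "ok"
```

In this model every descriptor transfer goes through transfer(), which
is the extra round trip to a third party mentioned above.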

In EROS, the corresponding invocation would be:

   file_capability->invoke(operation-id, ... other args ...)

That is, the fact that "file" is not a kernel-supported object does not
change the form of the invocation.
The "file_capability" contains within it a pair of the form:

	(process-id, server-defined-bits)

The server-defined-bits portion cannot be examined by the client. It is
provided to the server during invocation. The server can interpret these
bits in any way desired: as an object id, as a facet id, as permission
bits, as some mix of these.

The presence of these bits does not preclude invocation of the server
qua server; the server merely assigns to itself an arbitrarily chosen
value of "server-defined-bits" to name the server itself.
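
The (process-id, server-defined-bits) pair can be modeled with a small
Python sketch. This is not EROS code: the Kernel and FileServer classes
and the bit layout are hypothetical, and Python cannot actually hide a
field from its holder, so the opacity of the bits is only noted in a
comment. The sketch shows the server decoding the bits as an object id
plus permission bits, with one reserved value naming the server itself.

```python
from dataclasses import dataclass

READ, WRITE = 1, 2   # illustrative permission bits, also used as operation ids
SERVER_ITSELF = 0    # value this server picks to name itself


@dataclass(frozen=True)
class Capability:
    process_id: int
    server_bits: int  # in a real kernel the holder could not read or alter this


class FileServer:
    def make_bits(self, file_id, perms):
        # One possible encoding: object id in the high bits, permissions low.
        # File ids start at 1 here so no file collides with SERVER_ITSELF.
        return (file_id << 2) | perms

    def handle(self, server_bits, operation):
        if server_bits == SERVER_ITSELF:
            return "server operation"
        file_id, perms = server_bits >> 2, server_bits & 3
        return f"ok file {file_id}" if perms & operation else "denied"


class Kernel:
    def __init__(self):
        self.processes = {}

    def invoke(self, cap, operation):
        # The kernel routes on process_id and hands the bits to the server;
        # the client supplied neither a server name nor a file id.
        return self.processes[cap.process_id].handle(cap.server_bits, operation)
```

A single client can hold a read-only and a read-write capability to the
same file under this model; the server tells them apart from the bits
alone, with no per-sender lookup.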
...
...
...
Dresden's version of an IDL compiler is designed to support arbitrary 
message patterns. User-land "objects" should remain a user-land issue.)
I agree that user-land objects should largely remain a user-land issue.
The issue is that if the user-land server has no efficient means to
implement a protected name for a user-land object, significant power and
performance are lost.
In our thinking, the unit of protection is an address space. Why need 
anything else but unforgable sender ids plus control of IPC via send 
mappings?
Because these are architecturally insufficient to implement an
efficient, secure, object-based operating system.
...
...
You have not yet proposed a response to
my question about how a single sender can simultaneously hold both a
logically read-only descriptor and a logically read-write descriptor if
there is no way for the server to efficiently differentiate the type of
the descriptor that was invoked.
Two answers:
1) This is simple if we - as we discuss in Dresden - associate ids 
("Principal Id") with a send mapping (what you call a descriptor). The 
kernel enforces that a message contains the "Principal Id" as the first 
part of a message (instead of a sender id based on thread ids).  Then 
you can associate one id with the "logically read-only descriptor" and 
another with the "logically read-write descriptor".
I look forward to hearing more about this.
...
2) The scenario is not interesting. If a single sender holds "both a 
logically read-only descriptor and a logically read-write descriptor" 
there is no way to stop the sender from using the descriptor granting more
rights. The descriptor cannot base anything on the knowledge of which of
the two descriptors has been used.
This is correct, but it is an insufficient understanding. If we imagine
that the sender is compromised, you are correct that the sender is in a
position to use the most powerful descriptor that it holds.

However, there exist clients that must manage many sources of authority
at once. Such a client *must* have means to explicitly designate which
authority it is using at each instant. For example, the client may
possess read-write permission to some file, but it may be unwilling to
use this while executing its current operation.

Our experience has been that relying on such clients to specify the
intended operation is not robust. The flow of permissions in complex
programs is not well localized, and it is very easy to write a
subroutine designed for one purpose that does some mildly dangerous
thing and then call it (by programmer error) in the middle of some
sequence of code where care is required.

Tying permissions to the object descriptor does not prevent the
programmer from passing the wrong descriptor, but it does help a great
deal in localizing the scope of programmer attention that is required to
resolve these problems.
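
A small Python sketch of this point follows. The Descriptor class and
the two routines are invented for the illustration and do not
correspond to any EROS or L4 interface. A client holds both a read-only
and a read-write descriptor for the same object, and a routine that was
meant only to read calls a helper that writes.

```python
READ, WRITE = 1, 2  # illustrative permission bits


class Descriptor:
    """Hypothetical descriptor: a reference to an object plus permissions."""

    def __init__(self, store, perms):
        self._store = store
        self._perms = perms

    def read(self):
        if not self._perms & READ:
            raise PermissionError("descriptor does not permit read")
        return list(self._store)

    def write(self, item):
        if not self._perms & WRITE:
            raise PermissionError("descriptor does not permit write")
        self._store.append(item)


def append_log_entry(desc, entry):
    # Helper written for some other purpose; it modifies the object.
    desc.write(entry)


def report_size(desc):
    # Meant to be read-only, but calls the writing helper by mistake.
    append_log_entry(desc, "size requested")
    return len(desc.read())
```

Passing the read-only descriptor to report_size() makes the stray write
fail at once. The programmer can still pass the wrong descriptor, but
the choice is made at the call site, which is the one place that needs
careful review.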
...
...
Thus: I would (provisionally) prefer a microkernel in which there is an
'extra word' in the "recipient descriptor" (the replacement for
thread-id), and in which the unique sender identity is set by the thread
pager (which I assume is trusted).
The "extra word" is the proposed "Principal Id" in the send mapping. The 
pager is trusted with regard to the id space given to it. For example, 
if a pager has send mappings with Principal Id "2", it can forward only
mappings whose ids begin with "2". In particular, it can restrict the
name space of the next pager to "25".
This is a good start, but our experience is that hierarchical
restrictions of this form do not map well to the real problem space.
...
Attention, Jonathan! You are currently exposed to several different 
lines of ideas on how to overcome the current limitations of L4. This 
must be very confusing ...
Thanks for the warning. Yes, it is sometimes confusing. I do understand
that many ideas are "in the air."

shap