Re: IPC/Capabilities Overview

31 Dec 2003

      Rudy:

Sorry for the delay in reply.

First, some corrections on specific statements that you made, because
they are potentially distracting from the real issues. Then I will try
to answer your questions.

Corrections:

1. L4 IPC speed has relatively little to do with the "direct lookup"
aspect of thread ids. Any indirect encoding will carry a cost, but in
terms of the overall performance of IPC this cost is quite small.

2. Security checks done in the server vs. the kernel are not necessarily
slower or faster. It depends greatly on what security checks you wish to
do. I argue that:

   a) *many* (not all) of the security checks currently done in L4
      servers could be eliminated if kernel-protected bits existed
      in each descriptor

   b) For some types of systems (capability systems), disclosure of
      the sender id is an absolute violation of design requirements,
      so any microkernel that relies exclusively on server-side
      security checks based on sender-id is not a universal microkernel.

   c) More specifically, any microkernel that requires checks based
      on sender-id is entirely unsuitable as a platform for EROS-NG.

Now to answer your questions:

There are really two things being controlled:

  1. Who can *send* to a given server
  2. How the server determines what actions a sender is permitted
     to invoke.

Ignoring clans and chiefs (which we all agree is too expensive and
inflexible), here is how the three schemes break down:

Thread IDs:
   No restriction on who can send.
   Server makes decisions based on sender-id.

Mapped Threads (Virtual Thread Objects):
   Sender can only invoke a thread that is mapped in their thread
      space.
   Server still makes most decisions based on sender-id.

Capabilities:
   Sender can only invoke a thread descriptor that is mapped in their
      thread descriptor space (thread space)
   Server makes decisions based on a field that is encoded within the
      descriptor. No sender-id is transmitted.

And then, here is the hybrid mechanism that I have proposed:

Hybrid:
   Sender can only invoke a thread descriptor that is mapped in their
      thread descriptor space (thread space)
   Server makes decisions based on either (a) a field that is encoded
      within the descriptor, or (b) the sender-id.

   ** Sender-id is software controlled by the thread manager, and can
      be set to zero for all threads to simulate capability behavior.
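
A minimal C sketch of how a server might make the hybrid decision. Everything
in it is an assumption for illustration: the invocation_t layout, the
permission bits, the hybrid_permitted() helper and the only_thread_42()
policy are hypothetical names, not part of L4 or EROS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { PERM_READ = 1u, PERM_WRITE = 2u };

/* Hypothetical view of what the server receives on each invocation. */
typedef struct {
    uintptr_t descriptor_field; /* kernel-protected bits from the descriptor */
    uintptr_t sender_id;        /* set by the thread manager; 0 for all
                                   threads when simulating capabilities */
} invocation_t;

/* A server that prefers sender-id checks supplies a policy function. */
typedef bool (*sender_policy_fn)(uintptr_t sender_id, unsigned needed);

/* Example policy (hypothetical): only thread 42 may do anything. */
static bool only_thread_42(uintptr_t sender_id, unsigned needed) {
    (void)needed;
    return sender_id == 42;
}

/* Decide by sender-id when one is present and a policy was supplied,
   otherwise by the bits encoded in the descriptor. */
static bool hybrid_permitted(const invocation_t *inv, unsigned needed,
                             sender_policy_fn by_sender) {
    if (inv->sender_id != 0 && by_sender != NULL)
        return by_sender(inv->sender_id, needed);
    return (inv->descriptor_field & needed) == needed;
}
```

Passing NULL for by_sender gives pure capability behavior: the decision
depends only on the descriptor field, whatever the sender-id is.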
...
* In a system that requires security the [virtual thread] and [capability]
methods are actually faster because only a single, simple kernel check
needs to be made against several complex user checks for [thread-IDs]
(is this really true? comments please).

I think this is true, but it is not so straightforward.

In the thread-IDs design, there are two distinguished phases in the
server-side security checks:

   1. Object resolution. Based on sender-id and arguments, determines
      the identity and permissions of the server-implemented object
      that has been invoked. This phase may conclude that no such
      object exists.

   2. Permissions check. Given the object identity and permissions,
      make a decision about whether the particular operation is to be
      permitted.

All of the bits needed for phase 2 can be encoded in the descriptor. All
of the expensive parts of the current L4 protocol lie in phase 1 (object
resolution).
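
The two phases in the thread-IDs design can be sketched in C as follows. The
grant table, its contents and the function names are hypothetical; a real
server would use whatever lookup structure suits it, and the linear scan here
only stands in for that lookup cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { PERM_READ = 1u, PERM_WRITE = 2u };

/* Hypothetical server-side record: which sender may use which object. */
typedef struct {
    uint32_t sender_id;
    uint32_t handle;  /* object name passed as a message argument */
    int      object;  /* index of the server-implemented object */
    unsigned perms;
} grant_t;

static const grant_t grants[] = {
    { 7, 1, 0, PERM_READ },
    { 7, 2, 1, PERM_READ | PERM_WRITE },
    { 9, 1, 0, PERM_READ | PERM_WRITE },
};

/* Phase 1: object resolution from sender-id and arguments.
   Returns NULL when no such object exists for this sender. */
static const grant_t *resolve(uint32_t sender_id, uint32_t handle) {
    for (size_t i = 0; i < sizeof grants / sizeof grants[0]; i++)
        if (grants[i].sender_id == sender_id && grants[i].handle == handle)
            return &grants[i];
    return NULL;
}

/* Phase 2: permissions check on the resolved object. */
static bool permitted(unsigned perms, unsigned needed) {
    return (perms & needed) == needed;
}

static bool check_thread_id(uint32_t sender_id, uint32_t handle,
                            unsigned needed) {
    const grant_t *g = resolve(sender_id, handle);
    return g != NULL && permitted(g->perms, needed);
}
```

With the object identity and permission bits carried in the descriptor,
resolve() has nothing left to look up, and only the cheap permitted() test
remains on the server side.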
...
Now I want to make clear why capabilities are much better than virtual thread
objects:
* The extra word does not seem to decrease performance in any way (is this
true?) so it is a free feature that can be used but doesn't have to be.

I believe that this is true, and the evidence of the EROS implementation
seems to support this view.
...
* The server defined word will probably be used to store a pointer to some
client specific data structure containing important information that needs
to be accessed often. You might say, yeah well but you can calculate this
address from the sender ID, but this no longer works when clients start to
grant and map server objects/capabilities to each other, because the server
doesn't know about these actions, unless some complex, slow protocol is used
to update the server's information.

This is a possible usage. In our experience, the more common behavior is
to have a pointer to some data structure that describes a server
implemented object (i.e. has nothing to do with any particular client),
and reuse the low bits for permissions. For example, the pointer might
point to a file metadata structure, and the low bits might indicate read
and write permissions.
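
A minimal C sketch of that encoding, assuming the metadata structure is
aligned to at least 4 bytes so that the low two bits of its address are
always zero. The file_meta_t type and the helper names are hypothetical, and
the pointer/integer round trip relies on the usual flat-address behavior of
uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

enum { PERM_READ = 1u, PERM_WRITE = 2u, PERM_MASK = 3u };

/* Hypothetical server-implemented object: per-file metadata. */
typedef struct {
    uint64_t size;
    uint32_t owner;
} file_meta_t;

/* The low two address bits must be free for the permission bits. */
_Static_assert(_Alignof(file_meta_t) >= 4, "need two spare low bits");

/* Build the server-defined word: object pointer plus permission bits. */
static uintptr_t pack_word(const file_meta_t *meta, unsigned perms) {
    uintptr_t p = (uintptr_t)meta;
    assert((p & PERM_MASK) == 0);
    return p | (perms & PERM_MASK);
}

/* Recover the object pointer by masking off the permission bits. */
static const file_meta_t *word_object(uintptr_t w) {
    return (const file_meta_t *)(w & ~(uintptr_t)PERM_MASK);
}

/* Recover the permission bits. */
static unsigned word_perms(uintptr_t w) {
    return (unsigned)(w & PERM_MASK);
}
```

On each invocation the server gets both the object and the permissions from
one word, with no lookup keyed on the client.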
...
The size of the capability should be equal to the pointer/word size of the
machine.

This can't be true, since the capability must encode both the
destination thread id and also the extra word. Minimum practical size is
two machine words.
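
As a sketch of the size argument, assuming a machine word is represented by
uintptr_t and that a thread id fills one word (the capability_t name and
field names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word capability layout. */
typedef struct {
    uintptr_t dest_thread; /* destination thread id */
    uintptr_t server_word; /* the extra server-defined word */
} capability_t;

/* Two word-sized fields cannot fit in a single machine word. */
_Static_assert(sizeof(capability_t) == 2 * sizeof(uintptr_t),
               "capability occupies exactly two machine words");
```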

shap

Jonathan S. Shapiro