[I apologize in advance for both the length and the depth of this note.]
One of the long-term design differences between L4 and EROS is something I think of as the "threads vs. objects" distinction. In L4, all invocations are IPC invocations performed on thread ids. In EROS, all invocations are capability invocations performed on capability indexes. In the current EROS implementation, the capability indexes are an offset into a per-process capability table.
In my opinion, the original "name by thread-id" design in L3/L4 had two architectural issues:
1. It exposed the location of the target thread within the thread table. Because of the tight relationship between threads and tasks, this had the side effect of exposing potentially significant information about the internal structure of the recipient. It also made relocating a thread within the table challenging, because the thread's "address" was sitting in data space in every client, where one could not update it.
My opinion was that the thread-id was effectively an absolute address, and that it should have been a virtual address that was decoded w.r.t. a sender-defined name space.
2. Because the thread ids and task ids were inter-related, there seemed to be a number of awkward restrictions on the allocation of task structures. In particular, it was necessary to know in advance how many threads (at most) were going to be required, and it was possible for fragmentation to prevent successful allocation even when the necessary number of thread entries was available.
This seemed unnecessary. In addition, I thought it would be better if an address space could exist without any threads at all, and conversely if a thread could exist (though not execute) without any address space. From the EROS experience, I felt that non-process address spaces were useful. I'm not sure that threads without address spaces are useful, but I do believe that threads and address spaces are separable abstractions, and that it is good to keep them so.
From various conversations, I believe that the L4 team has addressed the
second issue, and is considering a number of solutions to the first, most notably "thread address spaces". In this approach, the current thread id is replaced by an offset (or address) into the thread address space, and an application supplies this thread address instead of the thread id of the previous design. If I understand matters correctly, this idea has not yet been incorporated into the current L4 implementation but there seems to be agreement that someday it should be.
I may have misunderstood, but I also believe that the power of two restriction on task allocation has been removed, and that any thread can now be associated with any task.
As long as there is no operation allowing a sender to *read* the real thread id out of the thread address space, then the current plan for L4+thread-spaces would implement something very close to the future EROS design. There are two differences:
1. The current EROS design is expressed in terms of capability registers. We have decided that this should be replaced with a capability address space. Conceptually, this capability address space serves the same role in our architecture that the thread address space serves in Ln.
2. The elements indexed by this address space are capabilities in EROS, but thread-ids in L4. This is the heart of why I think of this as the "threads vs. objects" distinction.
One way to think about this [the motivation will become clear in a moment] is that the L4 thread address space is actually a descriptor address space, but that all of the descriptors in this space are restricted to be thread descriptors. Since this appears to be the *only* remaining difference in this area of the Ln and EROS designs, I'd like to make a case, based on the EROS experience, for why descriptors should be considered seriously in a future system.
First, switching to descriptors should not significantly change IPC performance. A descriptor is probably a larger data structure than a thread id, but in both designs there is a load instruction at the end that places the target thread address into a register; the difference is only that the thread structure pointer might live at a different offset.
Second -- and here I must draw for a moment from the EROS experience -- unifying all invocations behind a single descriptor abstraction has been incredibly useful in the EROS system. As with L4, we have a generalized invocation mechanism, but we can apply this mechanism to arbitrary objects.
Introducing a generalized descriptor notion into the architecture does NOT necessarily imply a large increase in kernel size. When Leendert van Doorn and I were considering kernel design at IBM, we concluded that most of the resulting kernel descriptor types can be served by well-known user-mode servers if desired. The decision about whether or not to put support for these object types in the kernel becomes purely an engineering decision. The EROS implementation chose to place these implementations in the kernel, and in one or two cases this is probably essential, but in general it is NOT necessary.
The difference between a thread-id and a descriptor is that a descriptor takes the form
(resource-type-code, resource-name, type-specific-data-word)
where a resource-type-code is assigned for each kernel resource that is exported by means of a descriptor. Borrowing from EROS, the descriptor types that seem likely to be useful are:
void     -- descriptor that responds with "unknown" to all requests
flexpage -- name of a flexpage
thread   -- a thread-id (interface to the process abstraction)
object   -- a particular object served by some thread
In flexpages, the type-specific field holds permissions. In objects, it holds an object id. In threads, it might be unused.
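To make the shape concrete, here is a minimal C sketch of the descriptor triple just described. The type codes, field names, and the void-descriptor behavior are invented for illustration; they are not the actual EROS definitions.

```c
#include <assert.h>

enum resource_type {
    DESC_VOID,      /* responds with "unknown" to all requests */
    DESC_FLEXPAGE,  /* type-specific word holds permissions */
    DESC_THREAD,    /* type-specific word might be unused */
    DESC_OBJECT     /* type-specific word holds an object id */
};

struct descriptor {
    enum resource_type type;   /* resource-type-code */
    unsigned long      name;   /* resource-name (kernel-protected) */
    unsigned long      data;   /* type-specific-data-word */
};

#define UNKNOWN_REQUEST (-1)

/* Invoking a void descriptor answers "unknown" to every request;
 * real descriptors would dispatch on their type here. */
int invoke(const struct descriptor *d, int request)
{
    if (d->type == DESC_VOID)
        return UNKNOWN_REQUEST;
    return request;   /* placeholder for real per-type dispatch */
}
```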
The EROS implementation also includes linked list pointers so that outstanding descriptors can be efficiently revoked when a resource is destroyed, but this is an implementation detail -- other representations are possible. On revocation, outstanding descriptors become void, but we have sometimes thought that it would be better to change them from "type X descriptor" to "type INVALID X descriptor". This would facilitate debugging.
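A sketch of the list-based revocation just described, with invented names; the real EROS representation differs in detail:

```c
#include <assert.h>
#include <stddef.h>

enum resource_type { DESC_VOID, DESC_THREAD, DESC_INVALID_THREAD };

struct descriptor {
    enum resource_type type;
    struct descriptor *next;   /* chain of all outstanding descriptors
                                  for the same resource */
};

/* Destroying the resource walks the chain and voids every
 * outstanding descriptor in one pass. */
void revoke_all(struct descriptor *head)
{
    for (struct descriptor *d = head; d != NULL; d = d->next)
        d->type = DESC_VOID;   /* or DESC_INVALID_THREAD, which would
                                  keep the old type visible for
                                  debugging */
}
```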
An added word about the 'object' type may be useful, since it reflects a significant philosophical difference between L4 and EROS.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
It is not entirely clear to me whether forged object ids are a serious problem in the absence of persistence, but my instinct is that they are *sometimes* a problem, and that including them is therefore desirable.
The "IPC indirection" design proposed by Trent Jaeger et al. many years ago included a similar uninterpreted word.
shap
Jonathan S. Shapiro wrote:
- The current EROS design is expressed in terms of capability
registers. We have decided that this should be replaced with a capability address space. Conceptually, this capability address space serves the same role in our architecture that the thread address space serves in Ln.
- The elements indexed by this address space are capabilities in EROS,
but thread-ids in L4. This is the heart of why I think of this as the "threads vs. objects" distinction.
I think the EROS capability space and the L4 address spaces are the same; the difference is that the address space -- or better, name space -- in L4 is partitioned per type: i.e., one name space exists for memory (the virtual address space), and a second name space, for threads, is under discussion.
Both spaces contain capabilities, i.e., pointers to objects -- the thread / page frame -- complemented by a set of rights (read, write, execute). The difference lies in how those capabilities themselves are handled:
EROS has the take and grant_e functions (_e to distinguish them from the mapping grant: grant_m); L4 maps and grants capabilities and is able to revoke directly through unmap.
...
First, switching to descriptors should not significantly change IPC performance. A descriptor is probably a larger data structure than a thread id, but in both designs there is a load instruction at the end that places the target thread address into a register; the difference is only that the thread structure pointer might live at a different offset.
Global IDs have the benefit of translating directly to the TCB address (+/- some masking). Local IDs bear the potential for a TLB and a cache miss; hence the smaller the descriptor, the more descriptors fit into a cache line, and performance might be better with smaller descriptors. Note, this is just a guess, and we have to evaluate it, in particular for SMP systems, where false sharing might negatively affect performance.
...
An added word about the 'object' type may be useful, since it reflects a significant philosophical difference between L4 and EROS.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
The question here is whether we have to support unforgeable object IDs in the kernel or whether unforgeable sender IDs as we have now are sufficient. Given an unforgeable sender ID, the invoked object ID can be validated to be accessible by the sender.
How is the object invoker identified in EROS? If I understand EROS right, you send a resume capability with the object invocation and have a compare and get_type operation on the capabilities. Can you make use of this capability to reliably distinguish calling clients?
...
Marcus
On Fri, 2003-12-05 at 10:32, Marcus Völp wrote:
I think the EROS capability space and the L4 address spaces are the same; the difference is that the address space -- or better, name space -- in L4 is partitioned per type: i.e., one name space exists for memory (the virtual address space), and a second name space, for threads, is under discussion.
They are not quite the same for reasons I will explain shortly, but I agree with your point. L4 address spaces are typed.
In EROS, once we move to capability address spaces, we will end up with a rather odd situation. The *data* address space will be typed (leaves must be page capabilities), but the capability address space is not (any capability may appear).
Our experience has been that generalizing the invocation mechanism to all types of objects has been tremendously useful.
EROS has the take and grant_e functions (_e to distinguish them from the mapping grant: grant_m); L4 maps and grants capabilities and is able to revoke directly through unmap.
These operations have similar names, but very very different functions. grant_e is a capability transmission. l4_map establishes a shared region mapping.
However, I believe that these can be unified -- note on this shortly.
First, switching to descriptors should not significantly change IPC performance. A descriptor is probably a larger data structure than a thread id, but in both designs there is a load instruction at the end that places the target thread address into a register; the difference is only that the thread structure pointer might live at a different offset.
Global IDs have the benefit of translating directly to the TCB address (+/- some masking). Local IDs bear the potential for a TLB and a cache miss; hence the smaller the descriptor, the more descriptors fit into a cache line, and performance might be better with smaller descriptors. Note, this is just a guess, and we have to evaluate it, in particular for SMP systems, where false sharing might negatively affect performance.
It is certainly true that local ids will impose a small performance penalty. The size of the leaf, in our experience, doesn't matter -- that cache line won't stay resident anyway. Speaking for myself, I would worry primarily that the descriptor should be cache aligned and that it should not exceed the cache line length.
I'm not sure the local id penalty is as high as you think -- we have some tricks in the EROS implementation that might apply.
However, global ids fail to provide protection in the absence of chiefs. If protection is important, the fair comparison is between chiefs and local ids, not between global ids and local ids. I believe that on balance, the indirection overheads are a reasonable price to pay to avoid the problems of the "receiver checks validity" approach. The problems I am thinking about are:
1. A hostile sender can flood the receiver.
2. Receiver permission checking in user mode is more expensive than one indirection.
I am not aware of any protected id design that avoids indirection entirely.
Ultimately, I believe that the right metric for IPC is not the trap to trap latency, but the latency between the client-side call and the server-side call (the serving procedure). Programs aren't trying to do IPC. They are trying to invoke services. It is the end to end cost that should be minimized.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
The question here is whether we have to support unforgeable object IDs in the kernel or whether unforgeable sender IDs as we have now are sufficient. Given an unforgeable sender ID, the invoked object ID can be validated to be accessible by the sender.
So what you are saying is:
By eliminating a one word copy in the kernel, we can make it possible for the server to do a multi-instruction, multi-TLB-miss, multi-cache-line, multi-branch-stall lookup and validation at user level.
Whether this is a good answer will depend very much on the typical usage pattern. If most servers serve (a) multiple objects or (b) multiple interfaces, then the extra word in the kernel is justified. In our experience, almost *every* server serves multiple interfaces, and would therefore be forced to do validation. It is difficult to know whether this is a case of something that is fundamentally common, or whether our system design is simply optimized for our own unusual situation.
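The cost asymmetry being argued here can be sketched in C (all names hypothetical): with a client-supplied object id the server must validate on every call, while a kernel-copied protected word can be used directly.

```c
#include <assert.h>

#define NOBJ 128
int object_owner[NOBJ];   /* object id -> client allowed to use it */

/* (a) Client-supplied object id: the server must validate it on
 * every call (bounds check, table lookup, branches). */
int dispatch_validated(int sender, unsigned long obj_id)
{
    if (obj_id >= NOBJ || object_owner[obj_id] != sender)
        return -1;                 /* reject forged or foreign id */
    return (int)obj_id;
}

/* (b) Kernel-copied protected word: trustworthy by construction,
 * so the server can dispatch on it directly. */
int dispatch_protected(unsigned long protected_word)
{
    return (int)protected_word;
}
```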
Transmitting sender identity is, in my view, potentially problematic. I need to think more about sender identities, but I believe that interface identities are tremendously important and are absolutely necessary for some of the higher-level systems that I want to build -- even if sender identities are transmitted.
How is the object invoker identified in EROS? If I understand EROS right, you send a resume capability with the object invocation and have a compare and get_type operation on the capabilities. Can you make use of this capability to reliably distinguish calling clients?
In EROS, we have never attempted (and never needed) to distinguish calling clients. Instead, we use capabilities to accomplish the same thing. A client receives a capability that is a
(thread-id, server-defined-field)
pair. The "server-defined-field" is unforgeable because it is kernel protected, and it is included in the outbound message. The field can be used for an object id, for an interface id (e.g. read-only vs. read-write), or some mixture.
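A hypothetical encoding of the server-defined field, showing how it can carry both an object id and an interface distinction such as read-only vs. read-write; the bit layout here is invented for illustration.

```c
#include <assert.h>

/* Invented layout: low bit carries the write right, remaining bits
 * carry the server-chosen object id. The kernel copies this word
 * into the outbound message; the client cannot alter it. */
#define RIGHT_WRITE 0x1ul

unsigned long make_payload(unsigned long obj_id, int writable)
{
    return (obj_id << 1) | (writable ? RIGHT_WRITE : 0);
}

int payload_writable(unsigned long payload)
{
    return (payload & RIGHT_WRITE) != 0;
}

unsigned long payload_object(unsigned long payload)
{
    return payload >> 1;
}
```

With this layout, a single client can hold one capability fabricated with `make_payload(obj, 1)` and another with `make_payload(obj, 0)` to the same object, and the server distinguishes them by which descriptor was invoked.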
It may be necessary to dispatch on sender ID, but in a capability system this would violate sender encapsulation.
Whether or not you are able to dispatch on sender id, doing so is insufficient to deal adequately with permissions. We (the EROS project) frequently need to deal with a situation where the same client holds simultaneously a read-write and a read-only capability to an object implemented by some server. Permissions are determined by which descriptor is invoked. I don't believe that dispatching on sender ID has sufficient power to discriminate this case.
Note that UNIX behaves similarly: a process can simultaneously hold a R/O and a R/W file descriptor to the same file.
Resume capabilities have nothing to do with any of this. They merely provide the server with transient authority to reply. The transience, by the way, is actually important from a security perspective.
shap
[Jonathan S Shapiro]
In my opinion, the original "name by thread-id" design in L3/L4 had two architectural issues:
- It exposed the location of the target thread within the thread table. [...]
- Because the thread ids and task ids were inter-related, there seemed to be a number of awkward restrictions on allocation of task structures. [...]
From various conversations, I believe that the L4 team has addressed the second issue, and is considering a number of solutions to the first, most notably "thread address spaces".
Yes. The second issue has been addressed in the Version X.2 API (i.e., the current L4Ka::Pistachio implementation). The first issue would go away if threads were treated as virtual objects (which is something that we are thinking about).
I may have misunderstood, but I also believe that the power of two restriction on task allocation has been removed, and that any thread can now be associated with any task.
That is right. You can also migrate a thread between tasks (address spaces).
The difference between a thread-id and a descriptor is that a descriptor takes the form
(resource-type-code, resource-name, type-specific-data-word)
where a resource-type-code is assigned for each kernel resource that is exported by means of a descriptor. Borrowing from EROS, the descriptor types that seem likely to be useful are:
void     -- descriptor that responds with "unknown" to all requests
flexpage -- name of a flexpage
thread   -- a thread-id (interface to the process abstraction)
object   -- a particular object served by some thread
In flexpages, the type-specific field holds permissions. In objects, it holds an object id. In threads, it might be unused.
In L4 speak the object that we operate on is a "flexpage". A flexpage specifies a collection of objects in the current address space and treats this collection as a single object. The flexpage also contains a type. Currently there are three types defined:
o Page frame (i.e., memory)
o I/O ports (ia32 only)
o PCI Configuration addresses (ia64 only)
As you already hinted, we (the Karlsruhe group) are also looking into treating threads as virtual objects. Thread IDs would then constitute another flexpage type. The type-specific-data-word for threads would be something like send and receive rights, thread create/delete rights, etc.
Using the new model, an actual address space would then consist of the regular memory address space, the thread space, and possibly the I/O space or PCI configuration space. Anyhow, the ideas about what we actually want are still a bit fuzzy, so you should take what I say with a grain of salt.
An added word about the 'object' type may be useful, since it reflects a significant philosophical difference between L4 and EROS.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
It is not entirely clear to me whether forged object ids are a serious problem in the absence of persistence, but my instinct is that they are *sometimes* a problem, and that including them is therefore desirable.
Believe it or not, we have also internally been discussing/thinking about the issue of unforgeable IDs. But as you say, it is still unclear whether you really *need* to support this within the kernel. I'd really love to see a scenario that proves me wrong here.
eSk
shap: [I apologize in advance for both the length and the depth of this note.] so do I
In Dresden, a group of us has been discussing related issues for a while ...
I have long been convinced:

* that communication should *not* be addressed to threads, or -- in other words -- that a service's thread structure should be completely transparent to clients.

* that all resources exported by the kernel (address spaces, threads, memory, interrupt lines, ...) should have a local name (in today's L4, only memory has local names -> virtual addresses). All types of resources provided by the microkernel can be transferred from a "pager" to a "client" via IPC, just as page faults are handled now in L4 (mapping and flushing). Then all resources are initially owned by sigma0, and L4's elegant pager hierarchies can easily be extended to all (types of) resources. "Mapping data bases" are then needed for all types of resources as well. (We are discussing generalizations of the "sigma0" scheme as well. E.g., each pager might be able to provide a resource to the kernel in exchange for a number of threads, and then own these threads. It could then act as thread pager for its clients. I leave further discussion of that out of this email to avoid confusion.)
We refer to that scheme as "generalized mapping". Here is an example of its usage: a server may discover that it would like to use an extra thread; it simply uses it by its local name ("start 57" or "set registers 57, ..."). If 57 is void (a void capability, in EROS speak), a "thread fault" is detected by the microkernel and transformed into a message to the responsible pager, who may or may not respond by providing a thread mapping.
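The "thread fault" flow just described might be sketched like this in C; the names and the pager policy are invented, and the real kernel/pager interaction would of course go through IPC rather than a function call:

```c
#include <assert.h>
#include <stddef.h>

struct thread { int started; };

struct thread *local_names[64];   /* local name -> thread, or void */

struct thread pager_pool_thread;  /* a thread the pager may hand out */

/* Pager policy: respond to a thread fault by providing a mapping
 * (return 1), or decline (return 0). */
int pager_handle_thread_fault(unsigned name)
{
    local_names[name] = &pager_pool_thread;
    return 1;
}

/* "start 57": if local name 57 is void, the kernel raises a thread
 * fault to the responsible pager instead of failing outright. */
int thread_start(unsigned name)
{
    if (local_names[name] == NULL &&           /* void capability */
        !pager_handle_thread_fault(name))
        return -1;                             /* pager declined */
    local_names[name]->started = 1;
    return 0;
}
```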
(I see no benefit, though, in a "unified capability space". Memory "descriptors" are PTEs.)
IPC provides the means for communication between units of protection. Units of protection in L4 are address spaces. Thus, messages should be sent from one address space to another using local names for address spaces. Such "send mappings" (which could be called "send capabilities" for address spaces) are all owned by sigma0 and can be mapped and flushed. For example, sending a message to 57 results (if the mapping is void) in a "send mapping fault" to the responsible pager, who may or may not respond by providing a send mapping. Arbitrary message interception structures can be established that way. There is no need to consider the thread structures of services from the point of view of any client, be it an "original" client or an interceptor.
(I see no benefit, though, in restricting communication to "calls on objects". User-level SW can of course restrict communication to such patterns, but why should a micro-kernel enforce such a restriction? Dresden's version of an IDL compiler is designed to support arbitrary message patterns. User-land "objects" should remain a user-land issue.)
The question now is: how does a receiver identify the sender? J.S. says: no need to identify, since there are capabilities. I disagree here: relying just on capabilities leads to much too fine-grained clustering of protection units. From my point of view, we need to enable protection units to securely find out from which clients messages came in or -- in other words -- to serve clients with fairly different rights.
In Dresden, we discussed many schemes by which a receiver can identify a sender. Here are my two current favorites (alternatives):

* Identification via local names for "ports": each address space (unit of protection) has several ports. Ports are entrance points into the address space; threads of an address space wait for messages on ports; ports are resources provided by the kernel and handled at user level using mapping and flushing operations. A send mapping (for example, the send mapping with local name 57) has the form (address space, port range), where "port range" is an interval in the local names of the addressee. All threads within an address space can wait on all of its ports. Senders are identified via the local name of the port a message comes in on. (Some tricky details of this scheme are omitted here. "Ports" are close to Mach ports; notable differences: synchronous IPC, flushing, ...)

* Identification via "principal IDs" of senders. These are enforced by the kernel (as with physical thread ids in L4), but managed at user level. A send mapping (for example, the send mapping with local name 57) has the form (address space, principal ID: LENGTH, VALUE). IPC may then send out a message M by calling: send (57, M). The kernel assigns "57.VALUE" to the first part of M, where the "first part" has length "57.LENGTH". Again, send mappings are passed from pagers to clients. While doing so, LENGTH may increase, but not decrease. That way, senders further down a pager hierarchy can be restricted in their "principal ID" space or -- in other words -- in what content the first part of their messages may contain. Almost arbitrary user-level naming schemes can be provided. (I omit discussion of the "closed wait" problems that need to be addressed in this scheme.)
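The principal-ID stamping in the second scheme can be sketched as follows (names invented): the kernel overwrites the first LENGTH words of the outgoing message with the mapping's VALUE, so the receiver reads a sender identity the sender cannot forge.

```c
#include <assert.h>

struct send_mapping {
    unsigned length;          /* words the kernel stamps (LENGTH) */
    unsigned long value[4];   /* kernel-held principal id (VALUE) */
};

/* send(57, M): before delivery, the kernel writes VALUE into the
 * first LENGTH words of M; the sender cannot influence them. */
void kernel_send(const struct send_mapping *sm, unsigned long *msg)
{
    for (unsigned i = 0; i < sm->length; i++)
        msg[i] = sm->value[i];
    /* ... deliver msg to the mapping's target address space ... */
}
```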
--hermann
On Sun, 2003-12-07 at 10:10, Hermann Härtig wrote:
shap: [I apologize in advance for both the length and the depth of this note.] so do I
In Dresden, a group of us has been discussing related issues for a while ...
I have long been convinced
- that communication should *not* be addressed to threads, or - in other
words - a service's thread structure should be completely transparent to clients.
We have certainly found this helpful in EROS. If nothing else, it lets us save a system image, replace the kernel, and restart it without logical interruption of execution. I'm not sure how relevant my example is to L4, but there are many reasons why the implementation details of kernel structures should be opaque.
As a middle ground, I would qualify that by saying: opaque with the possible exception that pagers for these structures (if any) will certainly need to know about them.
- that all resources exported by the kernel (address spaces, threads,
memory, interrupt lines, ...) should have a local name (now in L4, only memory has local names -> virtual addresses)...
I believe that we are in agreement, but I want to confirm that I have understood you correctly. I believe you mean:
all resources exported by the kernel (address spaces, threads, memory, interrupt lines, ...) should be referenced by processes using names that are local to the process.
The distinction I am trying to be clear about is this:
Real resources intrinsically have global names. Physical pages, for example, have names that are dictated by the hardware wiring of the memory subsystem. Every thread similarly has a unique global identity.
I think what we are both saying is that such global names for kernel resources should not be revealed outside the kernel. If so, we are in agreement.
Here again I would make a potential exception for pagers. At certain very low levels of the system, there is application code that can and should know something about the mapping from local names to global names within its own context.
All types of resources provided by the micro kernel can be transfered from a "pager" to a "client" via IPC just like page faults are handled now in L4 (mapping and flushing). Then, all resources are initially owned by sigma0 and L4's elegant pager hierarchies can be easily extended to all (types of) resources.
If I understand you correctly, then I agree, but it is important (at least for the moment) to speak very precisely about what is meant by the statement you have made.
The item transferred is not the resource. The resource exists and has unique identity. The item transferred is the (protected) global name of the resource, and the result of the transfer is the establishment in the recipient context of a new name, local to the recipient, that maps to this resource.
Is this what you meant?
"Mapping data bases" are then needed for all types of resources as well.
The mapping model is the single biggest difference between EROS and L4. I would like to defer consideration of extending the mapping database until we agree (or agree to disagree) whether the current L4 mapping model is really the one that we ultimately want. I see definite strengths in the current L4 model, but I also see definite weaknesses.
(I see no benefit, though, in a "unified capability space". Memory "descriptors" are PTEs.)
It is a question of naming. Let me set aside memory descriptors, as this is going to need to be a subject for careful discussion.
The question is: given that there exist local names for all resources, should there be a common invocation mechanism that is used to invoke the interfaces on these resources?
If so, then we want a logically unified capability space, because we want a logically unified notion of local name. On most platforms, the PTE is not an effective data structure for this purpose.
(I see no benefit, though, in restricting communication to "calls on objects". User-level SW can of course restrict communication to such patterns, but why should a micro-kernel enforce such a restriction?
It is not a restriction. It is a generalization. The behavior of L4 with local names is a strict subset of the behavior of EROS start capabilities. See below in my comments about compatibility.
Dresden's version of an IDL compiler is designed to support arbitrary message patterns. User-land "objects" should remain a user-land issue.)
I agree that user-land objects should largely remain a user-land issue. The issue is that if the user-land server has no efficient means to implement a protected name for a user-land object, significant power and performance are lost.
The question now is: how does a receiver identify the sender? J.S. says: no need to identify, since there are capabilities. I disagree here: relying just on capabilities leads to much too fine-grained clustering of protection units.
Based on 30 years of experience with KeyKOS and EROS, I would say that the evidence does not entirely agree with you.
However, let us be careful what we mean: I am not saying that unforgeable sender identity is bad (though I do believe this). I am saying that it is insufficient. You have not yet proposed a response to my question about how a single sender can simultaneously hold both a logically read-only descriptor and a logically read-write descriptor if there is no way for the server to efficiently differentiate the type of the descriptor that was invoked.
From my point of view, we need to enable protection units to securely find out from which clients messages came in, or - in other words - to serve clients with fairly different rights.
There are system designs for which this is true. There are also system designs for which it is NOT true. I disagree with this statement on three grounds:
1. It favors a particular set of policies -- those based on invoker identity -- in the microkernel. It strongly disfavors pure capability policies.
2. It favors an access control model that has been repeatedly demonstrated in formal analysis to be unsafe.
3. It assumes that clients with "fairly different rights" cannot be effectively served through any means other than a unique client identity. Based on our experience in EROS/KeyKOS this is clearly mistaken.
I will not (and do not) argue that the existing L4 work that builds on sender identity should be discarded. I believe that it is founded on a known unsafe protection model, and I would encourage giving some attention to whether my belief might be correct, but I'm prepared to believe that I may be mistaken, and I believe that there exist systems we wish to build on top of the microkernel that require this function in order to execute efficiently.
However, I DO argue that a good microkernel should not strongly disfavor a protection model that is known to be sound in favor of one that is known to be unsound.
Thus: I would (provisionally) prefer a microkernel in which there is an 'extra word' in the "recipient descriptor" (the replacement for thread-id), and in which the unique sender identity is set by the thread pager (which I assume is trusted).
Given such a design, all of the current L4 servers can continue to execute unmodified by simply ignoring the extra word, and all of the current EROS servers can execute unmodified by ignoring the unique sender ID. In EROS, for reasons of protection, our thread pager would deliberately zero this field.
I will defer comment on the ports issue until I have time to think about it some more.
shap
Jonathan S. Shapiro wrote:
I have been convinced for a long time
- that communication should *not* be addressed to threads, or - in other
words - a service's thread structure should be completely transparent to clients.
...
As a middle ground, I would qualify that by saying: opaque with the possible exception that pagers for these structures (if any) will certainly need to know about them.
Sure. Pagers are not clients.
all resources exported by the kernel (address spaces, threads, memory, interrupt lines, ...) should be referenced by processes using names that are local to the process.
That is what I tried to say.
The item transferred is not the resource. The resource exists and has unique identity. The item transferred is the (protected) global name of the resource, and the result of the transfer is the establishment in the recipient context of a new name, local to the recipient, that maps to this resource.
Is this what you meant?
Yes, thanks for the clarification!
(I see no benefit though of restricting communication to "calls on objects". User-level SW of course can restrict communication to such patterns, but why should a micro-kernel enforce such restriction?
It is not a restriction. It is a generalization. The behavior of L4 with local names is a strict subset of the behavior of EROS start capabilities. See below in my comments about compatibility.
This I do not understand (-> note for your tutorial in Dresden).
Dresden's version of an IDL compiler is designed to support arbitrary message patterns. User-land "objects" should remain a user-land issue.)
I agree that user-land objects should largely remain a user-land issue. The issue is that if the user-land server has no efficient means to implement a protected name for a user-land object, significant power and performance are lost.
In our thinking, the unit of protection is an address space. Why do we need anything else but unforgeable sender ids plus control of IPC via send mappings?
The question now is: how does a receiver identify the sender? J.S. says: no need to identify since there are capabilities. I disagree here: relying just on capabilities leads to much too fine-grained clustering of protection units.
Based on 30 years of experience with KeyKOS and EROS, I would say that the evidence does not entirely agree with you.
My intuition is otherwise. I need to learn much more on EROS. Unfortunately at this point, my time is too limited already to follow the current discussion carefully. Hope, your tutorial in Dresden will help (I managed to get rid of almost all obligations for that week ...).
However, let us be careful what we mean: I am not saying that unforgeable sender identity is bad (though I do believe this). I am saying that it is insufficient.
You are right; we know that.
You have not yet proposed a response to
my question about how a single sender can simultaneously hold both a logically read-only descriptor and a logically read-write descriptor if there is no way for the server to efficiently differentiate the type of the descriptor that was invoked.
Two answers:
1) This is simple if we - as we discuss in Dresden - associate ids ("Principal Id") with a send mapping (what you call a descriptor). The kernel enforces that a message contains the Principal Id as the first part of the message (instead of a sender id based on thread ids). Then you can associate one id with the "logically read-only descriptor" and another with the "logically read-write descriptor".
2) The scenario is not interesting. If a single sender holds "both a logically read-only descriptor and a logically read-write descriptor" there is no way to stop the sender from using the descriptor granting more rights. The receiver cannot base anything on knowing which of the two descriptors has been used.
From my point of view, we need to enable protection units to securely find out from which clients messages came in, or - in other words - to serve clients with fairly different rights.
There are system designs for which this is true. There are also system designs for which it is NOT true. I disagree with this statement on three grounds: ...
This should be part of your tutorial in Dresden.
Thus: I would (provisionally) prefer a microkernel in which there is an 'extra word' in the "recipient descriptor" (the replacement for thread-id), and in which the unique sender identity is set by the thread pager (which I assume is trusted).
The "extra word" is the proposed "Principal Id" in the send mapping. The pager is trusted with regard to the id space given to it. For example, if a pager has send mappings with Principal Id "2", it can only forward mappings whose ids begin with "2". In particular, it can restrict the name space of the next pager to "25".
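Hermann's prefix rule can be sketched as a toy model (Python used as pseudocode; every name here is invented for illustration and is not an L4 or EROS interface):

```python
# Toy model of hierarchical Principal Ids carried in send mappings.
# A pager holding id "2" may only create mappings whose ids extend "2",
# e.g. "25"; the next pager is then confined to the "25" subspace.

class Pager:
    def __init__(self, principal_id: str):
        self.principal_id = principal_id

    def create_mapping(self, child_id: str) -> "Pager":
        # The kernel-enforced rule: a pager can only forward mappings
        # whose Principal Id begins with its own id.
        if not child_id.startswith(self.principal_id):
            raise PermissionError(f"{child_id!r} escapes {self.principal_id!r}")
        return Pager(child_id)

root = Pager("2")
child = root.create_mapping("25")          # allowed: "25" extends "2"
grandchild = child.create_mapping("257")   # allowed: stays within "25"
try:
    child.create_mapping("31")             # rejected: escapes the "2" subspace
except PermissionError as e:
    print("rejected:", e)
```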
Attention, Jonathan! You are currently exposed to several different lines of ideas on how to overcome the current limitations of L4. This must be very confusing ...
--hermann
On Mon, 2003-12-08 at 12:31, Hermann Härtig wrote:
Jonathan S. Shapiro wrote:
(I see no benefit though of restricting communication to "calls on objects". User-level SW of course can restrict communication to such patterns, but why should a micro-kernel enforce such restriction?
It is not a restriction. It is a generalization. The behavior of L4 with local names is a strict subset of the behavior of EROS start capabilities. See below in my comments about compatibility.
This I do not understand (-> note for your tutorial in Dresden).
By all means we should discuss, but let me attempt to clarify.
ALL communications are invocations on objects. The only questions that exist in principle are:
1. What restrictions are imposed on the TYPE of object that can be invoked?
2. Is the object namespace extensible by user-mode code. That is, can user-mode servers present objects or interfaces that appear to the invoker to be "first class" in the same sense that kernel supported objects are first class.
L4 imposes the restriction that the only invocable object type is "process" (or in some cases thread).
L4 does *not* (today) provide means to allow a server to extend the object name space.
My first point is probably self-explanatory, so I will expand only on the second.
In L4, if a client wishes to perform an operation on a file, the "name" of the file must be passed as an argument to an IPC. The invocation is something like:
file_server->invoke(file-id, operation-id, ... other args ...)
Because "file" is not a kernel-supported object, the protocol mandates that the sender provide an additional argument in the IPC invocation. In the EROS philosophy, we would argue that these objects are therefore "second class" and that this is bad for several reasons:
1. The invoker should not know the server identity. That should be known only to the file object.
2. It is difficult to transparently virtualize objects when their invocation patterns are different.
Next problem:
The server must then run some function:
get_permissions(sender-id, file-id) -> permissions
to determine what operations are permitted. Note that if this operation is performed faithfully and correctly, it is impossible to emulate correctly the behavior of the UNIX I_SENDFD socket operation without many additional calls to a shared service -- the design of the operation makes descriptor transfer an inherently expensive operation. Descriptors, which *can* be used as a foundation for certain kinds of security, suddenly become extremely inefficient because they cannot be passed without consulting a third party.
I do not argue that descriptors are the only form of security. I argue (from experience) that they are one important kind, and that a good microkernel design should not intrinsically penalize them.
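The cost asymmetry shap describes can be sketched as a toy contrast (Python pseudocode; all names invented, modeling the argument rather than any real kernel): under sender-id lookup, passing access to another client forces an update of the server's permission table, while a self-describing descriptor transfers as a plain bit copy.

```python
# Scheme 1: server authorizes by (sender, object) lookup. Transferring
# access to a new sender requires the server (a third party) to update
# this table -- an extra round trip per transfer in practice.
acl = {("alice", "file7"): {"read", "write"}}

def transfer_acl(giver, receiver, obj):
    acl[(receiver, obj)] = acl[(giver, obj)]

# Scheme 2: the descriptor itself carries opaque server-defined bits;
# transfer is just copying the descriptor, with no third party consulted.
alice_cap = ("file7", frozenset({"read", "write"}))
bob_cap = alice_cap  # descriptor transfer == bit copy

transfer_acl("alice", "bob", "file7")
assert acl[("bob", "file7")] == {"read", "write"}
assert bob_cap == ("file7", frozenset({"read", "write"}))
```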
In EROS, the corresponding invocation would be:
file_capability->invoke(operation-id, ... other args ...)
That is, the fact that "file" is not a kernel-supported object does not change the invocation pattern. The "file_capability" contains within it a pair of the form:
(process-id, server-defined-bits)
The server-defined-bits portion cannot be examined by the client. It is provided to the server during invocation. The server can interpret these bits in any way desired: as an object id, as a facet id, as permission bits, as some mix of these.
The presence of these bits does not preclude invocation of the server qua server; the server merely assigns to itself an arbitrarily chosen value of "server-defined-bits" to name the server itself.
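One way to picture the server side of this, as a toy sketch (Python pseudocode; the bit encoding is invented purely for illustration -- EROS does not prescribe it): the low bits act as a permission mask and the remaining bits as an object id, so a read-only and a read-write descriptor to the same object are distinguished without any sender identity.

```python
# Toy server dispatch on server-defined-bits. Invented encoding:
# low 4 bits = permission mask, remaining bits = object id.
PERM_READ, PERM_WRITE = 0x1, 0x2

def make_bits(obj_id: int, perms: int) -> int:
    return (obj_id << 4) | perms

def handle(server_defined_bits: int, operation: str) -> str:
    obj_id = server_defined_bits >> 4
    perms = server_defined_bits & 0xF
    needed = PERM_WRITE if operation == "write" else PERM_READ
    if not (perms & needed):
        return "EPERM"
    return f"{operation} on object {obj_id}"

ro_cap = make_bits(7, PERM_READ)               # logically read-only descriptor
rw_cap = make_bits(7, PERM_READ | PERM_WRITE)  # read-write descriptor, same object

print(handle(ro_cap, "read"))    # read on object 7
print(handle(ro_cap, "write"))   # EPERM
print(handle(rw_cap, "write"))   # write on object 7
```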
Dresden's version of an IDL compiler is designed to support arbitrary message patterns. User-land "objects" should remain a user-land issue.)
I agree that user-land objects should largely remain a user-land issue. The issue is that if the user-land server has no efficient means to implement a protected name for a user-land object, significant power and performance are lost.
In our thinking, the unit of protection is an address space. Why do we need anything else but unforgeable sender ids plus control of IPC via send mappings?
Because these are architecturally insufficient to implement an efficient, secure, object-based operating system.
You have not yet proposed a response to
my question about how a single sender can simultaneously hold both a logically read-only descriptor and a logically read-write descriptor if there is no way for the server to efficiently differentiate the type of the descriptor that was invoked.
Two answers:
- This is simple if we - as we discuss in Dresden - associate ids
("Principal Id") with a send mapping (what you call a descriptor). The kernel enforces that a message contains the Principal Id as the first part of the message (instead of a sender id based on thread ids). Then you can associate one id with the "logically read-only descriptor" and another with the "logically read-write descriptor".
I look forward to hearing more about this.
- The scenario is not interesting. If a single sender holds "both a
logically read-only descriptor and a logically read-write descriptor" there is no way to stop the sender from using the descriptor granting more rights. The receiver cannot base anything on knowing which of the two descriptors has been used.
This is correct, but it is an insufficient understanding. If we imagine that the sender is compromised, you are correct that the sender is in a position to use the most powerful descriptor that it holds.
However, there exist clients that must manage many sources of authority at once. Such a client *must* have means to explicitly designate which authority it is using at each instant. For example, the client may possess read-write permission to some file, but it may be unwilling to use this while executing its current operation.
Our experience has been that relying on such clients to specify the intended operation is not robust. The flow of permissions in complex programs is not well localized, and it is very easy to write a subroutine designed for one purpose that does some mildly dangerous thing and then call it (by programmer error) in the middle of some sequence of code where care is required.
Tying permissions to the object descriptor does not prevent the programmer from passing the wrong descriptor, but it does help a great deal in localizing the scope of programmer attention that is required to resolve these problems.
Thus: I would (provisionally) prefer a microkernel in which there is an 'extra word' in the "recipient descriptor" (the replacement for thread-id), and in which the unique sender identity is set by the thread pager (which I assume is trusted).
The "extra word" is the proposed "Principal Id" in the send mapping. The pager is trusted with regard to the id space given to it. For example, if a pager has send mappings with Principal Id "2", it can only forward mappings whose ids begin with "2". In particular, it can restrict the name space of the next pager to "25".
This is a good start, but our experience is that hierarchical restrictions of this form do not map well to the real problem space.
Attention, Jonathan! You are currently exposed to several different lines of ideas on how to overcome the current limitations of L4. This must be very confusing ...
Thanks for the warning. Yes, it is sometimes confusing. I do understand that many ideas are "in the air."
shap
On Dec 11, 2003, at 18:57, Jonathan S. Shapiro wrote:
In EROS, the corresponding invocation would be:
file_capability->invoke(operation-id, ... other args ...)
That is, the fact that "file" is not a kernel-supported object does not change the invocation pattern. The "file_capability" contains within it a pair of the form:
(process-id, server-defined-bits)
The server-defined-bits portion cannot be examined by the client. It is provided to the server during invocation. The server can interpret these bits in any way desired: as an object id, as a facet id, as permission bits, as some mix of these.
The presence of these bits does not preclude invocation of the server qua server; the server merely assigns to itself an arbitrarily chosen value of "server-defined-bits" to name the server itself.
How does the EROS model support multiprocessor object invocation?
Given that L4 invocations directly address threads, a client can address a server's CPU-local thread, independent of the object type. The thread space provides a dimension of optimization which is independent of the object identification space. Thus a client can load-balance (by first choosing a client thread on the same CPU as the server thread), or the client can let the server load-balance (and thus choose the server thread which executes on the same CPU as the client thread). This is generally necessary on ia32 hardware for high speed communication, due to rather expensive xCPU IPC. With non-ia32 hardware (such as the dual core MIPS SiByte), xCPU IPC can be blazing fast, and user-level processes optimized for the processor can choose to support xCPU IPC. If the kernel were to distribute object invocations to a server's per-CPU threads, it would be implementing a policy, and one which might not satisfy the user tasks.
I'm highlighting this multiprocessor problem as an example of why one might want to directly address thread IDs, rather than object IDs. My intuition is that separating the thread ID space from the object ID space offers more versatility. I can imagine solutions for EROS, such as allocating one capability per processor, but this overloads the concept of a capability, does it not?
-Josh
On Fri, 2003-12-12 at 08:34, Joshua LeVasseur wrote:
How does the EROS model support multiprocessor object invocation?
I will answer below, but I believe that this is entirely orthogonal to the question of object-id bits. The proposed "object id" bits are merely opaque bits that are carried in the descriptor. The "recipient server bits" [note that I am avoiding the word thread intentionally] are still present and their handling does not change.
Given that L4 invocations directly address threads, a client can address a server's CPU-local thread, independent of the object type.
I believe that you have the desirability of this exactly backwards. What you appear to be saying is:
In order to address the CPU-local thread of a service, the thread-id of that thread should be revealed to the client.
I will describe below why this is unnecessary. First, let me identify why it is undesirable:
1. Migration of the client causes it to invoke the wrong service thread, because it now holds an inappropriate ID and has no way to discover this.
2. The fact that the service is multithreaded (at all) is exposed to the client. This is a violation of encapsulation. It is unnecessary and unjustified.
3. CPU scheduling information is unnecessarily revealed (which is a covert channel).
In short, this is exactly a case where a default policy *does* belong in the kernel, because the client is not in a position to have up to date knowledge without exposing a whole bunch of problems.
Thus a client can load-balance (by first choosing a client thread on the same CPU as the server thread), or the client can let the server load-balance (and thus choose the server thread which executes on the same CPU as the client thread).
Hold on! I agree that the client should sometimes load balance, but it must first be the server's decision to reveal that it is multithreaded at all. If the server wishes to do so, it is free to provide multiple descriptors/thread-ids to the client. If the server does not, it does not reveal multiple IDs of this form. Note that NONE of the dispatching depends on the bits of the thread-id per se. All of what you describe can be equally well accomplished by invoking descriptors selectively. Fundamentally, what is needed is a function held in client memory of the form:
f(descriptor-id) -> CPU ID
The descriptor-id can either be an address in the thread/descriptor address space or it can be the thread-id. Precisely *because* it doesn't matter, I would strongly advocate that it should NOT be the thread-id because this violates encapsulation.
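The function described here can live entirely in client memory. A toy sketch (Python pseudocode; all names invented): the server, if it chooses, hands out one opaque descriptor per CPU, and the client keeps a table f(descriptor) -> CPU without ever learning a thread id.

```python
# Client-side CPU affinity over opaque descriptors. The server revealed
# two descriptors and told the client which CPU each is local to; no
# thread ids are exposed. current_cpu() stands in for a real CPU query.

def current_cpu() -> int:
    return 1  # illustrative stand-in

# Descriptors the server chose to reveal, and the CPU each is local to.
descriptor_cpu = {0xA0: 0, 0xA1: 1}

def pick_descriptor() -> int:
    cpu = current_cpu()
    # Prefer the descriptor local to our CPU; fall back to any.
    for desc, desc_cpu in descriptor_cpu.items():
        if desc_cpu == cpu:
            return desc
    return next(iter(descriptor_cpu))

print(hex(pick_descriptor()))  # 0xa1 -- the CPU-1-local descriptor
```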
If the kernel were to distribute object invocations to a server's per-CPU threads, it would be implementing a policy, and one which might not satisfy the user tasks.
This is regrettably true. However, I want to point out there is a strong tension here between speed on the one hand and (portability, encapsulation) on the other. You are making an argument that is entirely motivated by performance. For many systems, this is NOT the primary design consideration. Choosing speed uber alles is also policy, and it is a failure of microkernel architecture to impose an interface that prevents the efficient realization of systems that place other priorities first.
I'm highlighting this multiprocessor problem as an example of why one might want to directly address thread IDs, rather than object IDs.
I hope my comments above reveal that (a) your goal is sound, but (b) your goal does not in any way require revealing thread-ids.
I can imagine solutions for EROS, such as allocating one capability per processor, but this overloads the concept of a capability, does it not?
In my view, it does not. A capability internally names a pair of the form:
(representation-state, interface signature)
It is perfectly okay to hold two capabilities that name the same representation state and interface (method set). The two capabilities might have different behavior with respect to the communication transport, but if they have the same operational semantics, they are the "same" capability (we are getting here into the question "what does it mean to test two objects for equality?").
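This equality criterion can be sketched as a toy model (Python pseudocode; names invented): two capabilities are "the same" when they name the same (representation-state, interface) pair, even if they differ in transport-level detail such as which CPU they are routed to.

```python
# Toy model of capability equality: semantic identity is the pair
# (representation state, interface signature); transport is excluded.
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    state_id: int             # identity of the representation state
    interface: str            # interface (method-set) signature
    transport: str = "local"  # communication detail, not semantic identity

    def same_object(self, other: "Capability") -> bool:
        return (self.state_id, self.interface) == (other.state_id, other.interface)

cap_cpu0 = Capability(42, "File", transport="cpu0")
cap_cpu1 = Capability(42, "File", transport="cpu1")
assert cap_cpu0.same_object(cap_cpu1)   # per-CPU copies, "same" capability
assert cap_cpu0 != cap_cpu1             # but the transports differ
```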
shap
Jonathan S. Shapiro wrote:
On Mon, 2003-12-08 at 12:31, Hermann Härtig wrote:
ALL communications are invocations on objects. The only questions that
This "credo" clarifies the fundamental difference.
In L4 thinking, "objects" is a user-level abstraction. L4 supports a fixed minimal set of resources provided by the kernel: task (address space), thread, pages, ... . Mappings correspond to such kernel resources. Everything else is "second class", i.e. built on top of it using IPC. Then, allowing just "object"-invocation-like IPC (synchronous calls) is a restriction compared to arbitrary communication patterns.
In EROS, everything is and must be "invocations on objects", regardless of whether they are provided by the kernel or in user-land, to the extent that user-level objects are represented by some kernel entity. Then, allowing invocation on kernel objects only is a restriction.
Here, it is my strong belief that L4 (with all its problems that still need to be resolved) has it right! "Objects" are user-land business, to the extent that clever compilers may decide to optimize away certain "object" boundaries. I have to emphasize the *belief* in the previous paragraph.
Other differences fall out of that fundamental difference:
L4 imposes the restriction that the only invocable object type is "process" (or in some cases thread).
"It is a generalization, not a restriction."
L4 does *not* (today) provide means to allow a server to extend the object name space.
But it allows servers to build arbitrary name spaces on top of L4. It is not kernel business to provide a name space for user-land objects. Name spaces are often defined by user-level standards (e.g., file ids).
- The invoker should not know the server identity. That should be known only to the file object.
- It is difficult to transparently virtualize objects when their invocation patterns are different.
We need means to transparently intercept IPC (we have several different proposals for that in the L4 community). And that is all that is needed. Arbitrary invocation patterns can be built and virtualized in user-land if IPC is really fast.
Descriptors, which *can* be used as a foundation for certain kinds of security, suddenly become extremely inefficient because they cannot be passed without consulting a third party.
Caching solves that problem. I have no problem with "descriptors". But descriptors in a microkernel should be descriptors (mappings) on kernel resources, not "objects". The kernel can do nothing to protect "objects" since they are in the hands and at the mercy of user-land. Therefore, there is no need for any notion of "objects" in the kernel.
Note the "certain kinds of security" for your tutorial in Dresden.
(process-id, server-defined-bits)
Despite the fundamental difference, the implementation ideas seem somehow similar. The sender id mechanism in the generalized mapping discussions can (at first view) be used to build server-defined-bits on top of it (see my previous email). Simply let the server act as a pager providing mappings for the clients. But sender ids allow many more schemes.
In our thinking, the unit of protection is an address space. Why do we need anything else but unforgeable sender ids plus control of IPC via send mappings?
Because these are architecturally insufficient to implement an efficient, secure, object-based operating system.
I do not buy this. Here is the fundamental difference again.
This is a good start, but our experience is that hierarchical restrictions of this form do not map well to the real problem space.
Hm?
--hermann
On Fri, 2003-12-12 at 12:33, Hermann Härtig wrote:
Descriptors, which *can* be used as a foundation for certain kinds of security, suddenly become extremely inefficient because they cannot be passed without consulting a third party.
Caching solves that problem...
I believe it does not.
Caching solves the problem only if descriptors are long-lived and used many times within a relatively short time frame. This is not the case in current EROS usage. A typical interaction between two user-level objects may involve only two to four invocations, which is not enough to amortize the cost of caching the answers. If a caching approach is adopted, the end-to-end cost of this caching is many MULTIPLES of our current end-to-end time.
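The amortization claim can be made concrete with a toy cost model (all cycle counts are invented for illustration; only the shape of the comparison matters): a cached lookup pays a large fill cost once and a small hit cost thereafter, so it loses over 2-4 invocations and wins only under long reuse.

```python
# Illustrative-only cost model for caching a permission lookup.
def total_cost(n_invocations, fill_cost, hit_cost, direct_cost):
    cached = fill_cost + (n_invocations - 1) * hit_cost  # fill once, then hits
    direct = n_invocations * direct_cost                 # uncached check each time
    return cached, direct

# Invented numbers: expensive fill, cheap hit, moderate direct check.
print(total_cost(3, fill_cost=2000, hit_cost=50, direct_cost=200))
# (2100, 600) -> caching loses at the 2-4 invocations typical here
print(total_cost(100, fill_cost=2000, hit_cost=50, direct_cost=200))
# (6950, 20000) -> caching wins only with long-lived, heavily reused descriptors
```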
Someone proposed using a shared-memory cache of some sort, but this is not an option, because it violates confinement.
However, let me emphasize that I am completely willing to give up the particular mechanism proposed if we can solve the underlying problem efficiently in some other way.
shap
l4-hackers@os.inf.tu-dresden.de