[I apologize in advance for both the length and the depth of this note.]
One of the long-term design differences between L4 and EROS is something I think of as the "threads vs. objects" distinction. In L4, all invocations are IPC invocations performed on thread ids. In EROS, all invocations are capability invocations performed on capability indexes. In the current EROS implementation, the capability indexes are an offset into a per-process capability table.
In my opinion, the original "name by thread-id" design in L3/L4 had two architectural issues:
1. It exposed the location of the target thread within the thread table. Because of the tight relationship between threads and tasks, this had the side effect of exposing potentially significant information about the internal structure of the recipient. It also made relocating a thread within the table challenging, because the thread's "address" was sitting in data space in every client, where one could not update it.
My opinion was that the thread-id was effectively an absolute address, and that it should have been a virtual address that was decoded w.r.t. a sender-defined name space.
2. Because the thread ids and task ids were inter-related, there seemed to be a number of awkward restrictions on allocation of task structures. In particular, it was necessary to know in advance how many threads (max) were going to be required, and it was possible for fragmentation to prevent successful allocation even when the necessary number of thread entries was available.
This seemed unnecessary. In addition, I thought it would be better if an address space could exist without any threads at all, and conversely if a thread could exist (though not execute) without any address space. From the EROS experience, I felt that non-process address spaces were useful. I'm not sure that threads without address spaces are useful, but I do believe that threads and address spaces are separable abstractions, and that it is good to keep them so.
From various conversations, I believe that the L4 team has addressed the second issue, and is considering a number of solutions to the first, most notably "thread address spaces". In this approach, the current thread id is replaced by an offset (or address) into the thread address space, and an application supplies this thread address instead of the thread id of the previous design. If I understand matters correctly, this idea has not yet been incorporated into the current L4 implementation but there seems to be agreement that someday it should be.
I may have misunderstood, but I also believe that the power of two restriction on task allocation has been removed, and that any thread can now be associated with any task.
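To make the "thread address space" idea concrete, here is a minimal sketch in C. It is not taken from any L4 implementation; the names thread_space, ts_lookup, and THREAD_SPACE_SLOTS are hypothetical, and a real kernel would likely use a sparser structure than a fixed array. The point is only that the sender supplies a name that is decoded relative to its own table, so the kernel-side location of the thread never appears in client data.

```c
#include <assert.h>
#include <stddef.h>

#define THREAD_SPACE_SLOTS 16          /* hypothetical, fixed for the sketch */

/* Stand-in for a kernel thread structure. */
struct thread {
    int kernel_slot;                   /* where the thread really lives */
};

/* One per sender: maps sender-chosen thread addresses to kernel threads. */
struct thread_space {
    struct thread *slot[THREAD_SPACE_SLOTS];
};

/* Decode a sender-relative thread address.
 * Returns NULL if the address is out of range or the slot is empty. */
static struct thread *ts_lookup(const struct thread_space *ts, size_t addr)
{
    assert(ts != NULL);
    if (addr >= THREAD_SPACE_SLOTS)
        return NULL;
    return ts->slot[addr];
}
```

Under this sketch, relocating a thread means updating the kernel-held slots that point at it; the address each client holds (the index) stays the same.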
As long as there is no operation allowing a sender to *read* the real thread id out of the thread address space, then the current plan for L4+thread-spaces would implement something very close to the future EROS design. There are two differences:
1. The EROS design is currently expressed in terms of capability registers. We have decided that this should be replaced with a capability address space. Conceptually, this capability address space serves the same role in our architecture that the thread address space serves in Ln.
2. The elements indexed by this address space are capabilities in EROS, but thread-ids in L4. This is the heart of why I think of this as the "threads vs. objects" distinction.
One way to think about this [the motivation will become clear in a moment] is that the L4 thread address space is actually a descriptor address space, but that all of the descriptors in this space are restricted to be thread descriptors. Since this appears to be the *only* remaining difference in this area of the Ln and EROS designs, I'd like to make a case based on the EROS experience for why descriptors should be considered seriously in a future system.
First, switching to descriptors should not significantly change IPC performance. A descriptor is probably a larger data structure than a thread id, but in both a thread-id design and a descriptor design the lookup ends with a load instruction that places the target thread address into a register. The only difference would be that the thread structure pointer might live at a different offset.
Second -- and here I must draw for a moment from the EROS experience -- unifying all invocations behind a single descriptor abstraction has been incredibly useful in the EROS system. As with L4, we have a generalized invocation mechanism, but we can apply this mechanism to arbitrary objects.
Introducing a generalized descriptor notion into the architecture does NOT necessarily imply a large increase in kernel size. When Leendert van Doorn and I were considering kernel design at IBM, we concluded that most of the resulting kernel descriptor types can be served by well-known user mode servers if desired. The decision about whether or not to put support in the kernel for these object types becomes purely an engineering decision. The EROS implementation chose to place these implementations in the kernel, and in one or two cases this is probably essential, but in general it is NOT necessary.
The difference between a thread-id and a descriptor is that a descriptor takes the form
(resource-type-code, resource-name, type-specific-data-word)
where a resource-type-code is assigned for each kernel resource that is exported by means of a descriptor. Borrowing from EROS, the descriptor types that seem likely to be useful are:
  void     -- descriptor that responds with "unknown" to all requests
  flexpage -- name of a flexpage
  thread   -- a thread-id (interface to the process abstraction)
  object   -- a particular object served by some thread
In flexpages, the type-specific field holds permissions. In objects, it holds an object id. In threads, it might be unused.
The EROS implementation also includes linked list pointers so that outstanding descriptors can be efficiently revoked when a resource is destroyed, but this is a detail of implementation -- other representations are possible. On revocation, outstanding descriptors become void, but we have sometimes thought that it would be better to change them from "type X descriptor" to "type INVALID X descriptor". This would facilitate debugging.
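One possible shape for revocation by linked list is sketched below. It is a simplification and should not be read as the actual EROS representation: it uses a singly linked chain, a single resource type, and hypothetical names (desc_bind, resource_revoke, DESC_INVALID_THREAD). The caller chooses whether revoked descriptors become void or "invalid X".

```c
#include <assert.h>
#include <stddef.h>

enum desc_type { DESC_VOID = 0, DESC_THREAD, DESC_INVALID_THREAD };

struct resource;

struct descriptor {
    enum desc_type type;
    struct resource *target;
    struct descriptor *next;        /* next outstanding descriptor, same resource */
};

struct resource {
    struct descriptor *outstanding; /* head of the chain of live descriptors */
};

/* Point descriptor d at resource r and link it into r's chain. */
static void desc_bind(struct descriptor *d, struct resource *r)
{
    assert(d != NULL && r != NULL);
    d->type = DESC_THREAD;
    d->target = r;
    d->next = r->outstanding;
    r->outstanding = d;
}

/* Walk the chain once and neutralize every outstanding descriptor.
 * dead_type is DESC_VOID, or DESC_INVALID_THREAD to aid debugging. */
static void resource_revoke(struct resource *r, enum desc_type dead_type)
{
    assert(r != NULL);
    struct descriptor *d = r->outstanding;
    while (d != NULL) {
        struct descriptor *next = d->next;
        d->type = dead_type;
        d->target = NULL;
        d->next = NULL;
        d = next;
    }
    r->outstanding = NULL;
}
```

The cost of revocation in this sketch is proportional to the number of outstanding descriptors for the resource, not to the size of any descriptor table.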
An added word about the 'object' type may be useful, since it reflects a significant philosophical difference between L4 and EROS.
In EROS, a client invokes an object. This object may be an entire server (a process) or it may be a particular object that is served by that server. In L4, the "particular object" case is customarily handled by passing an object id as an argument to the IPC operation. One problem with this is that the object id can be forged by the client.
It is not entirely clear to me whether forged object ids are a serious problem in the absence of persistence, but my instinct is that they are *sometimes* a problem, and that carrying the object id in the descriptor, where the client cannot alter it, is therefore desirable.
The "IPC indirection" design proposed by Trent Jaeger et al. many years ago included a similar uninterpreted word.
shap