Corrections:
- L4 IPC speed has relatively little to do with the "direct lookup" aspect of thread ids. Any indirect encoding will carry a cost, but in terms of the overall performance of IPC this cost is quite small.
While this is indeed also my belief, it unfortunately isn't that of the L4 team. If you look at their presentation, one of their main points is that "IPC needs to be infinitely fast". The problem is that they only look at the direct cost of the cycles spent entering the kernel, doing the IPC and exiting the kernel. They also look at the indirect costs of TLB and cache misses caused by IPC, but strangely enough they won't look at the cost of checking access rights (probably because access checks are no longer part of the microkernel, but of operating system policy). If you want to convince them, you have to make sure they either count those costs as well OR make sure they believe the added IPC cost of capabilities is indeed negligible. As far as I can reason, these added costs (compared to the thread-id method) would be:
- One extra register spilled on the receiver side (to store the server-defined word)
- One extra memory access (to convert the capability into the server thread id) [VTO]
- One extra memory access (to load the server-defined word)
- One or more extra memory accesses (to locate the server-defined word and server thread id in the Thread Object Space) [VTO]
This goes against their belief that "as computers get faster, memory access gets relatively slower, therefore memory accesses should be avoided during IPC". I believe only 2 memory accesses are needed on RISC machines with large register sets, so the 3+ extra accesses are quite a burden (through the L4 developers' eyes). However, they are considering Virtual Thread Objects, and compared to that only 1 extra memory access is required (which isn't much of a loss against the 4+ accesses that already need to be done), because the steps marked with [VTO] need to be done in that model as well. (A rough sketch of the lookup path follows below.)
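To make the comparison concrete, here is a minimal C sketch of where the extra memory accesses on the send path would come from. All structure and field names are hypothetical, my own invention rather than anything from L4 or EROS:

#include <stdint.h>

struct thread;                        /* thread control block, elided      */

/* Hypothetical capability slot: destination thread plus the
   server-defined word, i.e. two machine words in total.                   */
typedef struct cap_slot {
    struct thread *dest;              /* server thread to deliver to       */
    uintptr_t      server_word;       /* opaque word chosen by the server  */
} cap_slot_t;

/* Hypothetical per-task capability table.                                 */
struct cap_space {
    cap_slot_t *slots;
    unsigned    nslots;
};

/* Resolve a capability index on the IPC send path.  Relative to passing a
   raw thread id, this adds one memory access to read the destination from
   the slot and one to read the server-defined word (usually in the same
   cache line), plus one extra register or stack slot on the receiver side
   to hold that word.                                                       */
static int cap_lookup(const struct cap_space *cs, unsigned idx,
                      struct thread **dest, uintptr_t *server_word)
{
    if (idx >= cs->nslots)
        return -1;                    /* invalid capability                 */
    *dest        = cs->slots[idx].dest;          /* extra access #1         */
    *server_word = cs->slots[idx].server_word;   /* extra access #2         */
    return 0;
}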
b) For some types of systems (capability systems), disclosure of the sender id is an absolute violation of design requirements, so any microkernel that relies exclusively on server-side security checks based on sender-id is not a universal microkernel.
I never realised this, and I still don't see why this is a design requirement, but if there is one this would again be a major argument in favor of capabilities. Could you give me an example of when not revealing the sender-id is useful?
Hybrid: Sender can only invoke a thread descriptor that is mapped in their thread descriptor space (thread space). Server makes decisions based on either (a) a field that is encoded within the descriptor, or (b) the sender-id.
** Sender-id is software controlled by the thread manager, and can be set to zero for all threads to simulate capability behavior.
Could you explain a bit more about this thread manager. It seems to solve a problem in which the sender-id does need to be known. For example, if a thread tries to write to a file to which it doesn't have write access, it may be a system requirement that this is logged in the security log. Now you also want to log which thread actually made the violation. If the sender-id isn't revealed to the file server, it can only log the thread-id (no, not the thread-id; something else, but what?) to which it originally mapped the capability. Maybe if a thread manager is used, the true sender-id can be logged without revealing it to the file server itself?
- The server-defined word will probably be used to store a pointer to some client-specific data structure containing important information that needs to be accessed often. You might say, yeah well, you can calculate this address from the sender id, but this no longer works when clients start to grant and map server objects/capabilities to each other, because the server doesn't know about these actions, unless some complex, slow protocol is used to update the server's information.
This is a possible usage. In our experience, the more common behavior is to have a pointer to some data structure that describes a server implemented object (i.e. has nothing to do with any particular client), and reuse the low bits for permissions. For example, the pointer might point to a file metadata structure, and the low bits might indicate read and write permissions.
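As an illustration of that encoding, here is a small C sketch. The names and the alignment assumption are mine, not taken from any existing kernel; the only idea it demonstrates is that a word-aligned pointer leaves its low bits free for permission flags:

#include <stdint.h>

#define PERM_READ   0x1u
#define PERM_WRITE  0x2u
#define PERM_MASK   0x3u              /* assumes >= 4-byte object alignment */

struct file_meta;                     /* server's file metadata, elided     */

/* Pack an aligned object pointer and permission bits into one word.        */
static inline uintptr_t make_server_word(struct file_meta *f, unsigned perms)
{
    return (uintptr_t)f | (perms & PERM_MASK);
}

/* Recover the pointer and permissions on the server side.                  */
static inline struct file_meta *word_object(uintptr_t w)
{
    return (struct file_meta *)(w & ~(uintptr_t)PERM_MASK);
}

static inline unsigned word_perms(uintptr_t w)
{
    return (unsigned)(w & PERM_MASK);
}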
I used some strange wording in my explanation, but I meant what you just said.
The size of the capability should be equal to the pointer/word size of the machine.
This can't be true, since the capability must encode both the destination thread id and also the extra word. Minimum practical size is two machine words.
Sorry, this was a typo. I actually meant to say that the size of the server-defined word inside the capability should be equal to the machine word size, making the total capability size, indeed, at least 2 machine words.
-- Rudy Koot
I have already responded to the substance of Rudy's note, so just a few brief points here.
On Wed, 2003-12-31 at 11:39, Rudy Koot wrote:
This goes against their belief that "as computers get faster, memory access gets relatively slower, therefore memory accesses should be avoided during IPC".
Based on history of processor architecture over the last 30 years, this belief is very well motivated. The problem is likely to get worse, not better.
b) For some types of systems (capability systems), disclosure of the sender id is an absolute violation of design requirements, so any microkernel that relies exclusively on server-side security checks based on sender-id is not a universal microkernel.
I never realised this, and I still don't see why this is a design requirement, but if there is one this would again be a major argument in favor of capabilities. Could you give me an example of when not revealing the sender-id is useful?
I find this question very puzzling, mainly because the assumption set behind it is so different from my own.
In a capability system, the *only* question that the server can justifiably ask is "what capability was invoked?" It is an ERROR for the server to consider any other factor in making decisions. Therefore, it is an ERROR to consider the sender-id. Given this, the exposure of sender-id is a violation of the protection model, because it improperly discloses to the server information that it does not require and therefore is not entitled to have.
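To make this concrete, here is a tiny sketch of what a server request handler looks like under that rule (all names are hypothetical): the decision is keyed entirely off the word the kernel attached to the invoked capability, and no caller identity appears anywhere.

#include <stdint.h>

#define PERM_READ   0x1u
#define PERM_WRITE  0x2u
#define PERM_MASK   0x3u

struct file_meta;                     /* server's object metadata, elided   */

enum op { OP_READ, OP_WRITE };

/* Hypothetical incoming request: the kernel delivers only the word bound
   to the invoked capability, plus the requested operation.                 */
struct request {
    uintptr_t server_word;
    enum op   op;
};

static int handle_request(const struct request *rq)
{
    struct file_meta *obj =
        (struct file_meta *)(rq->server_word & ~(uintptr_t)PERM_MASK);
    unsigned perms = (unsigned)(rq->server_word & PERM_MASK);

    /* Only the invoked capability matters; no sender-id is consulted.      */
    if (rq->op == OP_WRITE && !(perms & PERM_WRITE))
        return -1;                    /* permission denied                  */
    if (rq->op == OP_READ && !(perms & PERM_READ))
        return -1;

    /* ... perform the operation on obj ...                                 */
    (void)obj;
    return 0;
}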
Hybrid: Sender can only invoke a thread descriptor that is mapped in their thread descriptor space (thread space). Server makes decisions based on either (a) a field that is encoded within the descriptor, or (b) the sender-id.
** Sender-id is software controlled by the thread manager, and can be set to zero for all threads to simulate capability behavior.
Could you explain a bit more about this thread manager. It seems to solve a problem in which the sender-id does need to be known. For example, if a thread tries to write to a file to which it doesn't have write access, it may be a system requirement that this is logged in the security log.
I am imagining a design in which "thread-id" does not correspond to thread PCB address, but rather to some global unique identifier set by the thread manager. This, of course, is very different from the current L4 design.
If the value lives in a field in the per-thread PCB, then it can be set arbitrarily by the thread manager.
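In terms of data structures, that might look something like the sketch below. The naming is my own assumption, not the actual L4 PCB layout:

#include <stdint.h>

/* Hypothetical per-thread control block.  The sender_id field is not
   derived by the kernel from anything; it is whatever the thread manager
   chose to store there.                                                    */
struct pcb {
    /* ... scheduling state, address space pointer, etc. ...                */
    uint64_t sender_id;               /* global unique id, or 0 to hide it  */
};

/* Thread-manager policy: assign a fresh global id, or zero it for every
   thread to simulate capability-style anonymity.                           */
static void tm_set_sender_id(struct pcb *t, uint64_t id)
{
    t->sender_id = id;
}

/* On delivery, the kernel would simply copy pcb->sender_id into the
   message handed to the receiver, without interpreting it.                 */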
If the sender-id isn't revealed to the file server, it can only log the thread-id (no, not the thread-id; something else, but what?) to which it originally mapped the capability.
It cannot even log that, because it does not know that the original recipient is the current invoker. It can only log the identity of the capability that was used. Other mechanisms are needed to deal with tracking capability transfer.
Logging the sender-id is actually a very bad mechanism, because it assumes that the sender is implemented as a single, monolithic application. In EROS this is rarely true.
shap
On Wed, 2003-12-31 at 22:20, M. Edward Borasky wrote:
On Wed, 2003-12-31 at 11:39, Rudy Koot wrote:
This goes against their belief that "as computers get faster, memory access gets relatively slower, therefore memory accesses should be avoided during IPC".
Based on history of processor architecture over the last 30 years, this belief is very well motivated. The problem is likely to get worse, not better.
Yes, memories get *bigger* but not faster. So memory hierarchies get deeper. I wonder if operating systems shouldn't have a "deeper hierarchy" as well ... A "nanokernel" that lives in the processor and its registers, a "microkernel" that lives in the level 1 cache, etc.
Actually, this isn't the reason that memory gets slower. The reason is that clock rate determines levels of logic, and levels of logic determine L1 cache size.
But there is hope coming. It turns out that we *do* know how to build main memories that run at current L2 cache speeds or better, and if you run the weighted memory reference times you'll discover that the best way to speed up a modern processor is by improving miss performance.
In the end, however, this is only going to give us breathing room. Fundamentally, there is wire latency on the bus and precharge latencies in the memory line drivers that need to be there. We can alter the processes to make these things faster, but we cannot entirely make them go away.
shap