Like Rudy, I will try to answer multiple mails at once. (Sorry for the long post!)
First a general statement. It appears to me that the goals are very different. As Rudy pointed out, one of the main goals of L4 (at least as pursued in Karlsruhe) is to be a universal, high-performance uK. Universal here means that it is applicable to any domain. That also answers Rudy's question of who is going to use a microkernel without a security model: anybody who does not require a security model. That may be your wrist watch, your cell phone, or a medical appliance.
Jonathan wrote:
Please consider that "clever" is a four letter word (a curse). All of these optimizations are, in my opinion, justifiable reasons to fire a programmer with a strongly negative recommendation. The resulting systems are unmaintainable.
Speed is not the primary goal. Efficient robustness (measured end to end) is the primary goal.
That may be true for your research agenda. However, to investigate the limits of uK-based systems, I would be very disappointed if you wouldn't even consider these optimizations. Otherwise we are back to the good old Mach problem: uKs are inherently slow by nature, full stop. And we are talking research here, not business (btw, I would fire those guys too). We may also simply not have the right tools today to do this robustly and efficiently--but then we should look into better tools.
Excuse me, but this is utter nonsense. You are making a quantitative argument using qualitative and unsubstantiated arguments. You (and I) need numbers in order to evaluate this issue.
The correct way to approach this is to ask:
- What percentage of invocations dynamically incur the cost of checking?
- What is the respective cost of this checking at user and at supervisor level?
- Given that the supervisor-mode implementation *is* incurred by all cases, what is the weighted cost of the mechanism in both cases (user and supervisor)?
Once you have this calculation, you have a correct engineering basis for making a decision.
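The weighted-cost comparison Jonathan asks for reduces to a small expected-value calculation. A sketch (the fractions and cycle counts are invented purely for illustration; the real numbers are exactly what the measurement would have to supply):

```python
def weighted_cost(fraction_checked, cycles_per_check):
    """Expected per-invocation cost of a check that only a given
    fraction of invocations actually incurs."""
    return fraction_checked * cycles_per_check

# User-level checking: only boundary-crossing invocations pay
# (assumed 20% of invocations, at an assumed 100 cycles each).
user_level = weighted_cost(0.2, 100)    # 20.0 cycles/invocation

# Kernel-level checking: *every* invocation pays (assumed 30 cycles).
kernel_level = weighted_cost(1.0, 30)   # 30.0 cycles/invocation
```

With these made-up numbers the user-level scheme wins; with a higher check frequency or per-check cost the conclusion flips. The numbers, not the rhetoric, decide.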
Good that we are on common ground now. We also have to agree on what system you and I have in mind; from that we can derive the overall costs. It seems to me you are implying a particular system design and claiming that L4's lack of certain features makes it perform as well or worse. However, having the feature in user land rather than in the kernel allows it to be modified. The question then becomes how efficiently we can implement it in user land compared to the in-kernel version. If it is part of the kernel, you can't change it anymore, which means you always pay the cost whether you want to or not. And here you also have to evaluate _all_ potential systems, even those without security requirements.
And I should acknowledge that I disagree with Volkmar very fundamentally about the "performance uber alles" assumption. Performance is important, but from a research perspective it is *more* important to understand what the fundamental architectural issues are. Once we do, we can step back and decide what to take out. At the moment, we have no common base on which direct comparison is possible.
I think Jonathan misunderstood me on that point. We have a set of core requirements for a kernel which are probably almost identical for EROS and L4; however, we have a different focus. Performance is still one of the main reasons why monolithic kernels are preferred over uKs. That does not mean that security is less important.
I don't think that the L4 kernel has all the answers, but I also think that the EROS kernel does not have all the answers. Each, I think, has important strengths. EROS lacks a certain elegance of minimality. L4 (as I understand it today) lacks a credible story about security, access control, and denial of service. Perhaps it is time for both groups to step back and make a serious attempt to learn from each other.
I agree that one of the current weaknesses of L4 is its very rudimentary security model. And each L4 group seems to have its own perspective on how this should be solved.
From the conversations on this list, however, it appears to me that the L4 research groups do not agree universally on what the IPC control mechanism is, and this may be contributing to some difficulty in the discussion. One group clearly advocates thread spaces. Volkmar (if I understand him) currently advocates IPC indirection. The two designs have very different implications for performance, implementation, and security.
No, I'm not advocating this. So far nobody has a reasonable cost-benefit analysis of thread mappings, and as long as that is the case I'm not in favor of anything. However, considering the complexity and elegance of the different alternatives and the performance impact on a wide variety of hardware architectures (specifically MP and NUMA systems), I'm very skeptical about proposals that bind security principals to address spaces. That will just open another can of worms.
The redirection model has the disadvantage of fundamentally changing the semantics of IPC. With a purely synchronous IPC model, timing is a fundamental part of the communication protocol. Transparent interception is not possible, since the message gets successfully delivered to the redirector, which, however, may discard it. This alters guarantees usually given by the kernel.
Volkmar's technique is quite elegant. The only problem I see is that it places the burden of verification on the wrong party -- it should be on the client. That is, the design invites DoS attacks.
If it does not matter where the check code is executed (client/kernel/server) but only who is charged for the resources used (CPU, cache, whatever), this is not the case. (I'm not claiming L4 can do that (yet).)
However, Volkmar is making an assumption that is problematic. He is assuming that the passed ID is a pointer. From a security perspective this is VERY bad. First, it invites memory attacks. Second, it discloses information about the internal implementation of the server. Third, it makes selective rescind of authority very hard to do.
No, it can be an object handle which can be translated in some way (e.g., cryptographically transformed, or hashed). The fundamental difference is that it is not a first-class kernel object. This means you have the freedom to choose the appropriate encoding (which could, of course, be a pointer).
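One possible encoding of such a non-pointer handle, sketched below. This is my invention for illustration, not anything L4 or EROS actually does: the handle is a table slot plus a keyed MAC, so it discloses nothing about the server's memory layout and cannot be forged or guessed.

```python
import hmac, hashlib, os

SECRET = os.urandom(16)   # server-private key; never leaves the server
objects = []              # server-internal objects, indexed by slot

def _tag(idx):
    # Keyed MAC over the slot index; 8 bytes is enough for a sketch.
    return hmac.new(SECRET, idx.to_bytes(8, "little"), hashlib.sha256).digest()[:8]

def make_handle(obj):
    """Register an object and hand out an opaque (slot, tag) handle."""
    objects.append(obj)
    idx = len(objects) - 1
    return (idx, _tag(idx))

def lookup(handle):
    """Translate a handle back to the object, rejecting forgeries."""
    idx, tag = handle
    if not hmac.compare_digest(tag, _tag(idx)):
        raise PermissionError("bad handle")
    return objects[idx]
```

The same table could just as well store raw pointers internally; the point is only that the *externally visible* name need not be one.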
As a backwards compatibility matter, and as a validation of the L4 nucleus, running Linux on an L4 is interesting. As a matter of future research, and as a matter of forward-looking system architecture, it is boring.
Yes, I agree. However, a gradually decomposed Linux can answer many research questions without re-implementing a fully fledged OS.
The right forward-looking research question is:
*Given* a fast nucleus in the style of L4, what is the most effective way to structure a native operating environment? What fundamental and novel leverage do such kernels provide?
That was one of the goals of SawMill.
==== Rudy wrote:
Sorry, but I do not see how the version part could be used as the server defined word. You can't have multiple versions (read server defined words) of the same thread-id at the same time.
Correct. I had a different usage scenario in mind--sorry for the confusion.
I believe these costs are exaggerated. To convert a capability into a sender-id, a simple table lookup can be used. This requires a single TLB entry and a single L2 cache line (worst case). If IPC is frequent, which is the case in which you want high performance, the TLB entry is very likely to be present, and probably even the L2 cache line.
This assumes that you have a small TLB working set and a small cache working set. However, if communication is frequent, that also means you are probably communicating with _many_ partners. Then you probably need a TLB entry per table, which again increases the TLB footprint. That means the likelihood that you replace user TLB entries goes up--and the problem is that it is the user's active working set you replace.
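To make the disagreement concrete, here is a toy model of the lookup table's TLB footprint. Page size, entry size, and the partner distributions are all assumptions; the real numbers depend on the architecture and workload.

```python
PAGE_SIZE = 4096
ENTRY_SIZE = 8   # assumed bytes per capability-table entry

def table_pages_touched(slots):
    """Distinct pages of the lookup table touched when talking to the
    given slots -- a rough proxy for the extra TLB entries the scheme
    costs, over and above the user's own working set."""
    return len({(slot * ENTRY_SIZE) // PAGE_SIZE for slot in slots})

# A few clustered partners: Rudy's single-TLB-entry argument holds.
few = table_pages_touched(range(8))                   # 1 page

# Many partners spread across per-task tables, modeled here as widely
# scattered slots: one page (one TLB entry) per partner, each of which
# can evict part of the user's active working set.
many = table_pages_touched(range(0, 64 * 512, 512))   # 64 pages
```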
==== Jonathan wrote:
Volkmar is combining two things in one, and seems to be forgetting that he does so. You are correct. Once you have the capability, obtaining the target thread id is quite fast. No table lookup is required. The capability is kernel-protected state, and can simply contain a direct pointer to the recipient PCB.
I think that the costs that Volkmar is identifying come from the need to look up the thread-id or capability in the thread mapping space. That is, these are the costs of the address-space traversal. The costs are the same in either case (thread-id or capability).
If you add a level of indirection (thread-id or cap), that is the case. However, if you pay for a level of indirection anyhow, then it does not matter whether the name specifies a capability or a thread. I have the (maybe unreasonable) feeling that it should be possible to express both in one model with the same associated costs.
One of the features of EROS is an indirection object. This object can stand in front of a start capability transparently. The client invokes the indirection object capability rather than the start capability. Given this construction, a service can selectively rescind that client capability by destroying the indirection object.
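The construction, sketched (the class and method names are mine, not EROS's; method calls stand in for capability invocation):

```python
class Revoked(Exception):
    pass

class IndirectionObject:
    """Stands transparently in front of a start capability. Destroying
    the indirection object rescinds exactly the client capabilities
    routed through it, while the start capability itself stays valid."""
    def __init__(self, target):
        self._target = target
    def invoke(self, *args):
        if self._target is None:
            raise Revoked("capability was selectively rescinded")
        return self._target(*args)   # forward transparently
    def destroy(self):
        self._target = None

def service(msg):                    # stands in for the server entry point
    return "served: " + msg

client_cap = IndirectionObject(service)   # this, not `service`, goes to the client
```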
This is something that L4 presently has no means to do. Of course, L4 does not need it, because it makes no assumption about controlling client invocations at all.
In the current L4 model this would be a proxy thread, assuming that IPC is cheap and redirection happens infrequently. In the case of frequent redirection you would extend the IPC protocol to take the reconfiguration into account. This boils down to how much transparency we want. Jonathan mentioned on the EROS list that transparent persistence is something that is not feasible. The same question can be asked here: is transparent redirection something we want/need? And are we willing to pay the overhead? One could construct a system where communication can only be restricted (for enforcement) and apps must be able to reconfigure dynamically.
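The proxy-thread alternative, as a sketch. Function calls crudely stand in for synchronous IPC, and the policy hook is hypothetical; note that it also exhibits the semantic point raised above--the send to the proxy always "succeeds", even when the message is then dropped.

```python
class Proxy:
    """User-level redirection: the client addresses its IPC to the
    proxy, which applies a policy and then forwards -- or discards."""
    def __init__(self, server, policy):
        self._server = server
        self._policy = policy     # per-message decision: forward or not
    def ipc(self, msg):
        if self._policy(msg):
            return self._server(msg)
        return None               # delivered to the proxy, then discarded

def server(msg):                  # stands in for the real server thread
    return msg.upper()

filtered = Proxy(server, policy=lambda m: not m.startswith("deny"))
```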
==== Jonathan
In practice, I would use a cache for an L1 software probe first in order to avoid all marginal TLB and cache misses in the usual case, but this would result in high variance of IPC times.
Which means you add a new policy to the kernel--something we strictly try to avoid with L4.
- Volkmar