-----Original Message-----
From: Rudy Koot [mailto:rudykoot@hotmail.com]
Sent: Wednesday, December 31, 2003 11:39 AM
The problem is that they only look at the direct costs of cycles spent entering the kernel, doing IPC and exiting the kernel. They also look at the indirect costs of TLB and cache misses caused by IPC, but strangely enough won't look at the cost of checking access rights (probably because access checks are no longer part of the microkernel, but of operating system policy).
That is exactly the point. If you add that feature to the kernel you pay on _every_ system, even if it does not need any security mechanisms at all (or only very rudimentary ones). Hence, moving it to user land eliminates it from the critical path and makes other scenarios faster (without, from my perspective, massively hurting systems which need and want such security models).
If you want to convince them, you have to make sure they either also count those costs OR make sure they believe the added IPC costs of capabilities are indeed negligible. As far as I could reason, these added costs (compared to the thread-id method) would be:
- One extra register spilled on the receiver side (to store
the server defined word)
- One extra memory access (to convert the capability into the
server thread id) [VTO]
- One extra memory access (to load the server defined word)
- One or more extra memory accesses (to locate the server
defined word and server thread id in the Thread Object Space) [VTO]
And you forgot all the TLB entries you need.
When moving it to user land you are able to optimize the lookups by:
- clever choice of identifiers (your identifier space is unlimited and can be as small as zero bits)
- efficient and combined lookup strategies (e.g. a file descriptor can contain the filepos and security identifiers)
- combined calls, i.e. accessing multiple identifiers at once (write to n files at a time, multicast)
- local data structures--or do you want to share your cap-tables on NUMA systems between processors? How do you plan to memory-manage those?
- Volkmar
On Wed, 2003-12-31 at 13:19, Volkmar Uhlig wrote:
-----Original Message-----
From: Rudy Koot [mailto:rudykoot@hotmail.com]
Sent: Wednesday, December 31, 2003 11:39 AM
The problem is that they only look at the direct costs of cycles spent entering the kernel, doing IPC and exiting the kernel. They also look at the indirect costs of TLB and cache misses caused by IPC, but strangely enough won't look at the cost of checking access rights (probably because access checks are no longer part of the microkernel, but of operating system policy).
That is exactly the point. If you add that feature to the kernel you pay on _every_ system, even if it does not need any security mechanisms at all (or only very rudimentary ones). Hence, moving it to user land eliminates it from the critical path and makes other scenarios faster (without, from my perspective, massively hurting systems which need and want such security models).
Excuse me, but this is utter nonsense. You are making a quantitative argument using qualitative and unsubstantiated arguments. You (and I) need numbers in order to evaluate this issue.
The correct way to approach this is to ask:
1. What percentage of invocations dynamically incur the cost of checking?
2. What is the respective cost of this checking at user and at supervisor level?
3. Given that the supervisor-mode implementation *is* incurred by all cases, what is the weighted cost of the mechanism in both cases (user and supervisor)?
Once you have this calculation, you have a correct engineering basis for making a decision.
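For concreteness, the weighted-cost comparison described above can be sketched in a few lines of C. All numbers fed into it are hypothetical placeholders; only measurement can supply real ones, which is exactly the point of the exercise.

```c
#include <assert.h>

/* Weighted per-invocation cost of access checking.
 * A user-level check is paid only by the fraction p of invocations
 * that dynamically need checking; a supervisor-level check is
 * incurred by every invocation.                                     */

static double user_weighted(double p, double user_check_cycles)
{
    return p * user_check_cycles;      /* paid only when checked     */
}

static double kernel_weighted(double kernel_check_cycles)
{
    return kernel_check_cycles;        /* paid on every invocation   */
}
```

Which side wins is entirely a function of the measured inputs: with few checked invocations the user-level scheme is cheaper, with many it is not.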
You also need to stop over-estimating the kernel cost. :-)
- One extra register spilled on the receiver side (to store
the server defined word)
- One extra memory access (to convert the capability into the
server thread id) [VTO]
- One extra memory access (to load the server defined word)
- One or more extra memory accesses (to locate the server
defined word and server thread id in the Thread Object Space) [VTO]
And you forgot all the TLB entries you need.
As far as I can tell, this TLB cost is entirely driven by thread spaces, and has nothing to do with the presence or absence of an extra word. Please do not attribute to my proposal costs that are necessary in order to provide a correct, secure implementation.
The current L4 implementation is fast because it is insecure. Speed does not justify incorrectness.
There are many possible resolutions to this problem. Thread spaces is one. Capability registers are another. These alternatives have distinct performance profiles, and I am not advocating one over any other. None of these have anything at all to do with the extra word, which has *no* TLB cost and at most one marginal data cache miss.
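To make concrete why the extra word has no TLB cost and at most one marginal data-cache miss, here is a sketch of one possible layout (hypothetical; not the actual L4 implementation): the server-defined word sits in the same capability-table entry as the thread id that the translation must load anyway, so both loads touch the same cache line.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical capability-table entry: the thread id the kernel must
 * look up regardless, packed next to the server-defined word so both
 * fit in one 16-byte entry (one cache line touch).                  */
struct cap_entry {
    uint64_t thread_id;    /* translation target (needed anyway)     */
    uint64_t server_word;  /* the disputed extra word                */
};

/* Translate a capability index into (thread_id, server_word).
 * The thread_id load is required in any case; the server_word load
 * lands on the same entry, hence at most one marginal cache miss.   */
static int cap_translate(const struct cap_entry *table, size_t n,
                         size_t idx, uint64_t *tid, uint64_t *word)
{
    if (idx >= n)
        return -1;                     /* invalid capability         */
    *tid  = table[idx].thread_id;
    *word = table[idx].server_word;
    return 0;
}
```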
When moving it to user land you are able to optimize the lookups by:
- clever choice of identifiers (your identifier space is unlimited and
can be as small as zero bits)
- efficient and combined lookup strategies (e.g. a file descriptor can
contain the filepos and security identifiers)
- combined calls, i.e. accessing multiple identifiers at once (write to
n files at a time, multicast)
- local data structures--or do you want to share your cap-tables on NUMA
systems between processors? How do you plan to memory-manage those?
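The combined-lookup strategy quoted above can be sketched as a user-level descriptor table (all field names hypothetical) in which one indexed load yields the object reference, the file position, and the security identifier together, instead of three separate lookups.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical file-descriptor entry combining what would otherwise
 * be separate lookups: object reference, seek offset, security id.  */
struct fd_entry {
    uint64_t object_id;  /* server-side object this fd names         */
    uint64_t file_pos;   /* current seek offset                      */
    uint32_t sec_id;     /* security identifier for access checks    */
    uint32_t rights;     /* permitted operations, e.g. R/W bits      */
};

/* One indexed load resolves the descriptor; the rights check, the
 * position update, and the object naming all use the same entry.    */
static int fd_check_and_advance(struct fd_entry *tbl, size_t n,
                                size_t fd, uint32_t need, uint64_t len,
                                uint64_t *obj, uint64_t *pos)
{
    if (fd >= n || (tbl[fd].rights & need) != need)
        return -1;                     /* bad fd or missing rights   */
    *obj = tbl[fd].object_id;
    *pos = tbl[fd].file_pos;
    tbl[fd].file_pos += len;           /* combined position update   */
    return 0;
}
```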
Please consider that "clever" is a four-letter word (a curse). All of these optimizations are, in my opinion, justifiable reasons to fire a programmer with a strongly negative recommendation. The resulting systems are unmaintainable.
Speed is not the primary goal. Efficient robustness (measured end to end) is the primary goal.
shap
l4-hackers@os.inf.tu-dresden.de