At Mon, 17 Oct 2005 12:55:09 +0200, Marcus Völp <voelp@os.inf.tu-dresden.de> wrote:
As I said, the fact that we did not include a cmp operation yet is simply because we were not sure we would need it. Once you convince us, cmp is in, provided it opens no further security leaks.
Unfortunately by now I am almost convinced that even introducing cmp() is insufficient, mostly because of performance and code complexity issues. So I may not be the best person to argue in favor of cmp().
BTW, the protocol you describe doesn't even work, as in L4.sec there is no way for "C" to name "S" in its request to "D". Introducing such names, which would have to be authenticated, is impossible if C and S have a certain asymmetric trust relationship. However, there _are_ protocols which work, and I have worked them out (you can ask me for more details). They all require RPCs in which capabilities are passed as arguments and can be identified by the receiver, for example using a cmp operation.
It would be great if you could provide us these protocols. Actually that is what we are currently searching for to determine whether the proposed architecture will work out.
You can start with my presentation at the LSM in Dijon. I make no apologies for its deficiencies: I am sure there is a lot in there that can embarrass me, so please apply good will liberally :)
You can find slides and an audio recording here:
http://medias.2005.libresoftwaremeeting.org/topics/os/
Porting the GNU/Hurd to the L4 microkernel, by Marcus Brinkmann Abstract [en] - Abstract [fr] - Slides [pdf] - Audio recording [ogg]
I am making a lot of assumptions in this work, many of which may not apply to yours. I am happy to further elaborate on it. This is probably best done in answering specific questions.
Also, this proposal does not address the general case where an operation must be performed on more than one capability, like server side copy operations.
I know this proposal is not complete and we can find various scenarios where it seems not to work. With regard to server side copy we are not quite sure yet whether we need such an operation. But this again assumes some system structure which performs the copy in a server that can safely invoke capabilities.
Certainly we are making a lot of assumptions. After all, we only want to implement one system, and not ten. So we make assumptions that seem appropriate to us for the type of system we want to build. I am happy to explain, and to some extent defend, the assumptions we make, should you find them suspicious.
Note that the copy operation was only an example. We use such an operation in our physical memory server design that implements trusted buffer objects, so we can make logical copies of page frames. I really can't imagine any other way to do this securely but by invoking one operation on two objects implemented by the same server.
The talk above contains another example where an operation needs to be invoked on two objects: The authentication server in our system implements ACL-based authentication. The client provides an identifying capability to the server, and the server makes a call to its trusted authentication server handle, providing the client's capability and a capability that should be returned to the client (again, an operation that needs to be invoked on two capabilities at the same time). Note that the server can pose as the client to another server, but this is not a problem as the object is not directly returned, but given to the trusted auth server, where (only) the client can retrieve it.
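A toy model of that deposit-and-retrieve step may make the two-capability operation easier to picture. Everything here is an illustrative assumption rather than the actual Hurd or L4.sec interface: the class names, the method names deposit/retrieve, and in particular the may_retrieve flag, which is just one hypothetical way of expressing that only the client can collect the deposited capability.

```python
class IdentityCap:
    """Toy capability naming one identity at the auth server (hypothetical)."""

    def __init__(self, issuer, badge, may_retrieve):
        self.issuer = issuer
        self.badge = badge
        self.may_retrieve = may_retrieve

    def for_presentation(self):
        # Assumption: the client hands servers a weaker copy that identifies
        # it but cannot be used to collect deposited capabilities.
        return IdentityCap(self.issuer, self.badge, may_retrieve=False)


class AuthServer:
    """Toy trusted authentication server (hypothetical interface)."""

    def __init__(self):
        self._next_badge = 0
        self._deposits = {}  # identity badge -> capability waiting for client

    def new_identity(self):
        self._next_badge += 1
        return IdentityCap(self, self._next_badge, may_retrieve=True)

    def _check(self, cap):
        if cap.issuer is not self:
            raise PermissionError("identity not issued by this auth server")

    def deposit(self, client_identity, reply_cap):
        # One operation, two capability arguments: the client's identity
        # and the capability the server wants the client to receive.
        self._check(client_identity)
        self._deposits[client_identity.badge] = reply_cap

    def retrieve(self, identity):
        self._check(identity)
        if not identity.may_retrieve:
            raise PermissionError("this copy cannot retrieve deposits")
        return self._deposits.pop(identity.badge, None)


auth = AuthServer()
client_identity = auth.new_identity()
shown_to_server = client_identity.for_presentation()

file_cap = object()                      # stands in for the returned object
auth.deposit(shown_to_server, file_cap)  # done by the server
```

In this model the server can pass shown_to_server on to third parties, but retrieve() only succeeds with the client's own copy. Python attributes are of course not protected; the kernel would have to enforce what the model merely states.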
ad 2: Alternatively the server can prepare to defend against misbehavior of D. In L4.Sec the receiver of an IPC controls the location where an incoming message is placed. Thus it can select an area of its address space so that even if D replies with bogus content, S is not harmed. There remains, however, the problem of blocking S. An easy way to defend against blocking attacks is to fork off a thread for this particular client's request and let it invoke the potentially untrusted capability on behalf of the client. Other threads invoking the same server are not affected by this blocking.
This proposal achieves nothing. First of all, there is no benchmark that tells you when you have waited long enough and the capability should be rejected, for example because the destination blocked too long. But even more seriously, there is no benchmark to decide when the capability is good and its implementation can be trusted. Thus, this proposal does not achieve the original goal at all.
It's nice that I can create a new thread to restrict the damage caused by a blocking RPC partner. But it has nothing to do with what you want to achieve in Jonathan's example.
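The thread-per-request defence can be sketched roughly as follows. This is a minimal illustration using ordinary Python threads, not L4.Sec IPC; the Server class and the two stand-in capabilities are hypothetical. It shows only that a blocking invocation is confined to its own worker; it makes no decision about whether the capability is trustworthy.

```python
import threading


class Server:
    """Toy server that invokes each client's capability on its own thread."""

    def __init__(self):
        self.results = {}
        self._lock = threading.Lock()

    def handle(self, client, capability):
        def work():
            value = capability()  # may block forever if D misbehaves
            with self._lock:
                self.results[client] = value

        # Daemon thread, so a stuck worker does not keep the process alive.
        worker = threading.Thread(target=work, daemon=True)
        worker.start()
        return worker


never_set = threading.Event()


def blocking_capability():
    never_set.wait()  # models a destination that never replies
    return "unreachable"


def honest_capability():
    return "ok"


server = Server()
stuck = server.handle("client-1", blocking_capability)
fine = server.handle("client-2", honest_capability)
fine.join(timeout=5)
```

The second request completes even though the first worker is still blocked.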
I don't agree here. The proposal is to be able to decide on the trust of a capability after invoking it and avoiding the damage when it turned out not to be trusted. This has nothing to do with blocking time but more with a server structure that acts for and on behalf of a client.
Jonathan's example (I assume you refer to the example where the server has to decide whether or not to leak secure data) works by requesting the not yet trusted target of the capability to authenticate itself (e.g., via the kernel protected badge sent with the reply IPC) and to leak information only after knowing that the server is trusted.
I am not sure this is about leaking secure data, and I can't really follow your description of what the server does. Maybe this is just because I don't understand the terms you used in the way you use them. Maybe it helps if I say in my words what I think Jonathan's example achieves and how it does it.
The "sender" sends a capability to the "receiver". Now the objective for the receiver is to establish if it can "trust" this capability. Here this means: The receiver wants to be able to trust the _implementation_ of the capability. I.e., it wants to make sure that this capability is really implemented by the exact trusted system service it wants it to be implemented by.
For this purpose, the receiver can invoke a capability to the trusted system server which it already holds (for example, which it was given at process creation time), passing the "unknown" capability as an argument. _If_ the trusted system server has created the capability, it can now identify it via a kernel protected badge, and send a positive or negative reply in the response message.
Note that _not_ the "not yet trusted target" is asked to authenticate itself. Rather, an already trusted target is asked to identify the not yet trusted target by means of the identify operation, which is invoked on the server side.
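As a rough illustration of this identify step, here is a toy model in which a capability is an object carrying a reference to its implementing server and a badge. The names Capability, TrustedServer, identify and receiver_accepts are hypothetical, and the badge is only an ordinary attribute standing in for what the kernel would protect.

```python
import secrets


class Capability:
    """Toy stand-in for a kernel-protected capability (hypothetical model)."""

    def __init__(self, server, badge):
        self._server = server  # the server implementing the object
        self._badge = badge    # stands in for the kernel-protected badge


class TrustedServer:
    def __init__(self):
        self._objects = {}  # badge -> object description

    def create(self, description):
        badge = secrets.token_hex(8)
        self._objects[badge] = description
        return Capability(self, badge)

    def identify(self, cap):
        # Runs on the server side: positive only for capabilities that this
        # server created itself.
        return cap._server is self and cap._badge in self._objects


def receiver_accepts(trusted_handle, unknown_cap):
    # The receiver asks the server it already trusts; it never asks the
    # unknown capability's target to vouch for itself.
    return trusted_handle.identify(unknown_cap)


trusted = TrustedServer()
impostor = TrustedServer()  # a different, untrusted implementation

good = trusted.create("buffer")
fake = impostor.create("buffer")
```

Here receiver_accepts(trusted, good) is positive and receiver_accepts(trusted, fake) is negative, although both capabilities claim to name a "buffer".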
Such an identify operation is also mentioned as an example in my talk. In fact, it is the basis for the reference monitor as well as the authentication server.
Note however that in my talk I often shortcut the process: The following two are analogous:
1. You first identify the untrusted capability as a trusted capability using a known trusted capability, then you perform an operation on it.
This is how it is done in EROS.
2. You combine both operations into a single one that is invoked on the known trusted capability and takes the untrusted capability as an argument.
This is how it is done in the Hurd (so far).
Here maybe you will find something reasonable: In the Hurd design we did, we never actually "use" the previously untrusted capability directly. We only use it as an authentication handle. This is merely a syntactical difference, though. The design pattern is exactly the same.
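The two variants can be put side by side in a small toy model. All names are hypothetical (Cap, Server, variant_1, variant_2), and neither EROS nor the Hurd exposes this interface; the point is only that both variants accept and reject the same capabilities.

```python
class Cap:
    """Toy capability: implementing server plus a badge (hypothetical)."""

    def __init__(self, issuer, badge):
        self.issuer = issuer
        self.badge = badge


class Server:
    def __init__(self):
        self._objects = {}  # badge -> stored value

    def create(self, value):
        badge = len(self._objects) + 1
        self._objects[badge] = value
        return Cap(self, badge)

    def identify(self, cap):
        return cap.issuer is self and cap.badge in self._objects

    def read(self, badge):
        # Reached when a client invokes one of this server's capabilities.
        return self._objects[badge]

    def read_identified(self, cap):
        # Combined operation: identify the argument, then act on it.
        if not self.identify(cap):
            raise PermissionError("not one of my capabilities")
        return self._objects[cap.badge]


def invoke_read(cap):
    # Models invoking the capability directly: the message goes to whatever
    # server implements it.
    return cap.issuer.read(cap.badge)


def variant_1(trusted, unknown):
    # 1. Identify first, then invoke the now-trusted capability itself.
    if not trusted.identify(unknown):
        raise PermissionError("capability not recognised")
    return invoke_read(unknown)


def variant_2(trusted, unknown):
    # 2. Single call on the trusted capability, unknown one as an argument.
    return trusted.read_identified(unknown)


trusted = Server()
rogue = Server()
good = trusted.create("page frame 7")
bad = rogue.create("forged")
```

In variant 2 the unknown capability is never invoked; it only serves as an authentication handle.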
- Is the design pattern actually desirable?
I don't know yet. We did not yet design / build a system of a scale which would allow us to answer this question. We are right now looking into such a system and we would welcome any contributions and a detailed understanding of your envisaged system.
Fair enough. I developed the protocols from scratch (in cooperation with Neal Walfield), based on what we wanted to do in the Hurd. On this basis alone I would have little confidence that we are on the right track. But in hindsight I found similar design patterns in Shapiro's work (EROS), which has a much better analytical foundation than our work. I submit the emergence of the same design pattern in two independent systems as one piece of evidence for its usefulness :)
- If yes, can it be _efficiently_ supported by the L4 architecture with a reasonable effort?
Since the answer to the first question is a "don't know", we should not talk about efficiency yet.
From that perspective, I agree.
If a system structure cannot efficiently be supported by L4, we need to understand why, and if it turns out to be the kernel which causes the inefficiency, we will change it. Do you have any numbers for your architecture where you find L4 to be the bottleneck (assuming the functions you need were added)?
No, because we halted the implementation for a reevaluation of the design decisions we made. The major reason we believe that performance will be inappropriate is at this point not primarily the cmp operation, but the lack of a copy operation for mappings. If you look at my protocols, this imposes additional IPCs and system calls in the RPC path for every capability that should be copied from one process to another. As capability copy is expected to be ubiquitous, this is a discouraging result.
If I assume that a copy operation is implemented, and a cmp operation is available, then that might be an opportunity to have another look. However, from a purely aesthetical design-intuition point of view, cmp() looks like a band-aid to me. There are two concerns and then a bit: To actually use cmp() efficiently, the client needs to supply a guess as to which object it has supplied the capability for. Otherwise the server can't find it in O(1). Not a big deal, but it's cumbersome, increases the message size, and makes multiplexing more complex. Also, Espen argued convincingly that you really want to have the translation happen on the IPC path, and not after receiving the message, to reduce system call overhead. Because of this I found the ID objects Espen proposed a more convincing approach. However, the devil was in the detail, and there were a couple of important optimizations and small feature extensions required to make it all fit together (for example, for efficient "return capabilities" we would want a new map item which would auto-destruct on the next mapping, things like that).
To accept any of this, however, you need to come a certain way along with us, accepting some of our assumptions, at least preliminarily. In return, I would eagerly listen to alternative proposals that don't give up on the general goal (a protected capability system that allows one to implement secure operating systems).
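The point about the client-supplied guess can be illustrated with a toy lookup. The cmp() function below is only a stand-in for the proposed kernel operation, and Server, export and the two lookup methods are hypothetical names; the sketch assumes the server keeps its own capabilities in a slot table and hands the slot index to the client as the hint.

```python
class Cap:
    """Toy capability that simply refers to a target object (hypothetical)."""

    def __init__(self, target):
        self.target = target


def cmp(a, b):
    # Stand-in for the proposed kernel cmp(): do both capabilities name the
    # same object?
    return a.target is b.target


class Server:
    def __init__(self):
        self.table = []  # slot index -> the server's own capability

    def export(self, obj):
        self.table.append(Cap(obj))
        hint = len(self.table) - 1
        return hint, Cap(obj)  # the client keeps both

    def lookup_with_hint(self, cap, hint):
        # A single cmp() call: O(1), but the hint travels in every message.
        if 0 <= hint < len(self.table) and cmp(cap, self.table[hint]):
            return hint
        return None

    def lookup_without_hint(self, cap):
        # Up to len(self.table) cmp() calls: O(n).
        for slot, own in enumerate(self.table):
            if cmp(cap, own):
                return slot
        return None


server = Server()
hint_a, cap_a = server.export(object())
hint_b, cap_b = server.export(object())
```

A wrong or forged hint makes lookup_with_hint() fail rather than resolve to another object, since the capability itself is still compared.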
It's been some months since I last looked at all this, but it's all slowly coming back to me as if it was yesterday. Now I'll shut up and wait for your questions. :)
Thanks, Marcus