Hi,
Last week when Neal was visiting Dresden, he explained how rights amplification (i.e. passing capabilities as arguments to capability object invocation) doesn't work with the current L4 capability framework. The specific example he gave was that of a memory server which provides so-called memory containers and the ability to copy memory between memory containers using something like:
container_copy (A, a_start, B, b_start, length)
where A and B are capabilities/end points presumably to the memory containers.
To perform a container_copy, the client would send an IPC to the capability/end point A and map B to the server. The server is told the id of the receive end of the end point A and thus can easily find the server object associated with A; however, it has no way to derive the receive id of B (even though it serves B) and thus no way to find the associated server object.
The following discussion presents the problem more generally, shows that the above is a special case of a larger problem and presents a solution which minimally modifies the current L4 capability proposal. The main purpose of the suggested solution is to further illustrate the semantics that we consider necessary, and to start a discussion.
Rights Amplification and future L4 designs (Marcus Brinkmann, Neal Walfield)
============================================================================
Not much can be said about a capability system without first determining how capabilities are represented. The current proposals assume that server objects will be represented by communication end points (more precisely, by the receive end). The send sides of the end points then assume the role of the capability. This is not the only solution. Other primary (kernel) objects following the map/grant/unmap semantics could be used to represent objects and capabilities.
Within this framework, the only directly supported operations on such capabilities are map, grant, unmap and IPC. The map operation can be used to delegate the capability, grant can be used to move the capability and unmap to revoke it. The IPC operation allows clients to send messages to an object, for example, to invoke operations on the object.
The server can identify the object only when it receives a message on it: on message receipt, the receive side of the end point is presented to the receiving thread. There exists (by assumption) a one-to-one relationship in the server between receive sides of end points and server objects. Thus, given the receive end point, the server can determine to which object any given message was sent.
(Note that if you do not use communication end points as object identifiers via a 1:1 relationship, determining to which object a message should go is an open question whose answer depends on how objects identifiers are represented in the first place.)
The problem for us is that identifying the object only in the server and only when a message is sent to the object is not sufficient to implement the system we have in mind. In general, we need to be able to know if two capabilities (that is, mappings) are, from a certain perspective, "identical". Identical in this context means that one of the mappings is an ancestor of the other, i.e. that both are associated with the same communication end point. For our purposes, we don't need to compare _any_ two mappings to the same communication end point; we only need to compare two mappings in the case that one is an ancestor of the other. The following examples will make this clearer.
Usage scenario 1: Operations on multiple objects.
Consider a server providing multiple objects of the same or different types, and operations which require access to more than one of the objects at the same time, i.e. rights amplification. The classic example is that of the can and the can opener: an operation such as "open_can (can_opener_t opener, can_t can);" or equivalently "open_can (can_t can, can_opener_t opener);" can only be invoked on one of the two objects but requires capabilities to both. The second object is passed as an argument in the RPC.
The server receives the message on one of the two objects which it can readily identify but is now in the peculiar situation of needing to identify the other object. Within the proposed framework, the only low-level mechanism that seems feasible at all here is that the caller _maps_ the second object to the server so as to prove that it has the capability in question. This proof is without meaning if the server cannot establish that this mapped capability is indeed associated with an object it provides; the server must be able to lookup the object based on the mapping from the caller.
Usage scenario 2: Reference counting.
Modern microkernel based operating systems usually ask for explicit creation and destruction of resources, and rightly so. But at least for legacy support it must be possible to implement reference counting on top of them. Because references must be deallocated automatically on task death, the support for this must be integrated---at least to some extent---into the trusted computing base.
Within the initial assumptions, a simple design for a reference counting server readily suggests itself: a reference counting server positions itself---mapping-wise---between a server and its clients. The servers map capabilities to the reference counter which in turn maps them to clients doing the necessary accounting along the way.
There are a number of protocols which can be used to copy references from one task to another, etc. The following situation arises when copying a reference from a client, A, to a second client, B:
A server, S, has mapped a capability (to a server object) to the reference counter C which has in turn mapped the capability to A. A then maps the capability to B thereby delegating it. B now wants to acquire its own reference and mapping from the reference counter in order to become independent of A.
Situation: S -> C -> (1 reference) A -> B
Goal:       /-> (1 reference) A
      S -> C -> (1 reference) B
So, B makes an RPC to C:
insert_reference (reference_space_t space, cap_t cap, cap_t *new_cap);
(The reference space is an object provided by the reference counter and is a name space for all the capabilities to which B has references.) What is important here is that this operation takes a capability as an argument. In this way, B proves that it has access to the capability it got from A. The reference counter will reply with a new mapping that B can use which is independent of the mapping that it got from A; if B then unmaps the mapping it got from A, the situation is symmetrical and the goal accomplished.
Shifting perspective from B to the reference counter, the reference counter is in a peculiar position: it is _not_ the server providing the capability CAP. Also, it didn't give a mapping for the capability to client B previously. So it is in a totally different position than the can opener server above: the reference counter sees only send sides, not receive sides, of the communication end points for which it manages references and furthermore it receives capabilities as arguments from tasks to which it didn't previously give mappings.
However, there is still some structure left: the reference counter is itself a mapper of the capability CAP: it has a mapping of this capability which it got directly from the server. In addition, the mapping it received from B is derived from this very mapping, via A and B. This is what we want to exploit: we want a way for the reference counter to find the mapping it got from the server for this object, based on the mapping it got from B for the same object.
The solution:
Finding a solution to usage scenario 1 is not too difficult. One approach is protected payloads (a data word chosen by the server) which can be associated with the receive side of a communication end point and which can be read out based on the send side (the mapping, aka capability) if you are sufficiently privileged (i.e., if the receive side resides in the same address space). The protected payload can be set to an internal OID.
This solution is inadequate for the second usage scenario: the thread doing the lookup (in the reference counter) did not create the communication end point and thus had no say in the protected payload. Moreover, it shouldn't be able to read it out even if there were one. So the protected payload idea doesn't seem sufficient.
There seems to be a tangible solution though: if there were a system call which allowed the caller to check whether a mapping was derived from another mapping in the same address space, we could use it to "unroll" mapping loops like the one in the first scenario, i.e. Server -> Client -> Server, or in the second, i.e. Server -> Reference Counter -> Client A -> Client B -> Reference Counter. More specifically:
Each mapping is identified in the mapper's address space via a virtual address (or index, or slot number, ...). Then there could be a kernel system call
vaddr_t map_lookup (vaddr_t addr);
which walks up the mapping tree, starting at the mapping entry for ADDR and going to the parent of the current node in each iteration until a mapping in the same address space is encountered. The vaddr of this mapping is returned.
Solution in usage scenario 1:
Say the send side of the communication end point created in the server for the object is located at vaddr 10. The server records this information in some sort of table. Then the capability is mapped to the client. The client maps it back into the server as part of an RPC. This mapping from the client is installed in the server at vaddr 67. Now the server does:
map_lookup (67) -> 10
The map_lookup in the kernel starts from the mapping the server received from the client. It then goes to the parent of that mapping, which is the mapping the client has from the server. The client's address space differs from the server's, so the kernel doesn't stop there and continues to the next parent. The next parent is the mapping the server has from the kernel (the communication end point itself). The address spaces match, so the vaddr of this mapping, 10, is returned.
The server can then look up the object data for the object associated with the vaddr 10.
Solution in usage scenario 2:
The server creates a communication end point and maps it to the reference counter, which receives it at, say, vaddr 23. The reference counter writes in a table that vaddr 23 corresponds to this server's capability. Now A gets a mapping and delegates it to B, which in turn maps it to the reference counter, which receives it at, for example, vaddr 127. The reference counter does:
map_lookup (127) -> 23
The map_lookup operation in the kernel starts with the mapping the reference counter got from client B. It then walks up the tree, first finding the mapping client B has from A, then the mapping client A has from the reference counter, and eventually the mapping the reference counter has from the server. The reference counter's address space matches, and thus the vaddr of this mapping from the server is returned to the reference counter.
The reference counter can then record a reference and create a mapping from the mapping at vaddr 23 to the client B, thus creating a symmetrical situation between clients B and A.
Conclusion
----------
The operation map_lookup achieves our goal of being able to implement the desired functionality. Note that it does not require the kernel to search the whole mapping tree: the tree is only walked from the current node up to the root; it is never walked down or sideways. This means that, for example, if the reference counter has multiple mappings from the server for the same communication end point, it will see them all as distinct (at least it cannot establish their identity). It also means that if B gets a direct mapping from the server without going through the reference counter, then the reference counter will not be able to identify this capability as belonging to the same communication end point as the mapping it already has.
All of these limitations are perfectly acceptable and indeed beneficial. We don't want global identification; we just want to be able to detect mappings _back_ to us which somehow originated from us.
I said this should be a system call, but that is really an implementation detail. It may just as well be done via some other mechanism, for example in transit during an IPC, if requested. That might actually help performance.
It's unfortunate that the tree has to be walked. Information that is associated strictly with the receiving side ("protected payloads" above) is faster to look up and has a fixed cost, while walking a tree to the root grows with the depth of the tree. For a negative reply, the whole depth of the tree up to the root has to be walked. In common usage scenarios, however, the tree will not be very deep. I am no expert on mapping databases, so maybe playing around with the data structures can further optimize this path (for example, keeping linked lists per address space in the mapping tree would allow fast lookup of ancestors within the same address space, but my feeling is that maintaining such lists would be more expensive than doing the lookup only when it is needed).
There may be security implications in revealing mapping "identities" at all. Maybe this feature needs to be restricted for confined tasks. However, some sort of rights amplification is essential, and it is OK to ask for the cooperation of all tasks involved in determining the identity. If, for example, client A in the reference counter example can deny the map_lookup through its branch of the mapping tree, then the reference counter will simply deny its service. This is perfectly acceptable to me; the important thing is that it must not be possible to trick some other task into believing an identity that doesn't exist: hiding identities is fine.
Hi Marcus,
Usage scenario 2: Reference counting.
the main problem with reference counting is that the clients have to release the reference explicitly. Thus cooperation is needed, since L4 does not send a notification when an object, e.g. a task, is destroyed...
Situation: S -> C -> (1 reference) A -> B
Goal:       /-> (1 reference) A
      S -> C -> (1 reference) B
In your scenario both clients A and B have to cooperate with C for the release notification. Since both cooperate anyway, client A could simply ask C to map a new reference to B, which makes the map_lookup unnecessary...
Bernhard
At Fri, 10 Jun 2005 15:15:09 +0200, Bernhard Kauer wrote:
Usage scenario 2: Reference counting.
the main problem with reference counting is that the clients have to release the reference explicitly. Thus cooperation is needed, since L4 does not send a notification when an object, e.g. a task, is destroyed...
Clients can voluntarily release a reference; however, they are not required to do so. The task server, which is part of the TCB, knows when every task terminates. It can provide this information to the reference monitor.
Situation: S -> C -> (1 reference) A -> B
Goal:       /-> (1 reference) A
      S -> C -> (1 reference) B
In your scenario both clients A and B have to cooperate with C
C needn't trust either A or B.
Thanks, Neal
On Fri, Jun 10, 2005 at 02:23:50PM +0100, Neal H. Walfield wrote:
Usage scenario 2: Reference counting.
the main problem with reference counting is that the clients have to release the reference explicitly. Thus cooperation is needed, since L4 does not send a notification when an object, e.g. a task, is destroyed...
Clients can voluntarily release a reference; however, they are not required to do so. The task server, which is part of the TCB, knows when every task terminates. It can provide this information to the reference monitor.
There is a grant problem. If a client X grants an object to Y and X dies, this does not mean that the reference to the object is released...
Situation: S -> C -> (1 reference) A -> B
Goal:       /-> (1 reference) A
      S -> C -> (1 reference) B
In your scenario both clients A and B have to cooperate with C
C needn't trust either A or B.
If client A asks the server C to map something A already has from C to a client B, only the clients have to trust C to provide this service. The server C needn't trust its clients for this operation...
Bernhard
At Fri, 10 Jun 2005 15:38:27 +0200, Bernhard Kauer wrote:
There is a grant problem. If a client X grants an object to Y and X dies, this does not mean that the reference to the object is released...
Of course it does: X died, and as a result the reference monitor gets a task death notification. If Y required the object beyond X's death, it should have gotten its own reference, but that is a different problem.
Situation: S -> C -> (1 reference) A -> B
Goal:       /-> (1 reference) A
      S -> C -> (1 reference) B
In your scenario both clients A and B have to cooperate with C
C needn't trust either A or B.
If client A asks the server C to map something A already has from C to a client B, only the clients have to trust C to provide this service. The server C needn't trust its clients for this operation...
Right, that's the point. C is part of A and B's TCB; C does not trust either A or B.
Hi,
some additional notes from me.
Within this framework, the only directly supported operations on such capabilities are map, grant, unmap and IPC. The map operation can be used to delegate the capability, grant can be used to move the capability and unmap to revoke it. The IPC operation allows clients to send messages to an object, for example, to invoke operations on the object.
I think here lies a difference in our views of capabilities: the IPC operation allows one to send a message through an endpoint to a server. The server could somehow identify the sender of a message.
Using this sender identification as the object reference on which the server invokes an operation is possible only if a single object reference is needed. As Neal mentioned last week, this does not work for multiple references.
By using the sender id as a reference for a user, this problem goes away.
To demonstrate the difference between these two approaches, look at a simple file server. If only read/write operations are needed, the sender id could be the file number. But with an operation which needs multiple files, like copying between two files, the sender id can only identify the user, not the file anymore.
Bernhard
At Fri, 10 Jun 2005 16:55:05 +0200, Bernhard Kauer wrote:
Within this framework, the only directly supported operations on such capabilities are map, grant, unmap and IPC. The map operation can be used to delegate the capability, grant can be used to move the capability and unmap to revoke it. The IPC operation allows clients to send messages to an object, for example, to invoke operations on the object.
I think here lies a difference in our views of capabilities: the IPC operation allows one to send a message through an endpoint to a server. The server could somehow identify the sender of a message.
As we understand it, the next generation L4 API will not offer first class capability objects. The assumption has been that capabilities will be built on top of end points. So, I think we agree: there is no fundamental equivalence between end points and capabilities and the question at hand is: how do we represent capabilities? I think we made this clear in the introduction to the original email:
Not much can be said about a capability system without first determining how capabilities are represented. The current proposals assume that server objects will be represented by communication end points (more precisely, by the receive end). The send sides of the end points then assume the role of the capability. This is not the only solution. Other primary (kernel) objects following the map/grant/unmap semantics could be used to represent objects and capabilities.
The server could somehow identify the sender of a message.
Are you suggesting that we use the task id of the sender as the key?
The problem with *simply* using the sender of the message as identification (i.e. no object key) is that it assumes that the sender is the principal. This need not be the case. Consider a trusted interpreter: it can serve many mutually untrusting clients from the same thread. If the server only uses the task id of the sender to find the principal, the interpreter will have to implement its own security architecture. This is why ambient authority does not make a good security architecture: this policy unnecessarily restricts the types of authority principals one can have. Moreover, exposing the sender to the receiver violates the principle of least information (i.e. only expose what is necessary and nothing more), makes copying access to an object more difficult, and requires a cooperative server.
It is possible to build a capability system using sender ids: the server becomes a capability server and provides object IDs to clients. We (primarily Marcus) did it. The resulting system is slow. The reason? The trust issues involved make the protocols very hairy. We finally abandoned the system when Marcus realized that interposing proxies (a feature we highly desire) was far too complicated: the server would need to accept capabilities from (i.e. block on) untrusted clients.
Using this sender identification as the object reference on which the server invokes an operation is possible only if a single object reference is needed. As Neal mentioned last week, this does not work for multiple references.
I am sorry, I don't remember saying that, or maybe I just don't understand what you mean.
By using the sender id as a reference for a user, this problem goes away.
To demonstrate the difference between these two approaches, look at a simple file server. If only read/write operations are needed, the sender id could be the file number. But with an operation which needs multiple files, like copying between two files, the sender id can only identify the user, not the file anymore.
I am sorry, I don't completely understand this example. It sounds to me like you are suggesting ACLs. Is that right?
Thanks, Neal
Hi,
we had a long discussion here about a "real capability" or "user capability" system on top of L4.sec and found that we have some problems with the initial assumption:
The assumption has been that capabilities will be built on top of end points. So, I think we agree: there is no fundamental equivalence between end points and capabilities and the question at hand is: how do we represent capabilities?
Using an endpoint for every "user capability" is quite inefficient. This one-to-one mapping prevents using the advantages of L4.sec and leads to additional kernel operations like cmp()...
The server could somehow identify the sender of a message.
Are you suggesting that we use the task id of the sender as the key?
No. L4.sec has no global task ids anymore. Instead, identification can be done through a mechanism called "badge" or "sender ID". It is similar to what you call "secure payload", with the small but nice exception that it can be set by everyone, but only once [1]. The "sender ID" or "badge" of the sender is transferred to the receiver as part of an IPC.
The "badge" has therefore the following properties:
1. the creator of the endpoint or a server which is trusted by the
   creator could freely set the badge
2. the badge is a local name; a server could refer to the same endpoint
   with different badges
3. the kernel protects the integrity of the badge
(Note that if you do not use communication end points as object identifiers via a 1:1 relationship, determining to which object a message should go is an open question whose answer depends on how objects identifiers are represented in the first place.)
A "user capability" could be something like (badge, object nr).
If a server receives such a capability, it verifies whether this badge may use this object, or it can ask a trusted policy server (which distributes the badges for its _single_ endpoint) whether this operation is allowed or not.
The scenarios we look at could be solved either with badges or, like the reference counter example, through cooperation between already cooperating clients.
Bernhard
[1] In fact it is a bitstring which could be extended while mapping an endpoint to another space.
At Mon, 13 Jun 2005 18:50:49 +0200, Bernhard Kauer <kauer@os.inf.tu-dresden.de> wrote:
we had a long discussion here about a "real capability" or "user capability" system on top of L4.sec and found that we have some problems with the initial assumption:
It's sad that you didn't have the discussion on this mailing list. Just presenting us with the results of your discussion deprives us of the reasons and rationales and thought process that lead you to your result. It also made it impossible for us to give input into the discussion at the appropriate stages.
The assumption has been that capabilities will be built on top of end points. So, I think we agree: there is no fundamental equivalence between end points and capabilities and the question at hand is: how do we represent capabilities?
Using an endpoint for every "user capability" is quite inefficient.
Can you elaborate on this assertion? What makes you think so?
We already have good reasons to believe that using "object ids" managed in the server is inefficient. Now, it may be that using primary kernel objects for user capabilities is also inefficient. But the logical conclusion would be that protected capability systems in L4 can _never_ be efficient, which would be an important negative result---and a quite devastating one as such.
This one-to-one mapping prevents using the advantages of L4.sec
Can you elaborate on this? For me, it is exactly the opposite: only by using a one-to-one mapping can we draw any advantage from L4.sec. Otherwise, L4.sec is just in the way. It would offer _nothing_ over L4 X.2 that is of any particular relevance to a protected capability design.
What I have said previously is that it is impossible to implement an efficient protected capability system on top of L4 X.2. What you seem to be saying is that this will continue to be true for L4.sec.
and leads to additional kernel operations like cmp()...
This is true, but what I am saying is that a protected capability system is not feasible without _some sort_ of kernel support, i.e. primary objects for user capabilities.
The server could somehow identify the sender of a message.
Are you suggesting that we use the task id of the sender as the key?
No. L4.sec has no global task ids anymore. Instead, identification can be done through a mechanism called "badge" or "sender ID". It is similar to what you call "secure payload", with the small but nice exception that it can be set by everyone, but only once [1]. The "sender ID" or "badge" of the sender is transferred to the receiver as part of an IPC.
The "badge" has therefore the following properties:
1. the creator of the endpoint or a server which is trusted by the
   creator could freely set the badge
2. the badge is a local name; a server could refer to the same endpoint
   with different badges
3. the kernel protects the integrity of the badge
I am not sure I understand how the badges stuff works. I have read L4-X.3-Future.pdf, but it sheds little light on the details.
Is the badge visible to the mapper? If it is not, then badges offer no solution to the multiple object references problem, and are instead _just_ a way to identify the sender of a message.
If badges are visible to the mapper, then I think this feature can be used to implement a map_lookup function in user space. This would be good news indeed. But this is not my impression.
(Note that if you do not use communication end points as object identifiers via a 1:1 relationship, determining to which object a message should go is an open question whose answer depends on how objects identifiers are represented in the first place.)
A "user capability" could be something like (badge, object nr). If a server receives such a capability, it verifies whether this badge may use this object
We are going in circles. This is exactly the same model you suggested in your first reply, and exactly the model I have already tried to implement and that we rejected because of performance, wrong security properties, the impossibility of transparent interposition, code complexity, etc.
or it can ask a trusted policy server (which distributes the badges for its _single_ endpoint) whether this operation is allowed or not.
This means one additional RPC (and context switch) per RPC, which is unacceptable.
The scenarios we look at could be solved either with badges or, like the reference counter example, through cooperation between already cooperating clients.
I don't feel we have made any progress here. You just repeated what you said already, but now it is "badges" instead of "sender id". In other words, you have shown how you can determine the sender id given the badges feature. But that is not at all what my mail was about.
Please reread my first reply to you. Everything I say there is still relevant, and you haven't addressed any of my concerns that implementing a protected capability system by either
1) object ids managed in the server
2) object ids managed by a trusted system service
is too slow (2, 1) or too complex and too insecure (1).
It's no good if you keep asserting that we should use one of those two models, and I keep asserting that this is not feasible. Do you think it will help if I explain further why those two models are fundamentally flawed?
I think it is trivial to see that the second model is flawed, just based on the number of RPCs you have to do for verification. It is the "redirector" model all around.
To see that the first model is flawed is more complex, because you have a lot of design parameters. But here is one example which shows the dramatic consequences:
1. You cannot delegate, by a simple mapping, access to individual objects, because you would have to delegate the client id, and with it access to all the objects you hold at that server.
This also means that revoking access rights is difficult, unless you keep a whole mapping database in each server.
2. To copy capabilities, both clients involved must talk to the server, and to each other. That means copying capabilities requires at the very least three RPCs. And this assumes that the server already knows the receiving client. If it doesn't, a communication channel first has to be established (a "sender id" for the second client). This is a complex task, with a lot of race conditions to take into account.
3. Copying references always involves the server knowing about it. So the server always knows exactly which clients hold references. In some security models this is not acceptable: it leaks too much information to the server (Shapiro specifically pointed this out).
4. Because the receiver has to make a blocking call to the server to receive the reference, the server must be in the receiver's trusted computing base. This means that you can only accept a capability if you trust the server providing it. This, however, has a severe impact on the operating system design and makes it very hard or impossible to implement certain features in certain systems.
5. Specifically, the receiver has to check whether the server is within its trusted computing base. On L4 X.2, you can do it by checking the global thread id of the server. In L4.sec, this will require reintroducing global server IDs via a global name service, further complicating the capability copy protocol and adding a couple of RPCs to communicate with the name server.
6. Furthermore, transparent interposition is hard: if the receiver needs to check whether the server providing the capability is in its trusted computing base, but it only directly trusts a proxy of this server, it needs to make an upcall to the proxy to check whether the server that the sender of the capability is talking about belongs to the trusted computing base.
So let's say a client wants to copy a capability to another client, which does use a proxy server to the server providing the capability, and which doesn't trust the client sending the capability, so it needs to verify that the server belongs to its trusted computing base. Furthermore say that this receiving client is a new client.
Then you are talking about 3 RPCs for the cap copy protocol + at least 3 RPCs to initiate a new connection with this new client, plus 1 RPC from the receiver to the proxy server to check if the server is in the trusted computing base (one more RPC to the nameserver).
We are now talking about at least 3 RPCs for the most simple case, and 8 RPCs, that is 16 messages, to copy _one_ capability in a relatively simple example involving only _one_ proxy server. Not to talk about the code complexity, all the possible race conditions in a multi-threaded environment, etc.
I am not sure if you really realize what it entails if you say that servers should identify user capabilities via "(sender id, object id)". It's such a small thing to say, but have you actually tried to implement it in a real-world multi-server operating system? I'd like to see working examples.
Thanks, Marcus
At Mon, 13 Jun 2005 20:01:41 +0200, Marcus Brinkmann@ruhr-uni-bochum.de wrote:
If badges are visible by the mapper, then I think this feature can be used to implement a map_lookup function in user space. This would be good news indeed.
On second thought, this appears not to be true. Even if badges can be read out by the mapper (which I don't think is the case), they could always be forged. As long as we don't have any guarantee that the badge is that of a mapping which is mapped _back to us_, we can't be sure that the prefix we see is genuinely from us. A user could create any badge on a receive point he created himself.
So, badges can only ever be useful to identify receive points, not send points. Even if badges were visible to the mapper, they would be useless.
In fact, I don't see what the badges are all about anyway. But then, I am not here to criticize the badges feature. The only thing relevant to point out here is that the badges feature won't do anything for us that we didn't already take for granted (i.e., that we can somehow identify the sender of a message).
I certainly am interested to learn more about the badges feature, particularly why it was introduced at all. But it doesn't seem to be of any concern to the actual issue at hand.
Thanks, Marcus
We are turning in circles. This is exactly the same model as you suggested in your first reply. And it is exactly the model I have already tried to implement, and that we rejected because of performance, wrong security properties, impossibility of transparent interposition, code complexity, etc.
I forgot to explain why I switched back to the initial model. We have here 2 models in our discussion:
1. Using a 1:1 mapping between objects and endpoints. This requires a cmp() function.
2. Using the features of L4.sec (local names, endpoints and badges) to implement a capability system in user-level.
We all agree that the first one can be built, from a functional point of view. This does not mean that it is the best solution.
Or in other words, some disadvantages (wasted kernel memory, the need for an additional kernel operation, ...) lead to the question of whether the second model could also be built.
Perhaps we should split the discussion here and try to answer in one thread the question of the first model (e.g. why cmp() and not map_lookup(),...) and in another one the problems with the second model you mention in your last mail.
Bernhard
At Tue, 14 Jun 2005 11:10:22 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
We are turning in circles. This is exactly the same model as you suggested in your first reply. And it is exactly the model I have already tried to implement, and that we rejected because of performance, wrong security properties, impossibility of transparent interposition, code complexity, etc.
I forgot to explain why I switched back to the initial model. We have here 2 models in our discussion:
Using a 1:1 mapping between objects and endpoints. This requires a cmp() function.
Using the features of L4.sec (local names, endpoints and badges) to implement a capability system in user-level.
I think that is a fair representation of the two branches of the discussion.
We all agree that the first one can be built, from a functional point of view. This does not mean that it is the best solution.
Or in other words, some disadvantages (wasted kernel memory, the need for an additional kernel operation, ...) lead to the question of whether the second model could also be built.
It was my understanding that it is an essential part of the design roadmap for both of the two upcoming L4 designs that some kernel memory is managed in user space.
You already make the concession that users can allocate kernel memory by creating communication end points. It doesn't make much sense to me to now say that this should be restricted to a very small number of such allocations per task. In either case this is something that should be regulated by user space policy, and not by the kernel. Otherwise this indicates again a defect (of a different type) in your design.
L4 X.2 had only one way for a user to allocate kernel memory: by sending a mapping from one address space to another, and that could be restricted by using a redirector. So it was quite safe, although there was still a defect: there was no way for a memory manager to reliably know how much kernel memory was needed for a given user's mappings. So there is some argument to be made that even without L4.sec, the memory management model of the L4 kernel is underdeveloped.
In fact, I think that to become a mature product, L4 must solve the kernel memory problem in a fundamental way, because in a production system having the kernel crash because it is out of memory is a major issue.
The need for an additional kernel operation should concern you. But it is my opinion that this is in fact a clue that there is a whole area in which L4 is underdesigned, and a great opportunity for the L4 group to explore. But of course I first need to convince you that the alternative is not an option ;)
Perhaps we should split the discussion here and try to answer in one thread the question of the first model (e.g. why cmp() and not map_lookup(),...) and in another one the problems with the second model you mention in your last mail.
I would be more than happy with that.
Thanks, Marcus
We have here 2 models in our discussion:
Using a 1:1 mapping between objects and endpoints. This requires a cmp() function.
Using the features of L4.sec (local names, endpoints and badges) to implement a capability system in user-level.
I think that is a fair representation of the two branches of the discussion.
So I split the discussion here into two parts. Because we discussed the first model here in DD today, I will start with it. Perhaps I can write up some ideas on the second model in the next days...
We use the following copy()-example in our discussion:
A file server implements file objects and distributes capabilities to them. For an atomic copy of a block from one file to another, it offers a copy(src, dst) operation. Since both parameters are given by a client in one call, we have the multi-reference problem.
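The multi-reference problem can be illustrated with a toy model (all names are invented): the server keeps an object table keyed by the receive end of each endpoint. The kernel reveals the receive id of the endpoint a message arrived on, so the invoked object (src) is easy to find; but dst arrives only as a mapped send right, for which the server gets no key at all.

```c
#include <assert.h>
#include <stddef.h>

#define NOBJ 4

/* Server-side object table, keyed by the receive end of the endpoint
   that represents each file object. */
typedef struct { int receive_id; const char *name; } object_t;

static object_t table[NOBJ] = {
    { 10, "file-A" }, { 11, "file-B" }, { 12, "file-C" }, { 13, "file-D" }
};

/* Works for the invoked capability: the kernel supplies receive_id.
   For a capability passed as an argument, the server never obtains
   such a key, so this lookup cannot even be attempted. */
static object_t *lookup(int receive_id)
{
    for (int i = 0; i < NOBJ; i++)
        if (table[i].receive_id == receive_id)
            return &table[i];
    return NULL;
}
```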
Are there other examples that cause problems?
Besides the need for a cmp(), we found in the discussion that this operation needs to be limited; otherwise, transparently interposing different endpoints with a single one is not possible. Binding this "cmp() right" to the receive right of an endpoint seems feasible.
So in summary I can say: we are seriously considering whether to extend our model with a cmp() operation...
Bernhard
At Tue, 14 Jun 2005 18:22:54 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
We use the following copy()-example in our discussion:
A file server implements file objects and distributes capabilities to them. For an atomic copy of a block from one file to another, it offers a copy(src, dst) operation. Since both parameters are given by a client in one call, we have the multi-reference problem.
This example would have the same problem, but I don't think that anybody would design a filesystem that way. Just as a side note.
Are there other examples that cause problems?
How many examples do you want before the mail gets too long? I gave two examples in my original mail. The first is similar to yours above. The second was the reference counter, which is the more important one! The example you give above is just the basic one, while the reference counter shows the bigger problem.
Besides the need for a cmp(), we found in the discussion that this operation needs to be limited.
I offered a way to limit my map_lookup() functionality.
Transparently interposing different endpoints with a single one is otherwise not possible.
This just shows that reintroducing global IDs through the backdoor is ill-advised. My recursive lookup approach has the desired locality properties.
Binding this "cmp() right" to the receive right of an endpoint seems feasible.
I explained in my earlier mails why it is not sufficient. Specifically, I showed how reference counting would either not be possible or introduce a heavy cost: object management in a trusted service, which requires one additional RPC per message just to identify the objects, including the primary one on which the message was invoked.
It is important for us that reference counting does not impose further costs on the RPC path. I hope it is clear why.
So in summary I can say: we are seriously considering whether to extend our model with a cmp() operation...
I hope you don't limit the analysis to the cmp operation at this early stage.
Thanks, Marcus
The second example was the reference counter, which is the more important one! The above example you give is just the basic example, while the reference counter shows the bigger problem.
No, the answer to the reference counter problem is simple: cooperation. Just as a reminder:

Situation:  S -> C -> (1 reference) A -> B

Goal:            /-> (1 reference) A
            S -> C -> (1 reference) B
1. In the start situation, B trusts A to provide the endpoint to S, since A could unmap this endpoint at any time.
2. Therefore B can ask A for a new reference. Since A cannot provide this service itself, it asks C and attaches a return endpoint [1] for B in its message.
3. C answers B directly and maps it a new reference.
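The three steps above can be sketched as a toy model, with real IPC replaced by plain function calls and all names invented (the point is only the flow of the reference, not the mechanism):

```c
#include <assert.h>

typedef int cap_t;   /* a mapped reference, modeled as a plain value */

/* C is the only party that can mint a fresh reference. */
static cap_t C_make_reference(void) { return 42; }

/* Steps 2 and 3: A cannot serve the request itself, so it forwards
   it to C together with a reply slot for B; C's answer lands
   directly at B's return endpoint (modeled by the slot). */
static void A_forward_to_C(cap_t *b_reply_slot)
{
    *b_reply_slot = C_make_reference();
}

/* Step 1: B initiates by asking A for a new reference. */
static cap_t B_request_reference(void)
{
    cap_t slot = 0;
    A_forward_to_C(&slot);
    return slot;
}
```

Note that in this model B still blocks on A until the reference arrives, which is exactly the trust assumption disputed later in the thread.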
A transparent interpose of different endpoints with a single one is otherwise not possible.
This just shows that reintroducing global IDs through the backdoor is ill-advised.
What are the global IDs? We do not have one.
I mean here that if everybody can do a cmp() it is possible to detect, whether two "capabilities" pointing to endpoints, which are interposed e.g. by a monitor, are the same or not.
Bernhard
[1] the endpoint where C sends its reply to the RPC
At Tue, 14 Jun 2005 20:33:37 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
The second example was the reference counter, which is the more important one! The above example you give is just the basic example, while the reference counter shows the bigger problem.
No, the answer for the reference counter problem is simple: cooperation.
Just as a reminder:

Situation:  S -> C -> (1 reference) A -> B

Goal:            /-> (1 reference) A
            S -> C -> (1 reference) B

In the start situation, B trusts A to provide the endpoint to S, since A could unmap this endpoint at any time.
Therefore B can ask A for a new reference. Since A cannot provide this service itself, it asks C and attaches a return endpoint [1] for B in its message.
C answers B directly and maps it a new reference.
This protocol requires that the receiver of the capability, in this case B, makes a blocking call to the sender, in this case A. But in many cases B does not trust A enough to block indefinitely until A does the right thing. For example, in the case where a client wants to submit a capability reference to a server (let's say a name server).
So, this protocol requires too much cooperation/trust.
A transparent interpose of different endpoints with a single one is otherwise not possible.
This just shows that reintroducing global IDs through the backdoor is ill-advised.
What are the global IDs? We do not have one.
Well, if you restrict the cmp() operation to the holder of the receive right, then indeed there are no global IDs. I did not make the distinction clear, sorry. But this means that you cannot identify capabilities you don't provide (hold the receive right for), and I (still) consider this to be insufficient.
If cmp() is unrestricted, it is possible to make distinctions between capabilities on a global scale, which means you could assign IDs to capabilities that are globally meaningful, which is tantamount to having global IDs, even if no actual IDs are assigned. I think we agree this is an undesirable side effect; it has led you to the conclusion that cmp() must be restricted, thereby making it less useful.
Thanks, Marcus
Just as a reminder:

Situation:  S -> C -> (1 reference) A -> B

Goal:            /-> (1 reference) A
            S -> C -> (1 reference) B

In the start situation, B trusts A to provide the endpoint to S, since A could unmap this endpoint at any time.
Therefore B can ask A for a new reference. Since A cannot provide this service itself, it asks C and attaches a return endpoint [1] for B in its message.
C answers B directly and maps it a new reference.
This protocol requires that the receiver of the capability, in this case B, makes a blocking call to the sender, in this case A. But in many cases B does not trust A enough to block indefinitely until A does the right thing.
To get into the start situation, B has to receive a mapping from A. If this mapping was done as a result of an RPC (because B requested the mapping), B blocked on A.
If A sent this capability within a request to B, it could just as easily ask C to send that request.
For example, in the case where a client wants to submit a capability reference to a server (let's say a name server).
Assume B is a nameserver and A wants to "upload" a capability reference. Then A calls C, which sends a new reference to B.
Bernhard
At Wed, 15 Jun 2005 11:20:22 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
Just as a reminder:

Situation:  S -> C -> (1 reference) A -> B

Goal:            /-> (1 reference) A
            S -> C -> (1 reference) B

In the start situation, B trusts A to provide the endpoint to S, since A could unmap this endpoint at any time.
Therefore B can ask A for a new reference. Since A cannot provide this service itself, it asks C and attaches a return endpoint [1] for B in its message.
C answers B directly and maps it a new reference.
This protocol requires that the receiver of the capability, in this case B, makes a blocking call to the sender, in this case A. But in many cases B does not trust A enough to block indefinitely until A does the right thing.
To get into the start situation, B has to receive a mapping from A. If this mapping was done as a result of an RPC (because B requested the mapping), B blocked on A.
Mmmh. Maybe I should have said a few words about what the Hurd is. The Hurd is a multi-server operating system which tries to limit the amount of system code that is enforced on users. Thus, many services, even system services, are entirely optional (but if you don't use them, certain other features of course become unavailable). More importantly, users can start their own services, extending or overriding the system services. For example, users can start their own filesystems and link them into the filesystem hierarchy.
The Hurd is designed to not require mutual trust in many instances of communication and protocol.
To get back to your protocol: no. B does not need to block on A to receive a message; A needs to block on B to send a message. B just needs to eventually receive from A. If it is a server, it might currently be busy and only eventually enter a (relatively open) receive.
If A sent this capability within a request to B, it could just as easily ask C to send that request.
It could ask, but C would have to decline, because C is a system service that does not trust A at all, and it definitely won't make blocking calls on arbitrary capabilities provided by users. The part of your protocol where C sends a reply message to B was OK: C can send replies, but it won't make requests.
For example, in the case where a client wants to submit a capability reference to a server (let's say a name server).
Assume B is a nameserver and A wants to "upload" a capability reference. Then A calls C, which sends a new reference to B.
See above. The nameserver is not necessarily trusted by C. If C wants to send the capability to B (which it must, to get the mapping hierarchy right), then it must be as a reply message. Thus B must send the request to C, as C is the trusted system server, not A (which B doesn't trust that much). The required protocol is pretty much enforced from start to end, given the boundary conditions (trust parameters).
Thanks, Marcus
The nameserver is not necessarily trusted by C. If C wants to send the capability to B (which it must, to get the mapping hierarchy right), then it must be as a reply message. Thus B must send the request to C, as C is the trusted system server, not A (which B doesn't trust that much). The required protocol is pretty much enforced from start to end, given the boundary conditions (trust parameters).
There are two ways B could get its capability: either B asks C to give it one, or A asks C to send one (of the capabilities it has from C) to B.
Both operations work out of the box. A and B do not trust each other, and C does not need a cmp().
Bernhard
At Wed, 15 Jun 2005 14:35:06 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
The nameserver is not necessarily trusted by C. If C wants to send the capability to B (which it must, to get the mapping hierarchy right), then it must be as a reply message. Thus B must send the request to C, as C is the trusted system server, not A (which B doesn't trust that much). The required protocol is pretty much enforced from start to end, given the boundary conditions (trust parameters).
There are two ways B could get its capability: either B asks C to give it one, or A asks C to send one (of the capabilities it has from C) to B.
Both operations work out of the box. A and B do not trust each other, and C does not need a cmp().
That's simply not true, and I have explained why. If anything I said was unclear, you should respond to the specific points and ask for clarification. I have to say I am at a loss as to how I can explain it further. Here is a summary: A cannot ask C to send one, because this would mean either that C has to make a blocking call to B, or that B has to make a blocking call to A _and_ trust A to forward the request to C. B really wants to make sure that it gets its own new mapping of the capability from C, and not from A.
Again: A cannot act as an intermediary between B and C, because B does not trust A for that. C cannot make a request to B, because C does not trust B at all. B must make the request directly to C. It is the only option within the trust boundaries that we set.
The operation where B requests the mapping from C cannot be done without an operation like map_lookup(), because B has to prove to C that it holds the capability that A provides, and C needs to identify that capability.
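To make the intended map_lookup() semantics concrete, here is a toy model under stated assumptions: the kernel would walk the mapping tree upward from a capability argument and report which of the caller's own mappings it derives from. The parent-pointer tree and all names are invented for illustration; the real operation would be a kernel primitive, not a user-level walk.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mapping {
    struct mapping *parent;   /* the mapping this one was derived from */
    int owner;                /* task holding this node */
} mapping_t;

/* Return the ancestor of `cap` owned by `server`, or NULL if the
   capability does not pass through the server's address space.
   Cost is bounded by the depth of the mapping tree. */
static mapping_t *map_lookup(mapping_t *cap, int server)
{
    for (mapping_t *m = cap; m != NULL; m = m->parent)
        if (m->owner == server)
            return m;
    return NULL;
}
```

This also shows why the operation only works within the mapping hierarchy: siblings never appear on the upward path.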
cmp() doesn't even enter the picture here, as it is, I am repeating myself, useless in this scenario.
If you dispute any of that, please point out precisely which step in my reasoning you find wrong.
Thanks, Marcus
There are two ways how B could get its capability. Either B asks C to give them, or A asks C to send one (of the capabilities it has from C) to B.
Both operations work out of the box. A and B do not trust each other and C do not need a cmp().
I have to say I am at a loss how I can explain it further.
Perhaps we could discuss this offline e.g. via IRC.
Here is a summary: A cannot ask C to send one, because this would mean either that C has to make a blocking call to B, or that B has to make a blocking call to A _and_ trust A to forward the request to C. B really wants to make sure that it gets its own new mapping of the capability from C, and not from A.
How does it get the capability the first time?
Bernhard
At Wed, 15 Jun 2005 15:53:49 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
Here is a summary: A cannot ask C to send one, because this would mean either that C has to make a blocking call to B, or that B has to make a blocking call to A _and_ trust A to forward the request to C. B really wants to make sure that it gets its own new mapping of the capability from C, and not from A.
How does it get the capability the first time?
I assume you mean how B gets the capability from A initially.
Usually A sends it to B via a request.
B could either be blocking on A, if there is mutual trust, for example during the startup of a child process. Or it could be a server in a normal receive loop.
If you are asking how you get any capability initially, then that is a much more complex protocol. I can elaborate on that, but it is probably not very relevant.
Thanks, Marcus
At Tue, 14 Jun 2005 19:15:29 +0200, Marcus Brinkmann@ruhr-uni-bochum.de wrote:
Beside the need for a cmp() we found in the discussion that this operation needs to be limited.
Bounding this "cmp()-right" to the receive right of an endpoint seems feasible.
I explained why it is not sufficient in my earlier mails.
Bernhard and I had a short discussion on IRC (thanks a lot, Bernhard!) and quickly found out where I went wrong: I was assuming, incorrectly, that a receive right is bound to a thread. This is true, AFAIK, for the upcoming L4 design in Karlsruhe, but not for L4.sec in Dresden, where receive rights can be mapped to multiple threads.
The difference matters, because if receive rights can be mapped, then yes, indeed, they can be used in the same way as the ID objects Espen was talking about, and they could be created in the global object server and mapped to the individual servers just like them.
So, the functionality the cmp() operation would provide seems to be sufficient. There may be some further factors that require consideration, for example how convenient it is to use, and what the performance impact is. ID objects seem to be simpler and faster for my use case. But that is a different discussion.
So, yes, cmp() in L4.sec seems to deliver the functionality I need.
Thanks, Marcus
At Fri, 10 Jun 2005 16:55:05 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
Within this framework, the only directly supported operations on such capabilities are map, grant, unmap and IPC. The map operation can be used to delegate the capability, grant can be used to move the capability and unmap to revoke it. The IPC operation allows clients to send messages to an object, for example, to invoke operations on the object.
I think here is a difference to our view on capabilities: The IPC operation allows to send a message through an endpoint to a server. The server could somehow identify the sender of a message.
The question here is really if the L4 IPC "capabilities", ie communication end points, can be used to implement "capabilities" of a capability system. The answer of course differs, depending on which requirements you choose for the capability system. We are looking here specifically at a requirement for a certain form of rights amplification, or synergy in general.
You seem to say that this should not even be attempted. But that would be pretty sad, because L4 end points have almost all the properties that are needed, and only a little needs to be added. And from experience, capabilities without any kernel support just don't work.
By using the sender id as a reference for a user, this problem is gone.
The problem is not gone. Or rather, it is gone like a leg is gone if you amputate it because of a broken toe. Paul Watzlawick calls this "Patt-End" solutions. ;) [1]
This is because you still have to implement a capability system, but now without any kernel support whatsoever. That just doesn't work. It was a pain to do in L4 X.2, but in the upcoming L4 designs, it would be even harder because now you also have to reintroduce global IDs for tasks in userspace, and that is not easy. This would mean that instead of taking advantage of the upcoming security features, we would have to battle and overcome them. That makes little sense to me, in particular if alternatives are pretty attractive from our point of view.
To demonstrate the difference between these two approaches, look at a simple file server. If only read/write operations are needed, the sender id could be the file number. But with an operation that needs multiple files, like copying between two files, the sender id can identify only the user, and not the file anymore.
I am very aware of the differences. We have been there, done that, and it just doesn't lead to a feasible system. It's not that you can't do it at all. It just has all the wrong consequences: You keep compromising one thing over another (performance over security, code simplicity over performance, and so on) and once you look at end-to-end performance, you will actually find Mach interesting again. It's that bad.
Can you show me protected capability systems implemented on L4 that work the way you suggest? The "best" I can see is Mungi, and that uses random OIDs, and thus doesn't even enter the picture, because it is not a protected capability system at all. And from what you said, the upcoming L4 design won't change a thing, and will just make it harder.
I think that there is an excellent chance that the upcoming security features can be used to build a fast and secure object/capability system. Solving the problem of multiple object references in messages is essential for that. However, I am even more sure that without any kernel support, implementing a competitive and secure capability system on top of L4 is nigh impossible.
I am not sure what further evidence I could give you that the suggestion you make just doesn't work. My own failed attempt and lack of any other attempts in that direction are the reasons I can bring forward. I have tried really, really hard to make it work. It could just be my personal failure. But it could also be a shortcoming of the L4 design.
Maybe L4 is not supposed to support protected capability systems of the type I imagine. But that is not what you are saying: you are saying I should pursue a path I have already walked and found to be miserable. I can only ask you to reconsider my argument, or to be much, much, much more specific about how to do it the way you suggest.
Thanks, Marcus
[1] Solutions which are so good that they not only remove the problem, but everything associated with it.
Within this framework, the only directly supported operations on such capabilities are map, grant, unmap and IPC. The map operation can be used to delegate the capability, grant can be used to move the capability and unmap to revoke it. The IPC operation allows clients to send messages to an object, for example, to invoke operations on the object.
I think here is a difference to our view on capabilities: The IPC operation allows to send a message through an endpoint to a server. The server could somehow identify the sender of a message.
The question here is really if the L4 IPC "capabilities", ie communication end points, can be used to implement "capabilities" of a capability system. The answer of course differs, depending on which requirements you choose for the capability system. We are looking here specifically at a requirement for a certain form of rights amplification, or synergy in general.
Oh, now I understand what you want to do: implement a "real capability" system on top of L4 with the ability to amplify rights.
You seem to say that this should not even be attempted.
I believed that you could build such a system without kernel support for rights amplification.
I think that there is an excellent chance that the upcoming security features can be used to build a fast and secure object/capability system. Solving the problem of multiple object references in messages is essential for that. However, I am even more sure that without any kernel support, implementing a competitive and secure capability system on top of L4 is nigh impossible.
Ok, you need kernel support for that. The remaining open question is: What is the right operation for that?
The map_lookup() has the disadvantage that its cost is bounded by the depth of the mapping tree. And it only works within the mapping hierarchy. This is a problem, for example, with external object caches or different proxies which try to map_lookup() siblings in the mapping tree.
So why not use a more general and faster operation like cmp()?
bool cmp(Address first, Address second)
Are there any arguments against this stronger operation?
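To make concrete what a server would do with such a primitive, here is a toy model (cmp() is stubbed as plain value comparison, and all names are invented): a server identifying a capability argument has to compare it against every endpoint it serves, which is linear in the number of objects unless an extra identifier is carried along.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long addr_t;

/* Stand-in for the proposed kernel primitive
   bool cmp(Address first, Address second). */
static bool cmp(addr_t first, addr_t second) { return first == second; }

/* The endpoints behind the objects this server implements. */
static addr_t objects[] = { 0x1000, 0x2000, 0x3000 };

/* Identify a capability argument by comparing it against every
   known endpoint: O(n) in the number of objects. */
static int identify(addr_t cap)
{
    for (size_t i = 0; i < sizeof objects / sizeof objects[0]; i++)
        if (cmp(objects[i], cap))
            return (int)i;
    return -1;
}
```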
Thanks,
Bernhard
At Sat, 11 Jun 2005 12:15:03 +0200, Bernhard Kauer kauer@os.inf.tu-dresden.de wrote:
Ok, you need kernel support for that. The remaining open question is: What is the right operation for that?
That's what I want to figure out, with your help.
The map_lookup() has the disadvantage that its cost is bounded by the depth of the mapping tree. And it only works within the mapping hierarchy. This is a problem, for example, with external object caches or different proxies which try to map_lookup() siblings in the mapping tree.
I have no such requirements on siblings, but I also don't know how to implement external object caches or different proxies, and maybe don't even know what they are. So if it is a disadvantage, it is not one for us, and I cannot really comment on it.
cmp() was certainly something I had thought of myself, before I realized that our own requirements were narrower and thus I could be much more specific, making it easier for me to justify them. (It would not be proper for me to suggest a cmp() operation if my own arguments only show that a comparison within the hierarchy is necessary.)
So why not use a more general and faster operation like cmp()?
bool cmp(Address first, Address second)
Are there any arguments against this stronger operation?
If only comparison is possible, then we have to carry along an identifier that allows us to look up the intended object in the first place, so that lookup can still be O(1). That's not a big problem, I guess, but it is not terribly exciting either, because such an identifier would have some "globalness" to it (if it is shared by the reference counter and the server providing the object) that complicates things all throughout the system. It increases message size and complicates the message format a bit, and probably makes orthogonal persistence harder to get right. But my gut feeling is that it would still be feasible.
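The carried-along identifier can be modeled as follows (a minimal sketch with invented names; cmp() is again stubbed as value comparison): the client passes an object id alongside the capability, the server indexes its table in O(1), and cmp() only verifies that the capability really is that object's endpoint.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long addr_t;

/* Stand-in for the proposed kernel primitive. */
static bool cmp(addr_t a, addr_t b) { return a == b; }

#define NOBJ 3
static addr_t endpoint_of[NOBJ] = { 0x1000, 0x2000, 0x3000 };

/* O(1): index by the client-supplied id, then verify via cmp()
   that the capability matches the claimed object. */
static int identify(int claimed_id, addr_t cap)
{
    if (claimed_id < 0 || claimed_id >= NOBJ)
        return -1;
    return cmp(endpoint_of[claimed_id], cap) ? claimed_id : -1;
}
```

The cost described above is visible here: the id must travel in every message, and it must be meaningful to every party that handles the capability.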
There might be security implications if you can compare across the mapping hierarchy, in that it reveals information that you could otherwise not guess. If you only work within the mapping hierarchy, at least some argument can be made that not too much information is revealed: the caller could, in a random sample of cases, make an informed guess as to whether the provided mappings were the same by unmapping one of them and quickly probing whether the other is still valid. This is not terribly reliable, but at least it shows that you cannot make mappings within the hierarchy completely independent/anonymous. The same is not true for mappings across the mapping hierarchy, where no information can be gained actively without cooperation from other tasks. This suggests to me that at least some careful consideration is needed before the functionality is extended in this way.
But to repeat, for myself I don't see any application for comparisons across the mapping hierarchy. That may very well be pure ignorance on my side: My own use of L4 is pretty narrow in itself. That's really something that you have to think about, that's why you are the experts :) If you say a more generic variant of the same functionality is best, then that is likely fine for us, too.
As I said in my original mail, I gave the specific semantics of map_lookup to illustrate what our specific needs are. There are potentially many other ways to achieve the same functionality, some of them with extra costs and/or extra benefits. Most of them, however, have more semantic implications than I am daring enough to suggest --- so minimality was important to me, to limit the discussion to what is really relevant to us.
To me personally, it is not obvious that a cmp() function is "stronger"; it just seems to be different. It seems to be sufficient for us, but it also adds extra costs for our use of it, which may very well be acceptable though. It's also potentially less secure, but in a way that likely doesn't matter to us. For us, I don't see a benefit, but that may be totally different for other users.
I do agree that the semantics of map_lookup seem to be very narrow, maybe too narrow to be an attractive element of a kernel design. But is cmp() the right generalisation? I am not sure. You decide. I am happy to comment on any ideas you come up with, and to interpret them from our point of view, as well as I can. But I am sorry to aim so low: If I had a grand design idea about how to do this in a generic and all-encompassing way, I would share it. But all I can point out are our own modest requirements, offered as a test case for anything better you can come up with.
Thanks, Marcus
At Thu, 09 Jun 2005 18:30:37 +0200, Marcus Brinkmann@ruhr-uni-bochum de wrote:
There may be security implications in revealing mapping "identities" at all. Maybe this feature needs to be restricted for confined tasks.
I have thought about this a bit more and think that this can be done very easily by using an access right bit, just like rwx are access right bits for memory mappings. If the bit is set, you are allowed to traverse the mapping tree through this node, and if you want to disallow it, you clear the bit in a mapping you give away.
The interesting part about this bit would be that it is entirely local: On every mapping, you can set or clear it. It's not like with rwx, where you can only clear and not set.
For our scheme, this could be used, for example, to hand out a capability temporarily (by delegating a mapping for it), but being sure that we can revoke it later, because the receiving task can not acquire its own reference. Having the guarantee that you can revoke a capability seems to be a useful option to have in a capability system.
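To make the idea concrete, here is a minimal C sketch of how such a traverse bit might gate a lookup-style walk up the mapping tree. This is purely illustrative: the types, field names and the assumption that the bit is checked on each node before stepping to its parent are all mine, not part of any proposal.

```c
#include <assert.h>
#include <stddef.h>

#define F_TRAVERSE 0x1   /* per-mapping, fully local: settable or clearable */

typedef struct mnode {
    struct mnode *parent;   /* mapping we were mapped from */
    int           space;    /* address space this node lives in */
    unsigned long vaddr;    /* virtual address in that space */
    unsigned int  flags;
} mnode_t;

/* Walk toward the root, but refuse to pass through any node whose
 * traverse bit was cleared by the mapper who handed it out. */
unsigned long lookup_guarded(const mnode_t *m)
{
    const mnode_t *n = m;
    while (n->parent != NULL) {
        if (!(n->flags & F_TRAVERSE))   /* bit cleared: may not pass */
            return 0;
        n = n->parent;
        if (n->space == m->space)       /* back in the caller's space */
            return n->vaddr;
    }
    return 0;
}
```

Note how the bit is local in the sense of the mail above: whoever maps a node decides whether walks may pass through it, independently of what the rest of the chain allows.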
But I should add that this is a feature I have still not entirely thought through. I just sent this mail to frame it in terms of well-known L4 concepts, namely access right bits.
Thanks, Marcus
[Marcus Brinkmann]
Each mapping is identified in the mapper's address space via a virtual address (or index, or slot number, ...). Then there could be a kernel system call
vaddr_t map_lookup (vaddr_t addr);
which walks up the mapping tree, starting at the mapping entry for ADDR and going to the parent of the current node in each iteration until a mapping in the same address space is encountered. The vaddr of this mapping is returned.
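For concreteness, the walk described above could be sketched in C roughly as follows. This is only a model of the semantics, not an implementation: the mapping-node type and its fields are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long vaddr_t;

typedef struct mapping {
    struct mapping *parent;   /* mapping we were mapped from */
    int             space;    /* address space this node lives in */
    vaddr_t         vaddr;    /* virtual address in that space */
} mapping_t;

/* Walk up the mapping tree from M until an ancestor in the same
 * address space as M is encountered; return its vaddr, or 0 if the
 * root is reached first. */
vaddr_t map_lookup(const mapping_t *m)
{
    for (const mapping_t *n = m->parent; n != NULL; n = n->parent)
        if (n->space == m->space)
            return n->vaddr;
    return 0;
}
```

In the container_copy scenario this is exactly what the server needs: the temporary mapping it receives from the client walks back up to the server's own original mapping of the object.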
Since I'm kind of involved with these sorts of questions myself, it seems like the right thing for me to share my thoughts on the subject.
I thought about this sort of functionality when doing my own work, but abandoned it mainly because of the need to traverse the mapping tree. Before going into detail here let me just give a short note on the two variants of the operation itself.
As you suggest yourself, one possible variant of this scheme is to have the lookup occur during IPC. This is really what you would want to do because:
o You don't have to create new (temporary) mappings during IPC.
o You don't have to later revoke the mapping either explicitly or due to overmapping in later IPCs.
What kills you here is the revocation part because revocation (of memory mappings) involves TLB flushes that can be pretty expensive (especially if you need to do TLB shootdowns in an MP system). Moreover, you never really need to have the page mapped into your address space in the first place since you're only interested in the "translation" of the page. Mapping it is therefore overkill.
I agree that having an operation that can identify one's own virtual address of a mapping can be pretty neat. Many people on this list have however pointed out the nasty aspects of it. The killer nastiness here (IMO) is that you may potentially need to traverse the mapping tree all the way up to the root. This means that no matter how clever a locking scheme you use for the mapping database, you may always need to somehow lock the root nodes in the tree.
What you really want is some sort of O(1) operation, in particular if you want to make it feasible to implement this lookup as part of the fast IPC path. A cmp() operation has been suggested that can indeed do the lookup in O(1). Cmp() has a couple of drawbacks, though:
o It still requires lookups in the page tables---meaning higher cache footprints, locking for consistency reasons, etc.
o It reveals information across mapping hierarchies. This could be a potential security issue.
Another problem with cmp() is that, as far as I understand, it doesn't do what Marcus asks for. That is, it does not identify a particular vaddr within the address space. Such functionality must be implemented by successively comparing each page in the address space. For this reason it does not make sense to implement cmp() as part of the IPC operation, which means that you have all the side effects of temporary mappings that I mentioned above.
To summarize, all in all I agree that the map_lookup() function is semantically really neat. It's just a pity that the implementation of it has certain issues.
Now, just for the sake of confusing people even more, let me throw in another idea I've been playing around with. My take on the problem of application specific capability transfer is to introduce another type of address space; an ID space. One can fabricate new IDs within one's ID space and also map/grant/unmap such IDs between ID spaces.
The fabricate operation would take two parameters:
Fabricate_Id (id, number) --- where id is the location within one's ID space and number is a number associated with that ID.
When the user does an IPC it can specify to send an ID as some form of protected payload to the destination. On the destination side, the receiver specifies an ID to use for translation purposes. During the transfer of the ID the kernel checks if the two IDs reside within the same owner space. If so, the actual transferred value is the number specified when creating the ID. If not, an "invalid" value is transferred.
Translation of an ID takes constant time. It simply involves checking whether the owner space of two IDs are the same. An extension of the translation process is to allow a small number (say 2--4) IDs be specified as translation IDs. Another extension is to have an access right that allows the ID to be used for translation purposes. Without this right the ID can only be used as a protected payload.
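A minimal C sketch of the fabricate/translate mechanism just described, to make the O(1) owner-space check explicit. All names (fabricate_id, translate_id, ID_INVALID) are invented for illustration; the real design would of course live in the kernel and identify spaces by more than an integer.

```c
#include <assert.h>

#define ID_INVALID 0UL

typedef struct id_obj {
    int           owner_space;  /* space the ID was fabricated in */
    unsigned long number;       /* number bound at fabrication time */
} id_obj_t;

/* Fabricate_Id (id, number): create an ID in one's own ID space and
 * associate a number with it. */
void fabricate_id(id_obj_t *id, int space, unsigned long number)
{
    id->owner_space = space;
    id->number      = number;
}

/* IPC-time translation: the sender transfers SENT, the receiver
 * names XLAT as its translation ID.  Constant time: a single
 * owner-space comparison. */
unsigned long translate_id(const id_obj_t *sent, const id_obj_t *xlat)
{
    return sent->owner_space == xlat->owner_space ? sent->number
                                                  : ID_INVALID;
}
```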
An example of how the ID stuff could work:
A server creates a number of IDs that it maps off to clients. The clients can delegate these IDs off to other applications if it wants. Upon receiving requests from clients the server specifies one of its IDs as a translation ID. If the client has chosen to transfer an ID that was created in the server the server will receive the number associated with that ID.
Another example:
A server in the system creates one ID for every principal in the system. The principal IDs are mapped to different ID spaces. Upon receiving an IPC other servers in the system can use their principal IDs to figure out the principal IDs of the client requests. Applications can temporarily map their principal IDs to other applications to make these applications assume the identity of themselves.
The two examples show two different usage cases of IDs. In the former case the ID is used to identify something managed by the server itself. In the latter case the ID is used to identify something managed by some other entity in the system.
I realize that these IDs do not directly solve the problems you are posing. That is, it doesn't allow you to verify that, e.g., a memory mapping you receive really is what you expect it to be. You can do certain tricks here though. For example, a server can create an ID associated with every memory mapping it hands out (i.e., it maps the ID together with the page frame). Someone down in the mapping hierarchy can then use this ID to prove to the server that it has a capability to the page frame. This would for example solve the reference counting problem you posed.
I hope my explanations are clear enough, and perhaps helpful. I really do appreciate that someone is trying to use the various models to actually solve real problems.
eSk
At Wed, 15 Jun 2005 14:44:07 +0200, Espen Skoglund wrote:
Since I'm kind of involved with these sorts of questions myself, it seems like the right thing for me to share my thoughts on the subject.
Thanks for jumping in here, it's very helpful.
I agree that having an operation that can identify one's own virtual address of a mapping can be pretty neat. Many people on this list have however pointed out the nasty aspects of it. The killer nastiness here (IMO) is that you may potentially need to traverse the mapping tree all the way up to the root. This means that no matter how clever a locking scheme you use for the mapping database, you may always need to somehow lock the root nodes in the tree.
I realize that the operation I suggested is not actually one that is particularly desirable from its implementation properties. I used the map_lookup operation to describe very precisely the narrow semantics I need in implementing the type of systems I want to implement.
I am very happy to hear that you have already considered such an operation. The performance effects you enumerate sound convincing.
Another problem with cmp() is that as far as I understand it doesn't do what Marcus asks for. That is, it does not identify a particular vaddr within the address space.
This is true, but I should add that I am not really asking for the vaddr, but for the bigger picture of identifying objects. I hope I made this sufficiently clear (in retrospect, I could have made it clearer).
Now, just for the sake of confusing people even more, let me throw in another idea I've been playing around with. My take on the problem of application specific capability transfer is to introduce another type of address space; an ID space. One can fabricate new IDs within one's ID space and also map/grant/unmap such IDs between ID spaces.
Thank you for the detailed description of how ID spaces are designed. I knew you had something like this in mind, but what I was missing was the detail that you can map not only the ID object, but also the right to use it for translation purposes. With this missing piece in the puzzle, I am convinced that ID spaces can work for me.
I realize that these IDs do not directly solve the problems you are posing. That is, it doesn't allow you to verify that, e.g., a memory mapping you receive really is what you expect it to be. You can do certain tricks here though. For example, a server can create an ID associated with every memory mapping it hands out (i.e., it maps the ID together with the page frame). Someone down in the mapping hierarchy can then use this ID to prove to the server that it has a capability to the page frame. This would for example solve the reference counting problem you posed.
Yes, indeed. I would actually do it the following way: The reference counter would become the global object server. It would create all the ID objects for all objects in the system, on behalf of the servers providing the objects. It would map the right to do the translation to the server providing the object. The server can then identify the objects in incoming RPCs without any additional RPCs to the global object server. The reference counter can naturally identify all capabilities, as it is the principal provider of the corresponding ID objects.
This would work very well. It would have slightly different properties than a purely local model, but no differences that matter to us.
It would require separating communication channels from objects, but this is actually a benefit, because more optimizations are possible.
I think that such ID objects would be, all things considered, a much better solution than any type of map_lookup() or cmp() function. It does, however, require a new type of kernel objects. What do the Dresden people think about new kernel objects like this?
Thanks, Marcus
On Wed, 2005-06-15 at 14:44 +0200, Espen Skoglund wrote:
Now, just for the sake of confusing people even more, let me throw in another idea I've been playing around with. My take on the problem of application specific capability transfer is to introduce another type of address space; an ID space. One can fabricate new IDs within one's ID space and also map/grant/unmap such IDs between ID spaces.
The fabricate operation would take two parameters:
Fabricate_Id (id, number) --- where id is the location within one's ID space and number is a number associated with that ID.
When the user does an IPC it can specify to send an ID as some form of protected payload to the destination. On the destination side, the receiver specifies an ID to use for translation purposes. During the transfer of the ID the kernel checks if the two IDs reside within the same owner space. If so, the actual transferred value is the number specified when creating the ID. If not, an "invalid" value is transferred.
Translation of an ID takes constant time. It simply involves checking whether the owner space of two IDs are the same. An extension of the translation process is to allow a small number (say 2--4) IDs be specified as translation IDs. Another extension is to have an access right that allows the ID to be used for translation purposes. Without this right the ID can only be used as a protected payload.
This sounds much the same as the proposed cmp() operation; the only difference is that you do it in combination with IPC, which is only a performance argument, but may well be the right way.
The similarity is as follows: The compare operation takes two capabilities (only communication capabilities are valid). The first one is well known by the one that does the compare and must have a compare right (similar to your translation right). The second capability is the one which is unknown. If the compare right is given and the two capabilities point to the same object (in our case this would be an end point), the 'badge', which is a protected number assigned to this capability, is returned by the compare and is therefore similar to your protected name. If either the compare right is not given, or the capabilities are not communication caps, or they point to different objects, the compare will return the invalid badge.
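The badge-returning compare just described might be sketched in C like this. It is a model of the semantics only; the struct layout, the rights encoding and all names are invented for illustration.

```c
#include <assert.h>

#define BADGE_INVALID 0UL
#define RIGHT_COMPARE 0x1

typedef struct cap {
    const void   *object;   /* end point the capability refers to */
    unsigned int  rights;
    unsigned long badge;    /* protected number assigned to the cap */
} cap_t;

/* KNOWN is the server's own capability, which must carry the compare
 * right; UNKNOWN is the capability received in the message.  If both
 * point to the same end point, the unknown cap's badge is returned,
 * otherwise the invalid badge. */
unsigned long cap_compare(const cap_t *known, const cap_t *unknown)
{
    if (!(known->rights & RIGHT_COMPARE))
        return BADGE_INVALID;
    if (known->object != unknown->object)
        return BADGE_INVALID;
    return unknown->badge;
}
```

Note how close this is to the ID translation above: in both cases the result is a protected number, and the check is a single O(1) comparison; the difference lies in what is compared (the end point itself versus the owner space of the ID).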
I think, if we have a way for cheap temporary mappings of capabilities, the separate compare is more flexible; however, if we haven't, we should consider doing the compare during IPC by giving a local capability to compare against in the receive operation and getting the compare result in the message.
At Wed, 15 Jun 2005 17:48:29 +0200, Alexander Warg alexander.warg@os.inf.tu-dresden.de wrote:
I think, if we have a way for cheap temporary mappings of capabilities, the separate compare is more flexible; however, if we haven't, we should consider doing the compare during IPC by giving a local capability to compare against in the receive operation and getting the compare result in the message.
It's not quite the same because then you would have to know in advance which capability the user will likely provide. In Espen's model, the two capabilities only need to have the same owner to make the translation happen.
This limits the usefulness of the optimization to do the cmp during IPC.
I am not sure what type of optimizations would be possible within the L4.sec model. It seems to be a difficult problem. Because L4.sec has multiple possible receivers, it doesn't have a single owner to validate a translation efficiently (that's why you don't really want a lookup, but "just" a comparison). But from the point of view of a user, Espen's ID objects, at least in isolation, seem to be all-around easier to use and likely more efficient (in fact, close to optimum, really).
Thanks, Marcus
On Thu, 2005-06-16 at 01:48 +0200, Marcus Brinkmann wrote:
At Wed, 15 Jun 2005 17:48:29 +0200, Alexander Warg alexander.warg@os.inf.tu-dresden.de wrote:
I think, if we have a way for cheap temporary mappings of capabilities, the separate compare is more flexible; however, if we haven't, we should consider doing the compare during IPC by giving a local capability to compare against in the receive operation and getting the compare result in the message.
It's not quite the same because then you would have to know in advance which capability the user will likely provide. In Espen's model, the two capabilities only need to have the same owner to make the translation happen.
I'm not sure how this works; what is the definition of owning a capability? I think there is no such definition, and because of that you need, as far as I understood Espen's model, to provide a capability with the translation right that has the same owner space (whatever the definition of this is, and how can one transfer ownership?) --- so you need to provide a capability for the translation which you know in advance has the same owner space as the one likely to be sent to you.
In our model we have no definition of an owner of a capability (in my terms, everyone that has a capability for an object owns that capability). So we decided to require specifying one specific capability (as in Espen's model) which only needs to point to the same object as the one sent in the message.
If you have some idea how to describe ownership of an object other than having a capability with a specific right, like in our case the compare right, please let us know.
I am not sure what type of optimizations would be possible within the L4.sec model. It seems to be a difficult problem. Because L4.sec has multiple possible receivers, it doesn't have a single owner to validate a translation efficiently (that's why you don't really want a lookup, but "just" a comparison). But from the point of view of a user, Espen's ID objects, at least in isolation, seem to be all-around easier to use and likely more efficient (in fact, close to optimum, really).
I'm not really sure; I would like to have ownership separated from object creation, and a mechanism to transfer or share ownership, because not having this severely limits the systems which may be built, and things such as transparent proxies are very difficult to implement.
We want no lookup because it is not only inefficient but may also return an unintended result, because there may be more than one capability pointing to the same object in one address space (aliasing). The result of a lookup may be a capability to the object which the server itself has no interpretation for, because some library could have mapped an alias that the server does not know of.
The separate compare has the following advantage: you can compare with multiple local capabilities even if they point to different objects served by different owners; however, a model of trust may exist for a bunch of capabilities that point to objects at a different server.
At Thu, 16 Jun 2005 09:19:12 +0200, Alexander Warg alexander.warg@os.inf.tu-dresden.de wrote:
I'm not sure how this works; what is the definition of owning a capability? I think there is no such definition, and because of that you need, as far as I understood Espen's model, to provide a capability with the translation right that has the same owner space (whatever the definition of this is, and how can one transfer ownership?)
It would make sense to me if it is the address space of the thread (or the thread itself) that created the ID object in the first place. I don't think that ownership can be transfered. I think it works similar to receive rights, which are, in Espen's model, bound to a thread, and can not be moved or copied, either. It's fixed at creation time.
--- so you need to provide a capability for the translation which you know in advance has the same owner space as the one likely to be sent to you.
Yes, but that is easy enough. As Espen said, as an extension, you could allow the caller to provide a small number (Espen said "2-4") of translation IDs that are checked.
I'm not really sure; I would like to have ownership separated from object creation, and a mechanism to transfer or share ownership.
In Espen's model, one extension is to have one type of mapping of ID objects (via access bits, I presume) that maps the ability to read out the protected payload.
The difference in the two models is not the notion of "ownership". In L4.sec, "ownership" is defined by having the receive right. In Espen's ID object model, ownership would be defined by two things: The actual owner of the ID object, used for comparison, and the right to read out the payload.
The difference of the two models is that in Espen's model you read out a payload: the kernel stores a user-settable word in association with the ID object, while in L4.sec's model with cmp(), you have to know the object to identify before you can do the comparison. This is the main difference, and it has consequences. One is that the comparison can only happen after the information about which object the capability is for has been transferred. So either you would have to split up a request into two IPCs, one to transfer the object identifiers and one to transfer the actual capabilities, or you would have to transfer both and let the server do the cmp() post-IPC (which is the only feasible way to do it, IMO, as having a single request split up into two IPCs is just an open door for denial of service attacks in a server-client model).
We want no lookup because it is not only inefficient but may also return an unintended result, because there may be more than one capability pointing to the same object in one address space (aliasing). The result of a lookup may be a capability to the object which the server itself has no interpretation for, because some library could have mapped an alias that the server does not know of.
Although map_lookup is no longer in the picture, let me say that it didn't have the aliasing problem, because the ancestor would be unique.
For your model, the aliasing issue exists, but you could still allow the user to store a payload with a communication point that is subsequently read out.
The separate compare has the following advantage: you can compare with multiple local capabilities even if they point to different objects served by different owners; however, a model of trust may exist for a bunch of capabilities that point to objects at a different server.
It's true that in Espen's model the number of potential owners of an object at translation time is limited to either 1 or a small number. That's a limitation of some sort, it's true, but not one that concerns me a lot (or at all, that is).
Thanks, Marcus
At Thu, 16 Jun 2005 11:44:31 +0200, Marcus Brinkmann@ruhr-uni-bochum de wrote:
The difference in the two models is not the notion of "ownership".
The difference of the two models is that in Espen's model you read out a payload,
It's awful if you get lost in one detail just to forget about another :)
As I said, the payload makes a big difference, but as you said, the fact that there is a unique owner in Espen's model makes an equally big difference, because only then can you do a fast verification and lookup. It's really the combination of both which makes it work.
Thanks, Marcus
[Marcus Brinkmann]
I'm not sure how this works; what is the definition of owning a capability? I think there is no such definition, and because of that you need, as far as I understood Espen's model, to provide a capability with the translation right that has the same owner space (whatever the definition of this is, and how can one transfer ownership?)
It would make sense to me if it is the address space of the thread (or the thread itself) that created the ID object in the first place. I don't think that ownership can be transfered. I think it works similar to receive rights, which are, in Espen's model, bound to a thread, and can not be moved or copied, either. It's fixed at creation time.
Just to clarify; my choice of words when I said "owner space" could probably have been better. What I really meant was "origin space" or "space in which the root mapping for the ID resides" (let's call it "root space"). If you use the latter definition it should be clear that one can transfer "ownership" by granting the root ID to another space. This gives a bit more flexibility to the model, has some minor implementation issues, but nothing that one can't deal with.
We want no lookup because it is not only inefficient but may also return an unintended result, because there may be more than one capability pointing to the same object in one address space (aliasing). The result of a lookup may be a capability to the object which the server itself has no interpretation for, because some library could have mapped an alias that the server does not know of.
Although map_lookup is not longer in the picture, let me say that it didn't have the aliasing problem, because the ancestor would be unique.
Right. Aliasing where the application can use two names to identify the same physical resource: good. Aliasing where the kernel has to choose the name to identify a physical resource with: bad.
The model I've proposed---both regarding ID spaces and receive endpoints being fixed to specific threads---was chosen specifically because it avoids aliasing problems. This means that there are both no semantic ambiguities and that one can implement the model really efficiently.
eSk
l4-hackers@os.inf.tu-dresden.de