I have a note about mappings and address spaces pending, but before I send it I want to ask a naive question about the map operations:
Consider a sequence of operations in which
A maps to B
B maps the same stuff to C.
Is there some variant of the 'map' operation in which B does NOT retain the ability to revoke C's mapping? That is, in which the resulting mapping dependency hierarchy is
  A
 / \
B   C
instead of
A
|
B
|
C
That is, is there a way for a sender to give a receiver rights on data that are co-equal in all respects with those of the sender?
If not, is it important?
If such a "coequal map" operation existed, how often would it be used in contrast to the current map operation?
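To make the question concrete, here is a toy model of the mapping dependency tree. The `coequal_map_to` operation is the hypothetical one being asked about; none of these names correspond to the actual L4 API.

```python
# Toy model of the mapping dependency tree (illustrative only, not L4 API).

class Mapping:
    """One node in the dependency tree for a single page."""
    def __init__(self, owner, parent=None):
        self.owner = owner
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def map_to(self, receiver):
        """Ordinary map: the new mapping depends on self,
        so self can later revoke it."""
        return Mapping(receiver, parent=self)

    def coequal_map_to(self, receiver):
        """Hypothetical coequal map: the new mapping hangs off self's
        parent, so self cannot revoke it."""
        return Mapping(receiver, parent=self.parent)

    def revoke(self):
        """Unmap: destroy this mapping and everything that depends on it."""
        for child in list(self.children):
            child.revoke()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None

a = Mapping("A")
b = a.map_to("B")
c1 = b.map_to("C")          # A -> B -> C: B can revoke C
c2 = b.coequal_map_to("C")  # C hangs directly off A

b.revoke()                  # takes c1 down with it ...
assert c1.parent is None
assert c2 in a.children     # ... but c2 survives as A's child
```

The question below is whether anything like `coequal_map_to` exists in the L4 map primitives.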
shap
"Jonathan S. Shapiro" shap@eros-os.org writes:
I have a note about mappings and address spaces pending, but before I send it I want to ask a naive question about the map operations:
Consider a sequence of operations in which
A maps to B
B maps the same stuff to C.
Is there some variant of the 'map' operation in which B does NOT retain the ability to revoke C's mapping?
I am not sure whether I understood the question correctly but there is the grant operation.
After A maps to B and B maps to C it would look like
A -> B -> C
in contrast to A maps to B and B grants to C which would look like
A -> C
That is, in which the resulting mapping dependency hierarchy is
  A
 / \
B   C
This graph would be a result of the following operations
A maps to B
A maps to B and B grants to C
There is no single operation which could be used as a replacement for the map/grant sequence.
That is, is there a way for a sender to give a receiver rights on data that are co-equal in all respects with those of the sender?
No, there is only grant, where the sender transfers its own rights to the receiver and has no access to the page afterwards.
If not, is it important?
I don't know. So far the map/grant combination was enough to handle all scenarios. Do you have a scenario in mind where such an operation would be useful?
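A minimal sketch of the map vs. grant trees described above (the dict representation and helper names are invented for illustration; this is not the L4 system-call interface):

```python
# Sketch of map vs. grant semantics on a per-page mapping tree
# (invented helpers, not the real L4 interface).

def map_page(tree, sender, receiver):
    """map: receiver becomes a child of sender; sender keeps its access."""
    tree.setdefault(sender, []).append(receiver)

def grant_page(tree, parent, granter, receiver):
    """grant: receiver takes over granter's place; granter drops out."""
    tree[parent][tree[parent].index(granter)] = receiver
    tree.pop(granter, None)

# A maps to B, B maps to C:    A -> B -> C
t1 = {}
map_page(t1, "A", "B")
map_page(t1, "B", "C")
assert t1 == {"A": ["B"], "B": ["C"]}

# A maps to B, B grants to C:  A -> C
t2 = {}
map_page(t2, "A", "B")
grant_page(t2, "A", "B", "C")
assert t2 == {"A": ["C"]}

# A maps to B twice, B grants one of them to C: A has B and C as children
t3 = {}
map_page(t3, "A", "B")
map_page(t3, "A", "B")
grant_page(t3, "A", "B", "C")
assert t3 == {"A": ["C", "B"]}
```

The third sequence is the map/grant combination that produces the branching graph from the original question, at the cost of two operations.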
Regards, Jean
On Thu, 2003-12-04 at 15:06, Jean Wolter wrote:
"Jonathan S. Shapiro" shap@eros-os.org writes:
Is there some variant of the 'map' operation in which B does NOT retain the ability to revoke C's mapping?
I am not sure whether I understood the question correctly but there is the grant operation.
After A maps to B and B maps to C it would look like
A -> B -> C
in contrast to A maps to B and B grants to C which would look like
A -> C
Thanks for replying, Jean. Let's see if I can clarify the question:
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:
A -> B
A -> C
Is there such an operation?
Also, I forgot to ask: is there some reason why such an operation would be inadvisable?
Jonathan wrote:
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:
A -> B
A -> C
Is there such an operation?
One possibility would be for B to request that A map the page on behalf of B. This would result in the above graph. I think your question relates to whether A can prevent B from mapping the page directly to C, i.e. whether A can enforce that the above graph can only be created as such, and in particular not in the form A -> B -> C. Do I get this right?
We are currently thinking about ways to restrict communication such that this particular restriction becomes possible. The relevant questions are:
1) Should B be allowed to send messages to C, but no mappings?
2) Should B be allowed to send messages to C, but C not back to B?
In the second case we need a way to restrict the map operation so that pages can only be mapped read-only. Otherwise a read-writable page can be used as a back channel from C to B.
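A sketch of the back channel in question: if C's mapping is writable, C can signal B through the shared frame even when C is forbidden from sending IPC to B. The rights constants and the fault model below are invented for illustration.

```python
# Sketch of a shared-memory back channel (illustrative only).

PAGE_RO, PAGE_RW = "ro", "rw"

def write_page(page, rights, value):
    """A store through the mapping: faults if the mapping is read-only."""
    if rights != PAGE_RW:
        raise PermissionError("write fault: mapping is read-only")
    page[0] = value

shared = bytearray(1)            # one frame mapped into both B and C

# With a writable mapping, C leaks a bit that B can simply read back:
write_page(shared, PAGE_RW, 1)
assert shared[0] == 1            # covert channel C -> B

# With a read-only mapping, the same store faults instead of leaking:
try:
    write_page(shared, PAGE_RO, 0)
    leaked = True
except PermissionError:
    leaked = False
assert not leaked
```

This is why restricting B to read-only maps toward C matters: without it, restriction 2) above cannot actually be enforced.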
Marcus
[Jonathan S Shapiro]
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:
A -> B
A -> C
Is there such an operation?
No.
Also, I forgot to ask: is there some reason why such an operation would be inadvisable?
Conceptually, no. The only problem is that I don't see any scenario where such an operation is strictly necessary, i.e., a scenario where we cannot solve the same problem with the current scheme.
Another (perhaps minor) problem I see with this scheme is that if/when we introduce management of kernel memory from user-level [1], there might be a problem of accounting. For each mapping there is a "mapping node" in the in-kernel mapping database. Upon a map operation, the memory for this mapping node can be accounted to either the mapper or the mappee. If memory is accounted to the mappee, there is no problem with your new ??? operation. If the memory is accounted to the mapper, however, we have two choices:
1. Account the memory to A. This has the problem that a child (B) can consume resources from its parent (A).
2. Account the memory to B. This kind of defeats the scheme of accounting memory to the mapper, since B is not really the mapper after such an operation has been performed. Also, if B chooses to revoke the resource for the mapping node, it will break the mapping A -> C, which is probably not what you wanted in the first place.
Anyhow, Andy is probably the right person to say whether this is a real problem or not.
eSk
[1] http://i30www.ira.uka.de/research/documents/l4ka/userlevel-mgmt-of-kernel-me...
On Fri, 2003-12-05 at 05:30, Espen Skoglund wrote:
Another (perhaps minor) problem I see with this scheme is that if/when we introduce management of kernel memory from user-level [1], there might be a problem of accounting...
I will shortly send out a note about the similarities and differences between GPTs and EROS address spaces. I'm not sure how much these notes are benefitting anyone else, but *I'm* learning a lot. :-)
I'll read about user-level kernel memory before I try to say anything on that topic. Thank you.
Hello Espen,
[Jonathan S Shapiro]
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:

A -> B
A -> C

Also, I forgot to ask: is there some reason why such an operation would be inadvisable?
Another (perhaps minor) problem I see with this scheme is that if/when we introduce management of kernel memory from user-level [1], there might be a problem of accounting. For each mapping there is a "mapping node" in the in-kernel mapping database. Upon a map operation, the memory for this mapping node can be accounted to either the mapper or the mappee. If memory is accounted to the mappee, there is no problem with your new ??? operation.
If the mapping node is accounted (and exported) entirely to the mappee, then validating it when it is re-imported from user level becomes a very hard problem. This is why I suggested splitting the mapping nodes into two separate parts, which can be validated against each other (in my thesis, Section 4.7.3).
However, there is another solution: Assume the current mapping is A->B, and B wants to pass the object on to C without retaining the ability to revoke it later. Then B can simply request a _second_ mapping from A and grant that one to C. The first step would look like this:
A -> B (at address b1)   Accounting: A 0.5, B 0.5
A -> B (at address b2)   Accounting: A 0.5, B 0.5
After B has granted the object at address b2 to C, the situation changes as follows:
A -> B (at address b1)   Accounting: A 0.5, B 0.5
A -> C (at address c1)   Accounting: A 0.5, C 0.5
In this scenario, B and C are charged with half a mapping node for each mapping they own. A is charged with a full mapping node, but since it knows that B requested both mappings, it can ask B for compensation.
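The half-node accounting above can be sketched as follows. Only the 0.5/0.5 split comes from the example; the bookkeeping helpers are invented for illustration.

```python
# Sketch of split accounting for mapping nodes (illustrative only).

from collections import defaultdict

accounts = defaultdict(float)

def charge_map(mapper, mappee):
    """Each mapping node costs one unit, split between the two parties."""
    accounts[mapper] += 0.5
    accounts[mappee] += 0.5

def charge_grant(old_mappee, new_mappee):
    """Granting moves the mappee's half of the charge to the receiver."""
    accounts[old_mappee] -= 0.5
    accounts[new_mappee] += 0.5

charge_map("A", "B")        # A -> B at address b1
charge_map("A", "B")        # A -> B at address b2 (B's second request)
charge_grant("B", "C")      # B grants the b2 mapping to C

# A pays for a full node, B and C half a node each, as in the example:
assert accounts["A"] == 1.0
assert accounts["B"] == 0.5
assert accounts["C"] == 0.5
```

Note that A's full-node charge is exactly why A can reasonably ask B for compensation: both mappings exist only because B requested them.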
- Andreas
On Dec 4, 2003, at 21:41, Jonathan S. Shapiro wrote:
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:
A -> B
A -> C
Is there such an operation?
Also, I forgot to ask: is there some reason why such an operation would be inadvisable?
If you look at the mapping database as a *cache*, then your scenario becomes one which describes the cache behavior, and thus isn't so important in light of the semantics of user-level correctness. In any cache, it is possible for the entries to disappear, and they must be resupplied when needed later. The page fault IPC informs of cache misses in the mapping database. Upon page fault, the mappings must be reconstructed.
The grant operation can be viewed as an operation which removes an entry from the cache. If B accesses the mapping which it just dropped (via granting to C), A will receive a page fault. And at user-level, A should know that it has mapped to B, and thus can decide to resupply the mapping to B.
When you consider the mapping database from the cache perspective, it isn't so upsetting to have an undirected unmap operation. An unmap brutally removes all entries, and thus it becomes necessary to repopulate the cache. And obviously, to repopulate the cache, A must know about its outstanding mappings.

This is where the cache model can break down: how can A know that it should have a mapping to C after B granted the mapping to C? A user-level protocol must be used to inform A that it has the duty of maintaining mappings to C. The grant operation should therefore probably be used carefully, mainly to support transparent interposition: A thinks it maps to Y, but really maps to Z due to some indirection, and Z then transparently forwards the mapping to Y via grant() to construct the original mapping hierarchy intended by A.
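The cache view can be sketched in a few lines: the kernel's mapping state is disposable precisely because a user-level pager can always resupply it. The class and pager callback below are invented names, not the L4 interface.

```python
# Sketch of the "mapping database as cache" view (illustrative only).

class AddressSpace:
    def __init__(self, pager):
        self.cache = {}     # kernel mapping "cache": vaddr -> frame
        self.pager = pager  # user-level authority over the real mapping

    def access(self, vaddr):
        if vaddr not in self.cache:                # miss: the kernel
            self.cache[vaddr] = self.pager(vaddr)  # raises a "page fault"
        return self.cache[vaddr]

    def unmap(self, vaddr):
        self.cache.pop(vaddr, None)  # brutal eviction is harmless

space = AddressSpace(pager=lambda vaddr: vaddr + 0x1000)
assert space.access(0x4000) == 0x5000   # populated on first touch
space.unmap(0x4000)                     # entry dropped ...
assert space.access(0x4000) == 0x5000   # ... and transparently rebuilt
```

The grant problem above is exactly the case where this model strains: after a grant, the pager (A) has no kernel-supplied record that C's refills are now its responsibility.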
-Josh
On Fri, 2003-12-05 at 05:58, Joshua LeVasseur wrote:
On Dec 4, 2003, at 21:41, Jonathan S. Shapiro wrote:
I am looking for a sequence in which A maps to B and B ??? to C and the result would look like:
A -> B
A -> C
Is there such an operation?
Also, I forgot to ask: is there some reason why such an operation would be inadvisable?
If you look at the mapping database as a *cache*, then your scenario becomes one which describes the cache behavior, and thus isn't so important in light of the semantics of user-level correctness. In any cache, it is possible for the entries to disappear, and they must be resupplied when needed later.
Josh:
Thank you. I have always been very confused about this view of the mapping cache. It seems tied up with some sharing deficiencies. Would you be kind enough to expand a bit on how the cache gets rebuilt after some element gets removed?
Thanks!
On Dec 5, 2003, at 17:21, Jonathan S. Shapiro wrote:
On Fri, 2003-12-05 at 05:58, Joshua LeVasseur wrote:
If you look at the mapping database as a *cache*, then your scenario becomes one which describes the cache behavior, and thus isn't so important in light of the semantics of user-level correctness. In any cache, it is possible for the entries to disappear, and they must be resupplied when needed later.
Josh:
Thank you. I have always been very confused about this view of the mapping cache. It seems tied up with some sharing deficiencies. Would you be kind enough to expand a bit on how the cache gets rebuilt after some element gets removed?
One can look at the mapping database as providing a mechanism for the kernel to handle operations which would be too expensive from user level, as long as those operations require no policy decisions. So the mapping database evicts entries when a policy decision might be necessary (or if the implementation becomes too hairy).
If a mapping is missing from the kernel's cache, then a policy decision is necessary from user-level. The policy decision can be requested via a page fault request; the kernel synthesizes a page fault message to the faulting process's pager, and asks the pager to supply an appropriate mapping. The pager could be the process which provides the mapping, or it could be another thread within the process which receives the mapping. In either case, the model can support whatever user-level policy is most appropriate for providing the mappings. For example, the page fault might be delivered to a client of a file server. The page fault indicates to the client that it has hit the edge of the file, and must request more data from the file's disk backing. The client can then IPC to the file server, provide appropriate performance hints, and ask for the mapping. Or the page fault IPC might be sent directly to the file server (as in L4Linux), and the file server then uses prenegotiated information to figure out the file mapping (such as the ELF headers of an executable).
User apps must maintain enough information to construct on-the-fly mapping requests. They can't drop such information after constructing the mapping (a top-level pager may unmap() a deep hierarchy of mappings!). But this information can be highly compressed since it is domain specific. For example, a file server which responds to a client's page fault may know that the client maps the file contiguously, and so can use offsets to figure out the mapping, without using an expensive page table.
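The "highly compressed" pager state for a contiguously mapped file reduces to plain arithmetic, with no per-page table at all. The helper and its parameters below are invented for illustration.

```python
# Sketch of a contiguous-file pager: arithmetic instead of a page table
# (illustrative only).

PAGE_SIZE = 4096

def fault_to_file_offset(fault_vaddr, map_base, file_base):
    """Given the contiguous-mapping invariant, compute which file
    offset backs a faulting address."""
    page_offset = (fault_vaddr - map_base) & ~(PAGE_SIZE - 1)  # page-align
    return file_base + page_offset

# The client mapped the file at 0x10000; file contents start at byte 0.
# A fault at 0x12345 lands in the page backed by file offset 0x2000:
assert fault_to_file_offset(0x12345, 0x10000, 0) == 0x2000
```

The file server can answer any fault from two remembered values (`map_base`, `file_base`), which is the kind of domain-specific compression described above.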
Hope my ramblings helped, Josh
On Fri, 2003-12-05 at 12:34, Joshua LeVasseur wrote:
On Dec 5, 2003, at 17:21, Jonathan S. Shapiro wrote:
If a mapping is missing from the kernel's cache...
This statement is very closely related to my earlier question about mechanisms. When you use the term "mapping" above, what exactly do you mean:
1. A GPT?
2. Some other cache that is somehow constructed from a GPT?
The entire mapping DB is something that I am still failing to understand adequately.
shap
On Sat, 5 Dec 2003, Jonathan S. Shapiro wrote:
On Fri, 2003-12-05 at 12:34, Joshua LeVasseur wrote:
On Dec 5, 2003, at 17:21, Jonathan S. Shapiro wrote:
If a mapping is missing from the kernel's cache...
This statement is very closely related to my earlier question about mechanisms. When you use the term "mapping" above, what exactly do you mean:
- A GPT?
- Some other cache that is somehow constructed from a GPT?
Jon, you seem a bit confused ;) As Gernot already mentioned, GPTs are a page table implementation. Basically, in L4 you have two VM data structures: a page table per address space (GPTs being one such implementation), and the mapping database, which maintains the mapping tree for each physical frame of memory in the system; the latter is only required to implement the unmap/flush operation.
The entire mapping DB is something that I am still failing to understand adequately.
For a summary of the interaction between L4's VM API, the page tables and the mapping database have a look at Chris's u/g thesis. ftp://ftp.cse.unsw.edu.au/pub/users/disy/ug-theses/99-cls.ps.gz
Cheers, Adam
--
Adam "WeirdArms" Wiggins, PhD Student
School of Computer Sci. & Eng., The University of NSW
UNSW SYDNEY NSW 2052, Australia
Phone: +61 2 9385 7359   Fax: +61 2 9385 7942
http://www.cse.unsw.edu.au/~awiggins
On Fri, 2003-12-05 at 12:34, Joshua LeVasseur wrote:
On Dec 5, 2003, at 17:21, Jonathan S. Shapiro wrote:

One can look at the mapping database as providing a mechanism for the kernel to handle operations which would be too expensive from user level, as long as those operations require no policy decisions. So the mapping database evicts entries when a policy decision might be necessary (or if the implementation becomes too hairy).
If a mapping is missing from the kernel's cache, then a policy decision is necessary from user-level. The policy decision can be requested via a page fault request; the kernel synthesizes a page fault message to the faulting process's pager...
Hope my ramblings helped, Josh
This was VERY helpful. It assists me in understanding a number of points of confusion that I have had in previous discussions with various L4 people.
What you describe is one of the key differences between the L4 and EROS designs. As I am coming to understand L4 better, I'm coming to think that it is the only really *important* difference between the two architectures:
- In EROS, a node tree DEFINES an address space. Since it is the ultimate defining data structure, it can be paged but not discarded (or at least, not without losing the mappings).
- In L4, the mapping table is a cache of things defined at user level.
shap
Hi all, time to play catch-up now that exam marking etc. is over. I apologise in advance if this has been pointed out in another thread.
Just a small, but important point relating to the statement below.
The L4 mapping database is NOT strictly a cache, but it can be used to implement cache-like behaviour, which is what is usually done in most systems. Kernel mappings are a dynamic subset of things defined at user level. However, in-kernel mappings can be used to indirectly define an address space or a region of an address space (if, for example, you want to avoid faults), i.e. they can equal or be a superset of what is defined at user level. Doing so, however, requires carefully crafted and coordinated pager behaviour/protocols.
You can term it a (large) cache with mechanisms to control its population, but it is not exactly a "cache" in the term's usual sense.
- Kevin
What you describe is one of the key differences between the L4 and EROS designs. As I am coming to understand L4 better, I'm coming to think that it is the only really *important* difference between the two architectures:
In EROS, a node tree DEFINES an address space. Since it is the ultimate defining data structure, it can be paged but not discarded (or at least, not without losing the mappings).
In L4, the mapping table is a cache of things defined at user level.
shap
l4-hackers mailing list l4-hackers@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers