Hello,
Apologies for introducing two potentially separate topics, but they both seem to present themselves in my most recent endeavour. In short, I am wondering about the following...
What considerations need to be given to concurrency with regard to the use of the l4re_rm_attach function and similar functionality? Is access to the region mapper/manager from different threads in the same task advisable, are there thread safety issues, and if so, can these be sensibly worked around?
(I have observed what seems to be the same address being provided for two different dataspaces at the same time, which does not appear to happen when I introduce locking around calls to l4re_rm_attach. Are such problems expected or could this be some other problem manifesting itself?)
What kind of resource limits apply to L4Re tasks? In my work, I am allocating capabilities for objects acting as dataspaces. Things seem to work fairly well up to a certain point, and then capability allocation seems to fail, and program failure is then not particularly graceful.
(I currently observe that creating 160 threads, each with its own IPC gate capability, seems to be unproblematic, but more than this will produce errors that manifest themselves either in pthread library code or in my own code where a critical section is supposed to be enforced.)
I suppose some contextual remarks might be useful. My previous work involved providing some simple filesystem-related functionality through filesystem servers, where a new server object (exposed via an IPC gate and having its own thread) is created for each open file. Some stress testing in a multiprocessor environment indicated that various concurrency issues were highly likely to be present.
(I also probably didn't provide general enough support for flexpages, but since all my "send" flexpages were single machine pages, I don't think this would have caused additional problems or confusion.)
Consequently, I migrated the principal mechanisms to my Linux environment and wrote a simulation to explore the concurrency situation, identifying and hopefully remedying various deficiencies (including more general flexpage support). This new endeavour involves bringing this tested functionality back into L4Re, wiring it up with the IPC system, and testing using multithreaded clients.
I might also ask about support in L4Re for C++ threading constructs. When developing my simulation, I decided to use the standard C++ support for threading, mutexes, condition variables, and so on, as opposed to using pthread functionality. I imagine that C++ library support for these things just wraps the pthread functionality, but I wonder if there are not other considerations that I might need to be aware of.
Sorry if this is in any way vague, but since there isn't really any discussion about system architecture or implementation on this list with respect to these technologies (or really any technologies), I find myself experimenting with the available software to determine techniques and approaches that I hope will be viable. Nevertheless, any constructive remarks or advice would be greatly appreciated.
Thanks in advance,
Paul
Hi,
On Sunday, 07.02.2021, 01:54 +0100 Paul Boddie wrote:
What considerations need to be given to concurrency with regard to the use of the l4re_rm_attach function and similar functionality? Is access to the region mapper/manager from different threads in the same task advisable, are there thread safety issues, and if so, can these be sensibly worked around?
Calls to the region manager are IPC calls and are synchronized by a wait-queue in the kernel. The region manager thread only serves one client at a time. In fact, if you use multiple threads in a task, it already serves all page faults for all of them. It's thread-safe by design.
However, the capability slot allocator (l4re_util_cap_alloc) is not thread-safe. You need to introduce some synchronization if you use it from multiple threads.
Hope that helps.
- Christian
On Monday, 8 February 2021 09:54:08 CET Christian Ludwig wrote:
On Sunday, 07.02.2021, 01:54 +0100 Paul Boddie wrote:
What considerations need to be given to concurrency with regard to the use of the l4re_rm_attach function and similar functionality? Is access to the region mapper/manager from different threads in the same task advisable, are there thread safety issues, and if so, can these be sensibly worked around?
Calls to the region manager are IPC calls and are synchronized by a wait-queue in the kernel. The region manager thread only serves one client at a time. In fact, if you use multiple threads in a task, it already serves all page faults for all of them. It's thread-safe by design.
OK. I wasn't really sure about the concurrency aspects of the region manager itself. Some frustrating experience with my own code led me to question this rather than just take it for granted.
However, the capability slot allocator (l4re_util_cap_alloc) is not thread-safe. You need to introduce some synchronization if you use it from multiple threads.
This is very useful to know. I imagine that thread safety is regarded as unnecessary for many use-cases, but it does surprise me that it has to be introduced to make the allocator usable in a multithreaded program. Then again, I might be doing things that are not particularly normal for L4Re programs.
(There do not seem to be many systems based on L4Re out there for public perusal, nor is there much discussion around architectural patterns.)
Hope that helps.
Possibly more than you realise, but certainly more than I might have realised. :-)
Thanks for following up,
Paul
Hi Paul,
On 2/9/21 1:20 AM, Paul Boddie wrote:
On Monday, 8 February 2021 09:54:08 CET Christian Ludwig wrote:
On Sunday, 07.02.2021, 01:54 +0100 Paul Boddie wrote:
What considerations need to be given to concurrency with regard to the use of the l4re_rm_attach function and similar functionality? Is access to the region mapper/manager from different threads in the same task advisable, are there thread safety issues, and if so, can these be sensibly worked around?
Calls to the region manager are IPC calls and are synchronized by a wait-queue in the kernel. The region manager thread only serves one client at a time. In fact, if you use multiple threads in a task, it already serves all page faults for all of them. It's thread-safe by design.
OK. I wasn't really sure about the concurrency aspects of the region manager itself. Some frustrating experience with my own code led me to question this rather than just take it for granted.
However, the capability slot allocator (l4re_util_cap_alloc) is not thread-safe. You need to introduce some synchronization if you use it from multiple threads.
This is very useful to know. I imagine that thread safety is regarded as unnecessary for many use-cases, but it does surprise me that it has to be introduced to make the allocator usable in a multithreaded program. Then again, I might be doing things that are not particularly normal for L4Re programs.
I agree that an out-of-the-box mechanism would be helpful here and would prevent headaches in many situations, not least because you then wouldn't need to know about this at all. However, the (internal) discussion has not yet reached a consensus.
Once you know about this shortcoming, a thread-safety wrapper around cap_alloc is fortunately straightforward.
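For illustration, such a wrapper can be as simple as the following sketch (the function names are made up, and I'm using a pthread mutex here purely as an example; any other lock works just as well):

#include <pthread.h>
#include <l4/re/c/util/cap_alloc.h>

static pthread_mutex_t cap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a capability slot while holding the lock. */
l4_cap_idx_t my_cap_alloc(void)
{
  pthread_mutex_lock(&cap_lock);
  l4_cap_idx_t cap = l4re_util_cap_alloc();
  pthread_mutex_unlock(&cap_lock);
  return cap;
}

/* Return a slot (and unmap the object behind it) while holding the lock. */
void my_cap_free_um(l4_cap_idx_t cap)
{
  pthread_mutex_lock(&cap_lock);
  l4re_util_cap_free_um(cap);
  pthread_mutex_unlock(&cap_lock);
}

Every allocation and release in the task then has to go through these functions, of course, otherwise the lock does not help.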
What kind of resource limits apply to L4Re tasks? In my work, I am allocating capabilities for objects acting as dataspaces. Things seem to work fairly well up to a certain point, and then capability allocation seems to fail, and program failure is then not particularly graceful.
(I currently observe that creating 160 threads, each with its own IPC gate capability, seems to be unproblematic, but more than this will produce errors that manifest themselves either in pthread library code or in my own code where a critical section is supposed to be enforced.)
It looks like there is a per-task capability limit of 4096 for L4Re::Util::cap_alloc. 160 threads (2 caps each) + 160 IPC gates = 3*160 -> 480 caps
If you are handing out more than 3600 dataspaces, you might be running into this limit. Now I'm fuzzy on the details, so I'll just provide a general direction. I don't know if you can replenish the Util::cap_alloc by providing additional memory to it or if you need to set up your own allocator (see l4re-core/l4re/util/libs/cap_alloc.cc). For the latter, you can set it up with a bigger piece of memory to increase the number of managed capability slots.
On 2/7/21 1:54 AM, Paul Boddie wrote:
I might also ask about support in L4Re for C++ threading constructs. When developing my simulation, I decided to use the standard C++ support for threading, mutexes, condition variables, and so on, as opposed to using pthread functionality. I imagine that C++ library support for these things just wraps the pthread functionality, but I wonder if there are not other considerations that I might need to be aware of.
You are right, std::thread wraps libpthread. I guess you know about l4re-core/libstdc++-headers/include/thread-l4?
To my knowledge the behavior of the C++ constructs doesn't differ from the pthread ones. Be careful with std::thread::join(): the thread needs to cooperate and terminate/exit by itself. If it doesn't, your caller waits for a long time.
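A minimal illustration of the cooperation I mean, in plain standard C++ (nothing L4-specific; the stop flag is just one way of arranging it):

#include <atomic>
#include <thread>

std::atomic<bool> stop_requested{false};

void worker()
{
  while (!stop_requested.load())
  {
    /* ... serve requests ... */
  }
  /* returning from here lets join() below complete */
}

int main()
{
  std::thread t(worker);
  /* ... do other work ... */
  stop_requested.store(true);  // ask the worker to finish
  t.join();                    // blocks forever if the worker never returns
  return 0;
}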
If you directly interact with the UTCB or use low-level functions which set up the UTCB, make sure not to make any other function calls in between, as they might change the UTCB. (That's also a high scorer in the headache list.)
Cheers, Philipp
On Tuesday, 9 February 2021 13:26:19 CET Philipp Eppelt wrote:
On 2/9/21 1:20 AM, Paul Boddie wrote:
This is very useful to know. I imagine that thread safety is regarded as unnecessary for many use-cases, but it does surprise me that it has to be introduced to make the allocator usable in a multithreaded program. Then again, I might be doing things that are not particularly normal for L4Re programs.
I agree that an out-of-the-box mechanism would be helpful here and would prevent headaches in many situations, not least because you then wouldn't need to know about this at all. However, the (internal) discussion has not yet reached a consensus.
Once you know about this shortcoming, a thread-safety wrapper around cap_alloc is fortunately straightforward.
Indeed. I find myself writing my own wrapper functions, anyway. In this case, I just set up a semaphore to protect the allocation functions, having another library function for its initialisation.
What kind of resource limits apply to L4Re tasks? In my work, I am allocating capabilities for objects acting as dataspaces. Things seem to work fairly well up to a certain point, and then capability allocation seems to fail, and program failure is then not particularly graceful.
(I currently observe that creating 160 threads, each with its own IPC gate capability, seems to be unproblematic, but more than this will produce errors that manifest themselves either in pthread library code or in my own code where a critical section is supposed to be enforced.)
It looks like there is a per-task capability limit of 4096 for L4Re::Util::cap_alloc. 160 threads (2 caps each) + 160 IPC gates = 3*160 -> 480 caps
If you are handing out more than 3600 dataspaces, you might be running into this limit.
I don't think I should be remotely close to that. Just tracking the number of capabilities I'm allocating, however, suggests that there really are a few thousand capabilities allocated by the end of the program.
I do wonder about how capabilities might be allocated to support the concurrency management features of the standard C++ library. In principle, a mutex might be implemented by a semaphore whose capability would only be allocated once in a task's lifetime (or an object's lifetime for object-level locking), with the guard operating on the semaphore via the mutex abstraction.
There shouldn't be so many mutexes allocated, and therefore not so many capabilities involved, but this is one thing I could imagine being troublesome. Anyway, it's very possible that I am not freeing explicitly allocated capabilities in my own code, so I guess I should pay some attention to that and see what might be happening.
Now I'm fuzzy on the details, so I'll just provide a general direction. I don't know if you can replenish the Util::cap_alloc by providing additional memory to it or if you need to set up your own allocator (see l4re-core/l4re/util/libs/cap_alloc.cc). For the latter, you can set it up with a bigger piece of memory to increase the number of managed capability slots.
Maybe I don't understand everything, but I imagine that any increased memory area for capability slots would need to be accompanied by increased storage for kernel use (the "object space" mentioned in the online documentation), so that the association between capabilities and "kernel objects" can be maintained. A brief search suggests that l4_task_add_ku_mem might be related to this, but this is just a quick suggestion.
On 2/7/21 1:54 AM, Paul Boddie wrote:
I might also ask about support in L4Re for C++ threading constructs. When developing my simulation, I decided to use the standard C++ support for threading, mutexes, condition variables, and so on, as opposed to using pthread functionality. I imagine that C++ library support for these things just wraps the pthread functionality, but I wonder if there are not other considerations that I might need to be aware of.
You are right, std::thread wraps libpthread. I guess you know about l4re-core/libstdc++-headers/include/thread-l4?
Yes, I've used the pthread library elsewhere, since it is described in the documentation and is also appropriate for C language use.
To my knowledge the behavior of the C++ constructs doesn't differ from the pthread ones. Be careful with std::thread::join(): the thread needs to cooperate and terminate/exit by itself. If it doesn't, your caller waits for a long time.
Indeed. I tend to expect my threads to terminate, so any problems with joining should indicate other problems. One motivation for using the C++ constructs is that they permit more concise code than the pthread library does, and since I don't care about things like setting priorities, their use facilitated consideration of the issues involved rather than distracting from those issues.
(In fact, when I decided to review my previous work, I actually prototyped the mechanisms in Python, whose concurrency support is acceptable in terms of the constructs provided, if not the actual provision of the performance benefits expected with concurrency. I then implemented the same mechanisms in C++ to reassure me about the approaches I had taken.)
If you directly interact with the UTCB or use low-level functions which set up the UTCB, make sure not to make any other function calls in between, as they might change the UTCB. (That's also a high scorer in the headache list.)
I certainly know from experience that I have to be very careful around UTCB-modifying operations. Indeed, one thing that I do which may be performance-unfriendly but which helped my productivity greatly was to wrap the IPC mechanisms with abstractions and functions that stay well away from the UTCB.
Thanks for the advice and technical details!
Paul
On Wednesday, 10 February 2021 00:29:20 CET Paul Boddie wrote:
There shouldn't be so many mutexes allocated, and therefore not so many capabilities involved, but this is one thing I could imagine being troublesome. Anyway, it's very possible that I am not freeing explicitly allocated capabilities in my own code, so I guess I should pay some attention to that and see what might be happening.
The problem was actually in the way I was handling expected items in IPC calls. In my own IPC library, I allocate capability slots and set the buffer registers appropriately, which is what the IPC framework in L4Re seems to do.
However, the "server" function that initiates this allocation and handles incoming calls was not deallocating these slots upon terminating, and since I am starting and stopping a lot of servers, this accumulated slots rather quickly.
Anyway, I'm now back to debugging concurrency issues, and perhaps other resource issues, once again. But it was useful to review this particular issue and to try and venture beyond a basic proof of concept towards something more robust.
Paul
On Thursday, 11 February 2021 01:32:58 CET Paul Boddie wrote:
Anyway, I'm now back to debugging concurrency issues, and perhaps other resource issues, once again. But it was useful to review this particular issue and to try and venture beyond a basic proof of concept towards something more robust.
Sorry for repeating my usual bad habit of following up to myself with questions and observations, but maybe there are some insights to be shared. :-)
I have been exercising my code, discovering issues (of course), and have been seeing interesting behaviour with regard to attaching and detaching dataspaces to and from tasks. My code deliberately does this a lot, with each thread of a "client" task attaching to a distinct dataspace provided via a capability, accessing the dataspace, and then detaching the dataspace (and also releasing the capability).
What I now wonder about is what the region manager does when asked to detach a dataspace and what happens to any memory mappings established within that dataspace's region. Obviously, during the lifetime of the association, map requests will have caused flexpages to be transferred to the accessing task, and thus virtual memory mappings will be defined within the dataspace region.
My observations indicate that a dataspace may be associated with a particular base address, but if the dataspace is detached and then another is attached later, the same base address may be associated with the new dataspace. Intuitively, this should be expected (the region manager is merely reusing address space) and not be problematic. After all, the new dataspace is distinct from the old one, and the expectation should be that any traces of the old one should have been removed.
However, my observations also seem to indicate that having attached such a new dataspace, accesses may proceed without any page faults occurring (and thus without any map requests being made), with the accessed data being that made available by the previous dataspace. You can probably understand now why I am doing all this testing!
So, does the region manager request the invalidation of virtual memory mappings when dataspaces are detached, or is it the job of the task providing the dataspace to unmap the regions involved? Or is there a function that I have overlooked to make the region manager invalidate mappings?
(Currently, I just use l4re_rm_detach, but l4re_rm_detach_unmap seems to be equivalent when the current task is indicated. My dataspaces will unmap memory but only if it is not exported by another dataspace, since there are several in existence at any given time.)
Not detaching dataspaces prevents these data conflicts and suggests that the problem has something to do with dataspace management and not necessarily some implementation issue within the dataspace itself. However, not detaching dataspaces is obviously not a solution.
As always, any helpful suggestions or recommendations are very welcome!
Paul
Hi Paul,
I'm always happy to read from you.
Dataspaces and regions are two different things. The dataspace is a piece of memory, which you can divide up and map to different places. You tell the region manager which dataspace to consult in case of a page fault, and the region manager will map a piece (aka region) from the dataspace to the faulting address.
You can have multiple regions in the region manager referring to the same dataspace as backing memory. The region manager stores a start and size in your virtual address space, a dataspace cap and an offset into this dataspace, which corresponds to the start of your region.
Of course you can have multiple entries in the region mapper referring to the same dataspace and even the same offset.
Now the question is: what happens if you detach a region from the region mapper? To detach, you specify either just an address, or an address and a size. In the address-only case you can use any address within a region to unmap it, so this affects just a single region entry in the region manager. With the address-and-size call you can detach multiple regions at once. In the process, the regions' memory is unmapped from your local address space.
So if I understand your case correctly, you attach a dataspace to some region (a1, s1) -> (d1, o1) and later unmap this single region. This should unmap all mappings between a1 and a1+s1. However, if there are other regions attached, e.g. (a2, s2) -> (d1, o2), this will still remain and as soon as you unmap the d1-capability, you have stale entries in your region map.
To ensure you detach all regions in an area you can use detach(start, size, ds, task); however, I just noticed that there is no C version of the corresponding C++ interface.
To answer your question: If you invoke detach on a specific region, all mappings in this and only this region are unmapped. Other regions served by the same dataspace are not touched.
I'm still curious about your setup, because your experience seems to differ. Can you give me some info about your attach calls?
I hope this sheds some light and my assumptions on what you are doing are not too far fetched.
Cheers, Philipp
On 2/24/21 1:05 AM, Paul Boddie wrote:
On Thursday, 11 February 2021 01:32:58 CET Paul Boddie wrote:
Anyway, I'm now back to debugging concurrency issues, and perhaps other resource issues, once again. But it was useful to review this particular issue and to try and venture beyond a basic proof of concept towards something more robust.
Sorry for repeating my usual bad habit of following up to myself with questions and observations, but maybe there are some insights to be shared. :-)
I have been exercising my code, discovering issues (of course), and have been seeing interesting behaviour with regard to attaching and detaching dataspaces to and from tasks. My code deliberately does this a lot, with each thread of a "client" task attaching to a distinct dataspace provided via a capability, accessing the dataspace, and then detaching the dataspace (and also releasing the capability).
What I now wonder about is what the region manager does when asked to detach a dataspace and what happens to any memory mappings established within that dataspace's region. Obviously, during the lifetime of the association, map requests will have caused flexpages to be transferred to the accessing task, and thus virtual memory mappings will be defined within the dataspace region.
My observations indicate that a dataspace may be associated with a particular base address, but if the dataspace is detached and then another is attached later, the same base address may be associated with the new dataspace. Intuitively, this should be expected (the region manager is merely reusing address space) and not be problematic. After all, the new dataspace is distinct from the old one, and the expectation should be that any traces of the old one should have been removed.
However, my observations also seem to indicate that having attached such a new dataspace, accesses may proceed without any page faults occurring (and thus without any map requests being made), with the accessed data being that made available by the previous dataspace. You can probably understand now why I am doing all this testing!
So, does the region manager request the invalidation of virtual memory mappings when dataspaces are detached, or is it the job of the task providing the dataspace to unmap the regions involved? Or is there a function that I have overlooked to make the region manager invalidate mappings?
(Currently, I just use l4re_rm_detach, but l4re_rm_detach_unmap seems to be equivalent when the current task is indicated. My dataspaces will unmap memory but only if it is not exported by another dataspace, since there are several in existence at any given time.)
Not detaching dataspaces prevents these data conflicts and suggests that the problem has something to do with dataspace management and not necessarily some implementation issue within the dataspace itself. However, not detaching dataspaces is obviously not a solution.
As always, any helpful suggestions or recommendations are very welcome!
Paul
On Wednesday, 24 February 2021 11:40:15 CET Philipp Eppelt wrote:
I'm always happy to read from you.
That's nice to know!
Dataspaces and regions are two different things. The dataspace is a piece of memory, which you can divide up and map to different places. You tell the region manager which dataspace to consult in case of a page fault, and the region manager will map a piece (aka region) from the dataspace to the faulting address.
This seems to be consistent with my current understanding. When I talk about detaching dataspaces, it is just my lazy shorthand for talking about removing the association between the memory region situated at a given base address and the object acting as a dataspace.
Of course, upon a page fault, the dataspace presents a flexpage to the region manager to satisfy an access for some, if not all, of the region associated with the dataspace. Here, I interpret the span of any given flexpage as being part of a region, with an entire region corresponding to the entire span of memory associated with a dataspace. I hope that is correct.
You can have multiple regions in the region manager referring to the same dataspace as backing memory. The region manager stores a start and size in your virtual address space, a dataspace cap and an offset into this dataspace, which corresponds to the start of your region.
OK. In my case, I do have multiple regions but they tend to reference different dataspaces. However, these dataspaces may "export" the same memory pages. Something like this:
(a1, s1) -> d1 -> mem[o1:o1+s1]
(a2, s2) -> d2 -> mem[o2:o2+s2]
Here, d1 and d2 are provided by the same task. Offsets o1 and o2 are configured for the dataspaces in advance. Obviously, any offsets within the mapped regions are applied within the defined memory ranges.
I don't actually specify offsets into dataspaces when attaching them, but I could imagine that this would just change the offset parameter involved in map requests. This is different to my strategy where each dataspace is configured to export a particular region of something resembling a file.
Of course you can have multiple entries in the region mapper referring to the same dataspace and even the same offset.
I feel that this might be a useful alternative to having multiple dataspaces providing different "spans" within a file.
Now the question is: what happens if you detach a region from the region mapper? To detach, you specify either just an address, or an address and a size. In the address-only case you can use any address within a region to unmap it, so this affects just a single region entry in the region manager. With the address-and-size call you can detach multiple regions at once. In the process, the regions' memory is unmapped from your local address space.
If I understand this correctly, detaching a region that is being serviced by a dataspace will cause the virtual memory associations for addresses within that region to be invalidated. And just specifying the base address of that region should be sufficient to detach the region in its entirety.
So if I understand your case correctly, you attach a dataspace to some region (a1, s1) -> (d1, o1) and later unmap this single region. This should unmap all mappings between a1 and a1+s1.
This is what I would expect, yes. I detach the region...
l4re_rm_detach(a1)
...or...
l4re_rm_detach_unmap(a1, L4RE_THIS_TASK_CAP)
...and I also call l4re_util_cap_free_um on the dataspace capability. However, I am not completely familiar with the difference between l4re_util_cap_free and l4re_util_cap_free_um.
However, if there are other regions attached, e.g. (a2, s2) -> (d1, o2), this will still remain and as soon as you unmap the d1-capability, you have stale entries in your region map.
What happens when a task tries to access the memory within a2 to a2+s2? Are there virtual memory associations that may still provide access to the memory exported by the now-unmapped capability?
To ensure you detach all regions in an area you can use detach(start, size, ds, task); however, I just noticed that there is no C version of the corresponding C++ interface.
For my purposes, the size parameter would not be required if there is still a way to detach the entire region.
To answer your question: If you invoke detach on a specific region, all mappings in this and only this region are unmapped. Other regions served by the same dataspace are not touched.
My probably-incomplete mental model is that when my dataspaces send flexpage items in response to map requests, there is something that establishes a specific association between the eventual region and the memory being exported by a dataspace. So, in response to...
map(offset, hot_spot, flags)
...we somehow end up with...
(a1+offset, flexpage_size) -> mem[o1+offset:o1+offset+flexpage_size]
And I guess that there is a specific way for the region manager to request that this mapping be removed without other mappings to the same memory being affected.
(Meanwhile, I know from previous discussions and experimentation that I can indeed invalidate mappings for a given area of exported memory from within a dataspace, affecting all involved regions.)
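For reference, I imagine the choice of flexpage size in the map reply works roughly like the following sketch: grow the page while the backing address and the hot spot stay mutually aligned and the result still fits in the region. (The names are mine and purely illustrative; the only real L4 definitions used are the basic types and page-size constants, and I cap the size at a superpage just to keep the sketch simple.)

#include <l4/sys/types.h>
#include <l4/sys/consts.h>

/* Sketch: choose log2(size) for the flexpage answering a map request.
   'addr' is the (page-aligned) local address backing the faulting offset,
   'hot_spot' is the client's hot spot, and 'size_left' is how much of the
   region remains from that offset to its end. */
unsigned char fpage_order(l4_addr_t addr, l4_addr_t hot_spot,
                          unsigned long size_left)
{
  unsigned char order = L4_PAGESHIFT;

  while (order < L4_SUPERPAGESHIFT
         && (1UL << (order + 1)) <= size_left
         && ((addr | hot_spot) & ((1UL << (order + 1)) - 1)) == 0)
    ++order;

  return order;
}

/* The reply item would then be built from l4_fpage(addr, order, rights)
   together with a send base derived from the hot spot. */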
I'm still curious about your setup, because your experience seems to differ. Can you give me some info about your attach calls?
I tend to perform the attach operation as follows:
l4re_rm_attach(addr, size, L4RE_RM_SEARCH_ADDR, ds, 0, L4_PAGESHIFT)
This then yields a suitable address, as I would expect. So, I attach regions and eventually end up with something like this:
(a1, s1) -> d1 -> mem[o1:o1+s1]
Then I detach the region and attach another, getting the same base address:
(a1, s2) -> d2 -> mem[o2:o2+s2]
But it seems as if the memory accessible via the new region still exposes pages via the old mapping. This does not happen all the time, but only under certain conditions that I am trying to characterise.
I also saw it with a region that overlapped the old one instead of having precisely the same base address:
(a1+0x1000, s2) -> d2 -> mem[o2:o2+s2]
Here, an access to the new base of a1+0x1000 appeared to expose mem[o1+0x1000] instead of mem[o2].
I hope this sheds some light and my assumptions on what you are doing are not too far fetched.
I worry that what I am doing is actually the thing that is far fetched! It remains possible, as always, that I am doing something wrong and overlooking it. At the same time, I want to make sure that my understanding and expectations are correct.
Thanks for following up!
Paul
Hi Paul,
sorry for the long wait, I didn't find time last Friday.
On 2/24/21 10:10 PM, Paul Boddie wrote:
On Wednesday, 24 February 2021 11:40:15 CET Philipp Eppelt wrote:
I'm always happy to read from you.
That's nice to know!
Dataspaces and regions are two different things. The dataspace is a piece of memory, which you can divide up and map to different places. You tell the region manager which dataspace to consult in case of a page fault, and the region manager will map a piece (aka region) from the dataspace to the faulting address.
This seems to be consistent with my current understanding. When I talk about detaching dataspaces, it is just my lazy shorthand for talking about removing the association between the memory region situated at a given base address and the object acting as a dataspace.
Of course, upon a page fault, the dataspace presents a flexpage to the region manager to satisfy an access for some, if not all, of the region associated with the dataspace. Here, I interpret the span of any given flexpage as being part of a region, with an entire region corresponding to the entire span of memory associated with a dataspace. I hope that is correct.
Yes, correct. Just one detail: the dataspace is not the acting entity on a page fault; the region manager is. It calls ds->map(...) on the dataspace registered for the faulting address.
You can have multiple regions in the region manager referring to the same dataspace as backing memory. The region manager stores a start and size in your virtual address space, a dataspace cap and an offset into this dataspace, which corresponds to the start of your region.
OK. In my case, I do have multiple regions but they tend to reference different dataspaces. However, these dataspaces may "export" the same memory pages. Something like this:
(a1, s1) -> d1 -> mem[o1:o1+s1]
(a2, s2) -> d2 -> mem[o2:o2+s2]
Here, d1 and d2 are provided by the same task. Offsets o1 and o2 are configured for the dataspaces in advance. Obviously, any offsets within the mapped regions are applied within the defined memory ranges.
I don't actually specify offsets into dataspaces when attaching them, but I could imagine that this would just change the offset parameter involved in map requests. This is different to my strategy where each dataspace is configured to export a particular region of something resembling a file.
Sounds about right.
Of course you can have multiple entries in the region mapper referring to the same dataspace and even the same offset.
I feel that this might be a useful alternative to having multiple dataspaces providing different "spans" within a file.
Now the question is: what happens if you detach a region from the region mapper? To detach, you specify either just an address, or an address and a size. In the address-only case you can use any address within a region to unmap it, so this affects just a single region entry in the region manager. With the address-and-size call you can detach multiple regions at once. In the process, the regions' memory is unmapped from your local address space.
If I understand this correctly, detaching a region that is being serviced by a dataspace will cause the virtual memory associations for addresses within that region to be invalidated. And just specifying the base address of that region should be sufficient to detach the region in its entirety.
Yes.
So if I understand your case correctly, you attach a dataspace to some region (a1, s1) -> (d1, o1) and later unmap this single region. This should unmap all mappings between a1 and a1+s1.
This is what I would expect, yes. I detach the region...
l4re_rm_detach(a1)
...or...
l4re_rm_detach_unmap(a1, L4RE_THIS_TASK_CAP)
Yes, that's equivalent. In both cases the caps used should be valid (otherwise something is seriously wrong), and then the whole region is removed from your task's virtual address space.
...and I also call l4re_util_cap_free_um on the dataspace capability. However, I am not completely familiar with the difference between l4re_util_cap_free and l4re_util_cap_free_um.
The one returns the capability index to the capability allocator; the other returns the capability index and also unmaps the object that the capability referenced from the local object space, making it inaccessible to all capabilities in this task, even if they still reference this object. E.g. if 5, 7, and 9 reference an object O, and you call l4re_util_cap_free_um(9), then 5 and 7 will get an L4_IPC_ENOT_EXISTENT error when they try to access it.
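To spell out the unmap part: it is roughly (and only roughly - the precise flags are in the cap_alloc implementation, so treat this as an approximation) the combination of an object-space unmap and a plain free, something like:

#include <l4/sys/task.h>
#include <l4/re/env.h>
#include <l4/re/c/util/cap_alloc.h>

/* Approximation of the "unmap" variant: revoke the object behind the slot
   from this task's object space, then return the slot to the allocator. */
void free_and_unmap(l4_cap_idx_t cap)
{
  l4_task_unmap(L4RE_THIS_TASK_CAP,
                l4_obj_fpage(cap, 0, L4_CAP_FPAGE_RWS),
                L4_FP_ALL_SPACES);
  l4re_util_cap_free(cap);
}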
However, if there are other regions attached, e.g. (a2, s2) -> (d1, o2), this will still remain and as soon as you unmap the d1-capability, you have stale entries in your region map.
What happens when a task tries to access the memory within a2 to a2+s2? Are there virtual memory associations that may still provide access to the memory exported by the now-unmapped capability?
This I actually don't know. I'll investigate. I hope the mappings are gone and you'll get a page fault, though.
To answer your question: If you invoke detach on a specific region, all mappings in this and only this region are unmapped. Other regions served by the same dataspace are not touched.
My probably-incomplete mental model is that when my dataspaces send flexpage items in response to map requests, there is something that establishes a specific association between the eventual region and the memory being exported by a dataspace. So, in response to...
map(offset, hot_spot, flags)
...we somehow end up with...
(a1+offset, flexpage_size) -> mem[o1+offset:o1+offset+flexpage_size]
And I guess that there is a specific way for the region manager to request that this mapping be removed without other mappings to the same memory being affected.
I think the region mapper only operates on region granularity. So if you call detach on any address between [a1, a1+s1], the whole region and its memory are gone. As said before, if you have several regions attached from the same DS, just the referenced one is gone; the others are untouched.
(Meanwhile, I know from previous discussions and experimentation that I can indeed invalidate mappings for a given area of exported memory from within a dataspace, affecting all involved regions.)
I'm still curious about your setup, because your experience seems to differ. Can you give me some info about your attach calls?
I tend to perform the attach operation as follows:
l4re_rm_attach(addr, size, L4RE_RM_SEARCH_ADDR, ds, 0, L4_PAGESHIFT)
This then yields a suitable address, as I would expect. So, I attach regions and eventually end up with something like this:
(a1, s1) -> d1 -> mem[o1:o1+s1]
Then I detach the region and attach another, getting the same base address:
(a1, s2) -> d2 -> mem[o2:o2+s2]
But it seems as if the memory accessible via the new region still exposes pages via the old mapping. This does not happen all the time, but only under certain conditions that I am trying to characterise.
I also saw it with a region that overlapped the old one instead of having precisely the same base address:
(a1+0x1000, s2) -> d2 -> mem[o2:o2+s2]
Here, an access to the new base of a1+0x1000 appeared to expose mem[o1+0x1000] instead of mem[o2].
Are you certain that d1 and d2 are actually different dataspaces? Are you getting only d1 data or only d2 data? Are you getting a mix of d1 and d2 data?
Let me summarize the steps I think are necessary during the lifetime of the dataspace:
* Allocate a capability index for the dataspace
* Allocate the memory and receive the dataspace capability in the allocated index (see http://l4re.org/doc/classL4Re_1_1Mem__alloc.html#a44b301573ae859e8406400338c...) or something alike to get the mapping for the dataspace capability under the allocated capability index. (to be sure use: http://l4re.org/doc/group__l4__task__api.html#ga829a1b5cb4d5dba33ffee57534a5...)
* attach the dataspace to the region manager
* <use region/memory>
* detach region from the region manager
* unmap the dataspace capability using l4_task_unmap()
* return the capability index to the capability allocator
The last two steps are done by l4re_util_cap_free_um.
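Expressed with the C functions already mentioned in this thread, the sequence is roughly the following; receive_dataspace() is purely a placeholder for whatever IPC your own protocol performs to deliver the capability into the allocated slot, and error handling is omitted:

#include <l4/sys/consts.h>
#include <l4/re/c/rm.h>
#include <l4/re/c/util/cap_alloc.h>

void receive_dataspace(l4_cap_idx_t ds); /* placeholder: your own "open" IPC */

void dataspace_lifecycle(unsigned long size)
{
  l4_cap_idx_t ds = l4re_util_cap_alloc();        /* 1. allocate a slot       */
  receive_dataspace(ds);                          /* 2. obtain the capability */

  void *addr = 0;
  l4re_rm_attach(&addr, size, L4RE_RM_SEARCH_ADDR,
                 ds, 0, L4_PAGESHIFT);            /* 3. attach a region       */

  /* 4. use the memory at addr .. addr + size */

  l4re_rm_detach(addr);                           /* 5. detach the region     */
  l4re_util_cap_free_um(ds);                      /* 6.+7. unmap and free     */
}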
Hopefully, this helps you as a baseline. I'm a bit puzzled by the mem[o1+0x1000] case. I went through the code and I don't see how this can happen unless the "task" capability given to l4re_rm_detach_unmap is invalid; l4re_rm_detach, however, uses the correct capability. Which code version are you working on? Maybe I'm looking at the wrong code?
Cheers, Philipp
On Monday, 1 March 2021 21:30:11 CET Philipp Eppelt wrote:
sorry for the long wait, I didn't find time last Friday.
No worries: I very much appreciate the reply!
On 2/24/21 10:10 PM, Paul Boddie wrote:
Of course, upon a page fault, the dataspace presents a flexpage to the region manager to satisfy an access for some, if not all, of the region associated with the dataspace. Here, I interpret the span of any given flexpage as being part of a region, with an entire region corresponding to the entire span of memory associated with a dataspace. I hope that is correct.
Yes, correct. Just one detail: the dataspace is not the acting entity on a page fault; the region manager is. It calls ds->map(...) on the dataspace registered for the faulting address.
Yes, the fault occurs in the task, and the region manager is acting on behalf of the task to request a mapping from the dataspace. Again, as I hopefully understand things.
[...]
This is what I would expect, yes. I detach the region...
l4re_rm_detach(a1)
...or...
l4re_rm_detach_unmap(a1, L4RE_THIS_TASK_CAP)
Yes, that's equivalent. In both cases the caps used should be valid (otherwise something is seriously wrong), and then the whole region is removed from your task's virtual address space.
Understood.
...and I also call l4re_util_cap_free_um on the dataspace capability. However, I am not completely familiar with the difference between l4re_util_cap_free and l4re_util_cap_free_um.
The one returns the capability index to the capability allocator; the other returns the capability index and also unmaps the object that the capability referenced from the local object space, making it inaccessible to all capabilities in this task, even if they still reference this object. E.g. if 5, 7, and 9 reference an object O, and you call l4re_util_cap_free_um(9), then 5 and 7 will get an L4_IPC_ENOT_EXISTENT error when they try to access it.
In my case, then, I imagine that I will almost always want to unmap the object.
However, if there are other regions attached, e.g. (a2, s2) -> (d1, o2), this will still remain and as soon as you unmap the d1-capability, you have stale entries in your region map.
What happens when a task tries to access the memory within a2 to a2+s2? Are there virtual memory associations that may still provide access to the memory exported by the now-unmapped capability?
This I actually don't know. I'll investigate. I hope the mappings are gone and you'll get a page fault, though.
So do I. :-)
[Strange behaviour]
I also saw it with a region that overlapped the old one instead of having precisely the same base address:
(a1+0x1000, s2) -> d2 -> mem[o2:o2+s2]
Here, an access to the new base of a1+0x1000 appeared to expose mem[o1+0x1000] instead of mem[o2].
Are you certain that d1 and d2 are actually different dataspaces? Are you getting only d1 data or only d2 data? Are you getting a mix of d1 and d2 data?
It is, of course, always possible that I have been making a mistake - this being the usual discovery when I report strange behaviour - but the means of acquiring dataspaces d1 and d2 may involve distinct objects, and it involves creating further distinct objects to act as dataspaces. So, something like this would occur:
d1 = c1.open()
d2 = c2.open()
Here, c1 and c2 may even be the same object, but even then they should still allocate a new object for each invocation of the open operation, yielding two distinct dataspaces d1 and d2.
What I would observe is d1 data even after d2 was attached. I was somewhat confused as to whether d1 might still be active or not. But if it is, then d2 should not be allocated an address region coinciding with that of d1. If it isn't, then d2 should be unaffected by whatever d1 had been doing.
Let me summarize the steps I think are necessary during the lifetime of the dataspace:
- Allocate a capability index for the dataspace
- Allocate the memory and receive the dataspace capability in the
allocated index (see http://l4re.org/doc/classL4Re_1_1Mem__alloc.html#a44b301573ae859e8406400338cc8e924) or something alike to get the mapping for the dataspace capability under the allocated capability index. (to be sure use: http://l4re.org/doc/group__l4__task__api.html#ga829a1b5cb4d5dba33ffee57534a505af)
Do I need to use the memory allocation interface if the dataspace is sending flexpage items? I have previously used the l4re_ma functions (and possibly C++ equivalents) to allocate memory, but this was mostly useful for device drivers where physical addresses may need to be obtained for hardware peripheral usage, plus convenient sharing of entire memory regions between tasks without any of my tasks needing to act as dataspaces.
My strategy with this work is to implement paging by sending flexpage items to satisfy paging requests and thus provide a dataspace implementation. In the dataspace itself, I actually use posix_memalign to obtain memory, but that is ultimately going to be using l4re_ma functions at the lowest level, I imagine.
- attach the dataspace to the region manager
- <use region/memory>
- detach region from the region manager
- unmap the dataspace capability using l4_task_unmap()
- return the capability index to the capability allocator.
The last two steps are done by l4re_util_cap_free_um.
The other steps are consistent with my approach.
Hopefully, this helps you as a baseline. I'm a bit puzzled by the mem[o1+0x1000] case. I went through the code and I don't see how this can happen unless the "task" capability given to l4re_rm_detach_unmap is invalid; l4re_rm_detach, however, uses the correct capability. Which code version are you working on? Maybe I'm looking at the wrong code?
I'm still using the Subversion distribution (version 83) of L4Re. I know I should be following the different GitHub repositories but I find the Subversion distribution more convenient and I have not wanted to introduce too many different variables in my own experiments. Plus, it seems to be reliable enough for my needs.
Over the weekend, I tried to troubleshoot this issue and investigate the nature of it. I then retraced my steps, introducing wrapper functions around l4re_rm_attach and l4re_rm_detach to see if the region manager was giving out duplicate addresses. This seemed to indicate that it was indeed doing so. If I introduced synchronisation around the l4re_rm calls (effectively extending the synchronisation already in place around the STL data structure recording active regions), the observed problem went away.
Now, this is not consistent with what Christian wrote a few weeks ago, where he also noted that the capability slot allocator is not thread-safe, but I imagine that either my own code somehow uses the region manager API in a thread-unsafe way (although I cannot see exactly how that might be) or there is some element of using this API where a degree of "thread unsafety" exists. So, I have just added synchronisation around both the capability slot allocator and the region manager operations.
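For concreteness, the wrapper functions amount to nothing more than serialising the calls with a single lock; a sketch of the idea (using a pthread mutex here purely for illustration, with names of my own choosing):

#include <pthread.h>
#include <l4/re/c/rm.h>

static pthread_mutex_t rm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Serialise attach operations across threads. */
long locked_rm_attach(void **start, unsigned long size, unsigned long flags,
                      l4re_ds_t ds, l4_addr_t offs, unsigned char align)
{
  pthread_mutex_lock(&rm_lock);
  long err = l4re_rm_attach(start, size, flags, ds, offs, align);
  pthread_mutex_unlock(&rm_lock);
  return err;
}

/* Serialise detach operations with the same lock. */
long locked_rm_detach(void *addr)
{
  pthread_mutex_lock(&rm_lock);
  long err = l4re_rm_detach(addr);
  pthread_mutex_unlock(&rm_lock);
  return err;
}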
At some point in hopefully not too long, I aim to bundle this work up once again (since it exists in a much more rough state from an earlier iteration) and then anyone suitably interested can see what I have been doing wrong all along. For now, though, I hope that I may be able to continue to work around whatever the problem might be.
I hope these observations are at the very least informative, if not particularly helpful.
Thanks once again for your advice!
Paul
Hi Paul,
On 3/2/21 12:42 AM, Paul Boddie wrote:
On Monday, 1 March 2021 21:30:11 CET Philipp Eppelt wrote:
On 2/24/21 10:10 PM, Paul Boddie wrote:
[...]
However, if there are other regions attached, e.g. (a2, s2) -> (d1, o2), this will still remain and as soon as you unmap the d1-capability, you have stale entries in your region map.
What happens when a task tries to access the memory within a2 to a2+s2? Are there virtual memory associations that may still provide access to the memory exported by the now-unmapped capability?
This I actually don't know. I'll investigate. I hope the mappings are gone and you'll get a page fault, though.
So do I. :-)
[Strange behaviour]
This was actually wrong. So assume you get a DS capability from some other task. Then you use the DS cap to get mappings from the dataspace - either through rm->attach() and page faults, or through direct ds->map() calls. As a result you have several mappings in your task.
Then you unmap the DS cap from your object space. And ... nothing happens, or does it? You might still have access to the memory mappings, or you might not. You didn't unmap the memory from your address space, but someone else - the dataspace provider - might destroy the DS in its address space and unmap the corresponding memory, leading to the removal of the memory in all other tasks (i.e. removing the branch from the mapping tree).
Applied to our example above:
* l4re_rm_detach(a1): (a1, s1) -> (d1, o1) is gone.
* free_um(d1)
* the region map still contains (a2, s2) -> (d1, o2): page faults will fail, but if the memory was already mapped, it might still be there.
I also saw it with a region that overlapped the old one instead of having precisely the same base address:
(a1+0x1000, s2) -> d2 -> mem[o2:o2+s2]
Here, an access to the new base of a1+0x1000 appeared to expose mem[o1+0x1000] instead of mem[o2].
Are you certain that d1 and d2 are actually different dataspaces? Are you getting only d1 data or only d2 data? Are you getting a mix of d1 and d2 data?
It is, of course, always possible that I have been making a mistake - this being the usual discovery when I report strange behaviour - but the means of acquiring dataspaces d1 and d2 may involve distinct objects, and it involves creating further distinct objects to act as dataspaces. So, something like this would occur:
d1 = c1.open()
d2 = c2.open()
Here, c1 and c2 may even be the same object, but even then they should still allocate a new object for each invocation of the open operation, yielding two distinct dataspaces d1 and d2.
What I would observe is d1 data even after d2 was attached. I was somewhat confused as to whether d1 might still be active or not. But if it is, then d2 should not be allocated an address region coinciding with that of d1. If it isn't, then d2 should be unaffected by whatever d1 had been doing.
Let me summarize the steps I think are necessary during the lifetime of the dataspace:
- Allocate a capability index for the dataspace
- Allocate the memory and receive the dataspace capability in the
allocated index (see http://l4re.org/doc/classL4Re_1_1Mem__alloc.html#a44b301573ae859e8406400338cc8e924) or something alike to get the mapping for the dataspace capability under the allocated capability index. (to be sure use: http://l4re.org/doc/group__l4__task__api.html#ga829a1b5cb4d5dba33ffee57534a505af)
Do I need to use the memory allocation interface if the dataspace is sending flexpage items? I have previously used the l4re_ma functions (and possibly C++ equivalents) to allocate memory, but this was mostly useful for device drivers where physical addresses may need to be obtained for hardware peripheral usage, plus convenient sharing of entire memory regions between tasks without any of my tasks needing to act as dataspaces.
My strategy with this work is to implement paging by sending flexpage items to satisfy paging requests and thus provide a dataspace implementation. In the dataspace itself, I actually use posix_memalign to obtain memory, but that is ultimately going to be using l4re_ma functions at the lowest level, I imagine.
No, I used Mem_alloc as an example of how to obtain an actual capability behind your allocated index. If you get the capability mapping by other means, this is fine.
Hopefully, this helps you as a baseline. I'm a bit puzzled by the mem[o1+0x1000] case. I went through the code and I don't see how this can happen unless the "task" capability given to l4re_rm_detach_unmap is invalid; l4re_rm_detach, however, uses the correct capability. Which code version are you working on? Maybe I'm looking at the wrong code?
I'm still using the Subversion distribution (version 83) of L4Re. I know I should be following the different GitHub repositories but I find the Subversion distribution more convenient and I have not wanted to introduce too many different variables in my own experiments. Plus, it seems to be reliable enough for my needs.
No worries, SVN is fine.
Over the weekend, I tried to troubleshoot this issue and investigate the nature of it. I then retraced my steps, introducing wrapper functions around l4re_rm_attach and l4re_rm_detach to see if the region manager was giving out duplicate addresses. This seemed to indicate that it was indeed doing so. If I introduced synchronisation around the l4re_rm calls (effectively extending the synchronisation already in place around the STL data structure recording active regions), the observed problem went away.
Now, this is not consistent with what Christian wrote a few weeks ago, where he also noted that the capability slot allocator is not thread-safe, but I imagine that either my own code somehow uses the region manager API in a thread-unsafe way (although I cannot see exactly how that might be) or there is some element of using this API where a degree of "thread unsafety" exists. So, I have just added synchronisation around both the capability slot allocator and the region manager operations.
Thread safety again. Nothing springs to mind, but this is certainly interesting. I'll mull over it a bit.
Cheers, Philipp