On Monday, 11 November 2024 10:49:50 CET Philipp Eppelt wrote:
Yes. The l4re_kernel has a static 64K heap for its own allocations. So far this has been enough and when it is exhausted, it's usually a sign of something going wrong somewhere else.
As indeed was the case here!
How long is your application running? Is it doing a lot of (small) allocations and deallocations?
The application is a filesystem server that should run indefinitely, although I found that I would run out of heap after maybe tens of thousands of allocations of the offending object.
The l4re_kernel maintains the memory map for your application. For each entry in this map memory from the heap is used. So in case of a lot of small allocations this map grows and might use all of the heap. However, I would also expect a visible slowdown in your application on each memory de/allocation. Are you seeing any of this?
In fact, I didn't notice any slowdown, but this is an interesting detail that is certainly worth further consideration.
Are there any convenient ways of monitoring memory allocation in L4Re?
Off the top of my head: try the `l4re_dbg` flags. Add `l4re_dbg = 0xff` to your application's startup in ned, as a parameter next to the caps table. Example: https://github.com/kernkonzept/mk/blob/master/conf/examples/x86-fb.cfg#L25
As it has been a few months since I have been able to look at L4Re in any depth, I had already forgotten about the debug flags...
You can also trigger a `debug_dump` of the l4re_kernel's memory map via the Debug_obj::debug() function. I'd try `cap_cast`ing the `L4Re::Env::env()->rm()`/l4re_kernel cap to `L4Re::Debug_obj` and then calling the `debug()` function. The `function` parameter is unused in the l4re_kernel (l4re-core/l4re_kernel/server/src/region.cc), so zero or a dummy value should be ok.
...as well as the debug function!
So it is a way, but is it a convenient one? Well... ;-)
Well, firstly I appreciate your response, reminding me as always of the different options for investigating problems. What I found mildly helpful was to use the rather more mundane malloc_stats function just to see how quickly the heap was being exhausted, inserting calls before and after the functionality that I suspected of misbehaviour.
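For illustration, bracketing a suspect piece of code with malloc_stats might look roughly like the sketch below. The names `with_malloc_stats`, `suspect_operation` and `leak_sink` are hypothetical and not taken from my code, and this assumes a C library that provides malloc_stats in `<malloc.h>` (it is an extension, not standard C++). The statistics go to stderr.

```cpp
#include <malloc.h>   // malloc_stats(): a C library extension, not standard C++
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the functionality under suspicion:
// it leaks one small block per call.
void *volatile leak_sink;

void suspect_operation()
{
  leak_sink = std::malloc(64);  // deliberately never freed
}

// Print allocator statistics to stderr before and after running
// `fn` the given number of times, so the two dumps can be compared.
template <typename Fn>
void with_malloc_stats(const char *label, unsigned rounds, Fn fn)
{
  assert(label != nullptr);

  std::fprintf(stderr, "--- before %s ---\n", label);
  malloc_stats();

  for (unsigned i = 0; i < rounds; ++i)
    fn();

  std::fprintf(stderr, "--- after %s ---\n", label);
  malloc_stats();
}
```

Calling `with_malloc_stats("suspect", 10000, suspect_operation)` should show the "in use" figure growing between the two dumps if the wrapped code leaks, and staying roughly constant if it does not.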
Although that didn't help identify the cause directly, it did prompt me to review the parts of my code where I was allocating memory, and it naturally led me to a place where I had introduced what might be called a "convenience" allocation in an object that can be given a memory pool to use, but where a default pool can be created if no pool is supplied.
Of course, it helps to deallocate any default pool when the object goes away, but this had been overlooked in my broader efforts to prototype the system and get something working. Introducing the corresponding deallocation seems to have eliminated the main (and hopefully only) culprit.
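To make the pattern concrete, here is a minimal sketch of an object that accepts an optional pool and creates a default one otherwise. The `Pool` and `Resource` classes are hypothetical simplifications, not my actual code, and the `live` counter exists only to make the leak observable. The fix amounts to remembering whether the pool was created internally and deleting it in the destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical pool type, for illustration only.
class Pool
{
public:
  static int live;  // number of Pool instances currently alive

  Pool() { ++live; }
  ~Pool() { --live; }

  void *allocate(std::size_t n) { return std::malloc(n); }
  void release(void *p) { std::free(p); }
};

int Pool::live = 0;

// Hypothetical object that can be given a pool, or creates its own.
class Resource
{
  Pool *_pool;
  bool _owns_pool;  // true only if the default pool was created here

public:
  explicit Resource(Pool *pool = nullptr)
  : _pool(pool ? pool : new Pool), _owns_pool(pool == nullptr)
  {
    assert(_pool != nullptr);
  }

  // The step that was overlooked: release the default pool, but never
  // a pool that was supplied by (and still belongs to) the caller.
  ~Resource()
  {
    if (_owns_pool)
      delete _pool;
  }

  // Copying would lead to a double delete of an owned pool.
  Resource(const Resource &) = delete;
  Resource &operator=(const Resource &) = delete;

  Pool *pool() const { return _pool; }
};
```

Without the destructor, every `Resource` constructed without a pool would leave its default pool behind, which is the kind of slow leak that only shows up after many thousands of objects.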
As part of this exercise, I briefly looked into the AddressSanitizer and LeakSanitizer functionality that should be available in gcc. [1] However, it seems that the Debian compilers, at least, are not able to introduce the sanitizer functionality into compiled binaries, perhaps due to my use of static binaries, and so I get linker errors when using the appropriate build flags. [2]
Although this isn't of immediate interest, I did wonder if the L4Re developers had not already evaluated such functionality and might already be using it in some sense.
Thanks once again for responding to my queries!
Paul
[1] https://github.com/google/sanitizers
[2] https://www.osc.edu/resources/getting_started/howto/howto_use_address_sanitizer