Hi,
first of all: sorry for my late answer, I've been away...
On Thu, Aug 04, 2005 at 03:09:12PM +0200, Adam Lackorzynski wrote:
On Thu Aug 04, 2005 at 13:45:12 +0200, Rene Wittmann wrote:
I also have the problem that I want to output some of my symbols and their values during the execution of my application. Furthermore, I'm working in the RT scheduling mode (with APIC one-shot mode).
I have three timeslices:
- mandatory
- optional
- "mandatory" (if part of optional was executed)
The preempter recognizes timeslice overruns and sets a global variable; the working thread checks this variable and therefore knows about those "events".
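The overrun signalling described above could be sketched roughly as follows. This is only an illustration in C11: the names (overrun_mask, preempter_note_overrun, worker_take_overrun) are made up, and how the preempter actually learns about an overrun from the kernel is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names; the kernel notification that wakes the preempter
 * is not shown here. */
enum { SLICE_MANDATORY = 0, SLICE_OPTIONAL = 1, SLICE_MANDATORY2 = 2, SLICE_COUNT };

static_assert(SLICE_COUNT <= 32, "one bit per timeslice must fit in the mask");

static atomic_uint overrun_mask;  /* bit i set: timeslice i overran */

/* Preempter thread: record an overrun for the given timeslice. */
static void preempter_note_overrun(unsigned slice)
{
    atomic_fetch_or(&overrun_mask, 1u << slice);
}

/* Worker thread: test and clear the overrun bit for one timeslice.
 * Returns true if an overrun was recorded since the last check. */
static bool worker_take_overrun(unsigned slice)
{
    unsigned bit = 1u << slice;
    unsigned old = atomic_fetch_and(&overrun_mask, ~bit);
    return (old & bit) != 0;
}
```

Using an atomic read-modify-write instead of a plain global avoids losing an overrun that is recorded while the worker is clearing the flag.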
This works fine if my priorities are set to 50,30,50, but fails after a few periods with a page fault if I set the priorities to 200,190,200.
In each part I use some calls to LOG() to output a few things I'm interested in. If I leave out those LOG() calls, it also works with the high priorities (200,190,200). So I assume there's a LOG buffer overflow (as the LOG priority is not that high)?
Could I prevent this by using printf instead of LOG, or is LOG just a kind of macro for printf?
printf is usually also a log function.
The priorities of the logserver are at 32, so in the first example they're in between and in the latter one completely below.
Does the PF go away when you add --prio 210 (or similar) as an argument to log?
Yes, it goes away. But I thought that, if you have a real-time application running it will not be influenced by some "time-sharing" application as the logserver should be.
Can you send me a backtrace with unstripped binaries when the PF happens? Thanks.
Sure, I can if you tell me how to do it...
The exact error message is:
app| L4RM: [PF] write at 0x01306f58, eip 0380e48b, src 9.02
app| [9.0] l4rm/lib/src/pagefault.c:78:__unknown_pf():
app| unhandled page fault
And then it enters the kernel debugger. I've compiled my program with "CFLAGS = -g" so it should be unstripped, right?
But when typing "bt<CR>" in the kernel debugger, there are no symbols and the output just looks like this
#4 f000763b
#5 f0027332
#6 f000e718
#7 f000627a
#8 f000e1fd
#9 f000627a
#10 f000e1fd
#11 f000acb0
#12 f0005f6d
#13 f000627a
#14 f0006280
#15 f0005c3f
#16 f000627a
#17 f0006280
#18 f0005aec
#19 f999c5f5
#20 f00176ed
#21 f0006d54
#22 f0027300
#23 f000cfa5
#24 f0006a5a
#25 f000784b
#26 f00272dc
So I don't expect there is any helpful information in there, or is there?
BTW: my grub config looks like this:
title llv1_decode (all layers, log locally)
root (hd0,2)
kernel /drops/bin/x86_586/l4v2/rmgr -sigma0 modaddr 0x08000000
module /l4/kernel/fiasco/build/fiasco_apic -nokdb -nowait -serial -serial_esc
module /drops/bin/x86_586/l4v2/sigma0
module /drops/bin/x86_586/l4v2/log_net -net -local -ip 131.188.36.211 -buffer 4096
module /drops/bin/x86_586/l4v2/names
module /drops/bin/x86_586/l4v2/dm_phys
module /drops/bin/x86_586/l4v2/simple_ts
module /drops/bin/x86_586/l4v2/app
OK, this modaddr seems huge, but my data is just linked to the binary and I need this huge value...
So how would one produce a useful backtrace inside fiasco? Or is there a way to use another debugger (gdb?)?
Greetings, Rene
[...]
Yes, it goes away. But I thought that, if you have a real-time application running it will not be influenced by some "time-sharing" application as the logserver should be.
Just a short answer from me: This is of course only true if you don't use these services! How should the system protect you if you call a non-realtime service???
Btw. using the logserver or any other printf-style output inside a real-time loop is a *very* bad idea. These services usually use very slow output devices (serial line, text-mode display).
As a general rule: You must decouple the output between the output service and your rt-app. Of course, this cannot be achieved with synchronous IPC to a non-rt service ...
One would typically use a memory buffer and either collect the data and output it *after* the experiment or use an asynchronous memory protocol to transfer the information to another io service. (Note: do not use LOG here, as it also slows down the rt-part of the system because it causes serious slowdown in hardware IO. Jork once measured a lot more than one µs per character!)
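A minimal sketch of the "collect now, output after the experiment" variant, assuming a single real-time writer. All names here (rt_trace_put, rt_trace_dump, RT_TRACE_CAP) are hypothetical and are not the rt_mon or grtmon interface:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the rt_mon or grtmon API. */
#define RT_TRACE_CAP 1024u  /* number of entries, must be a power of two */
static_assert((RT_TRACE_CAP & (RT_TRACE_CAP - 1u)) == 0u,
              "RT_TRACE_CAP must be a power of two");

struct rt_trace_entry {
    uint32_t id;     /* which value was recorded */
    uint32_t value;  /* the recorded value */
};

static struct rt_trace_entry rt_trace_buf[RT_TRACE_CAP];
static uint32_t rt_trace_count;  /* total entries ever written */

/* Called from the real-time loop: no IPC, no I/O, constant time.
 * Once the buffer is full, the oldest entries are overwritten. */
static void rt_trace_put(uint32_t id, uint32_t value)
{
    struct rt_trace_entry *e =
        &rt_trace_buf[rt_trace_count & (RT_TRACE_CAP - 1u)];
    e->id = id;
    e->value = value;
    rt_trace_count++;  /* wrap-around after 2^32 entries is ignored here */
}

/* Called after the experiment, outside the real-time part.
 * Prints the retained entries oldest first and returns how many. */
static uint32_t rt_trace_dump(FILE *out)
{
    uint32_t n = rt_trace_count < RT_TRACE_CAP ? rt_trace_count : RT_TRACE_CAP;
    uint32_t first = rt_trace_count - n;
    for (uint32_t i = 0; i < n; i++) {
        const struct rt_trace_entry *e =
            &rt_trace_buf[(first + i) & (RT_TRACE_CAP - 1u)];
        fprintf(out, "%u: id=%u value=%u\n",
                (unsigned)(first + i), (unsigned)e->id, (unsigned)e->value);
    }
    return n;
}
```

The real-time loop would only call rt_trace_put(); rt_trace_dump() would run once the measurement is over, so the slow output path never sits inside a timeslice.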
You might want to have a look in the "rt_mon" or the "grtmon" package, which have been used with some success in similar situations. Both packages should have demo applications available and for grtmon there is a Diploma Thesis (A Generalized Approach to Runtime Monitoring for Real-Time Systems) available from:
http://os.inf.tu-dresden.de/project/finished/finished.xml.de
Hope this helps ...
Martin Pohlack
On Monday 08 August 2005 21:57, Martin Pohlack wrote:
As a general rule: You must decouple the output between the output service and your rt-app. Of course, this cannot be achieved with synchronous IPC to a non-rt service ...
Ack.
One would typically use a memory buffer and either collect the data and output it *after* the experiment or use an asynchronous memory protocol to transfer the information to another io service. (Note: do not use LOG here, as it also slows down the rt-part of the system because it causes serious slowdown in hardware IO. Jork once measured a lot more than one µs per character!)
The logserver is normally built with "serial support" which means that it performs output to the serial console itself. Previous implementations of the logserver used the kernel debugging interface which induced the bad behaviour Martin is talking about (because of disabling the interrupts for character output).
The logserver with serial support does not influence real-time applications.
Frank
On Mon Aug 08, 2005 at 20:28:16 +0200, Rene Wittmann wrote:
Yes, it goes away. But I thought that, if you have a real-time application running it will not be influenced by some "time-sharing" application as the logserver should be.
Well, if you use it... see Martin's reply...
Can you send me a backtrace with unstripped binaries when the PF happens? Thanks.
Sure, I can if you tell me how to do it...
The exact error message is:
app| L4RM: [PF] write at 0x01306f58, eip 0380e48b, src 9.02
app| [9.0] l4rm/lib/src/pagefault.c:78:__unknown_pf():
app| unhandled page fault
This message alone doesn't help me much as I need to know where in terms of source code it happened. The EIP alone is only useful to find the position in the binary.
And then it enters the kernel debugger. I've compiled my program with "CFLAGS = -g" so it should be unstripped, right?
Binaries are stripped on install; this is configured in the configuration menu in the l4 directory. Just disable it there and relink your binary.
But when typing "bt<CR>" in the kernel debugger, there are no symbols and the output just looks like this
#4 f000763b
#5 f0027332
#6 f000e718
#7 f000627a
#8 f000e1fd
#9 f000627a
#10 f000e1fd
#11 f000acb0
#12 f0005f6d
#13 f000627a
#14 f0006280
#15 f0005c3f
#16 f000627a
#17 f0006280
#18 f0005aec
#19 f999c5f5
#20 f00176ed
#21 f0006d54
#22 f0027300
#23 f000cfa5
#24 f0006a5a
#25 f000784b
#26 f00272dc
So I don't expect there is any helpful information in there, or is there?
Indeed. ;)
Additionally, this is a kernel backtrace; I want to see the other one.
BTW: my grub config looks like this:
title llv1_decode (all layers, log locally)
root (hd0,2)
kernel /drops/bin/x86_586/l4v2/rmgr -sigma0 modaddr 0x08000000
module /l4/kernel/fiasco/build/fiasco_apic -nokdb -nowait -serial -serial_esc
module /drops/bin/x86_586/l4v2/sigma0
module /drops/bin/x86_586/l4v2/log_net -net -local -ip 131.188.36.211 -buffer 4096
module /drops/bin/x86_586/l4v2/names
module /drops/bin/x86_586/l4v2/dm_phys
module /drops/bin/x86_586/l4v2/simple_ts
module /drops/bin/x86_586/l4v2/app
OK, this modaddr seems huge, but my data is just linked to the binary and I need this huge value...
If it works, fine.
So how would one produce a useful backtrace inside fiasco? Or is there a way to use another debugger (gdb?)?
You can also use 'objdump -ldS app' and post the code around the pagefault EIP, so you know where it happens. But a backtrace is useful too.
Adam
Yes, it goes away. But I thought that, if you have a real-time application running it will not be influenced by some "time-sharing" application as the logserver should be.
Well, if you use it... see Martin's reply...
OK, I thought the LOG server was exactly the thing I was searching for; I didn't expect that LOG() uses _synchronous_ IPC. But that explains the behaviour.
As Martin said, logging is really slow, so I now just store my results in a buffer, and that works fine for me.
I was using the LOG-functions because they are also used in a RT-example in your "hello"-package...
You can also use 'objdump -ldS app' and post the code around the pagefault EIP, so you know where it happens. But a backtrace is useful too.
Thanks for your helpful reply. I found where the error resides: one of my functions was trying to write outside the boundaries of an array. The strange thing is that it only occurs on VMware. When I used the LOG functions, everything slowed down and got totally mixed up. When testing on real hardware the error didn't occur, even when setting the periods absurdly small. So thanks, Adam, for the hints. And sorry for bothering you ;-)
Regards, Rene
l4-hackers@os.inf.tu-dresden.de