On Wed Jan 23, 2008 at 11:39:49 +0100, Marc CHALAND wrote:
2008/1/22, Adam Lackorzynski adam@os.inf.tu-dresden.de:
The thread should exist, looking at the code path in the semaphore lib. The thread_id used for the prio-get is coming from an ipc-wait so the thread should be there (in the sense there's a thread at all). The thread-lib might have some other state than ACTIVE for this thread.
OK, I agree and I understand :).
If you look in the debugger, the thread is there and its state is fine?
This happens when we do heavy testing. The task has a standard structure: a manager thread and several worker threads. This message appears on workers. As a worker doesn't live long enough, it is difficult for me to catch info about those threads. Perhaps I should try to modify the semaphore lib and add an enter_kdebug()?
If it happens, is it a permanent error or just sporadic? Does it happen during some setup phase or after that when things have settled?
This happens only sporadically, and only after several minutes of heavy testing. As we have only one testing infra on which the problem occurs, we will do more investigation after the l4rm synchro is validated.
On Wed Jan 23, 2008 at 15:04:33 +0100, Marc CHALAND wrote:
Here is some more info about this log: each time this log appears, the state of the thread in jdb is, for example:
17.0d (deleted) a0 17.01 rcv,ipc_progr
Backtrace of each thread is mainly : l4rm_detach l4th_pages_free __do_cleanup_and_block
We observed one thread with the following backtrace : __modify_region l4rm_detach l4th_pages_free __do_cleanup_and_block
If I understand correctly, the thread is nearly dead but it still tries to do semaphore operations, doesn't it?
No, the thread is still alive, the '(deleted)' just means that it has already been unregistered at the name service. Actually, the thread will also not go away, it will just sleep. So it is basically always there. Could you verify the theory that the threadlib has some strange state at that time? That is, check whether l4thread_get_prio is returning -L4_EINVAL and, if so, what the value of l4th_tcbs[thread].state in l4th_tcb_get is (include/__tcb.h).
Adam
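The check suggested above could be sketched roughly as follows. This is only an illustration that compiles outside the L4Env tree: every sketch_* name, the state values other than ACTIVE, and the error constant are hypothetical stand-ins, not the thread library's real definitions (those are in include/__tcb.h). It also assumes, as theorized in the thread, that the priority lookup fails when the TCB state is not ACTIVE.

```c
#include <stdio.h>

/* Hypothetical stand-ins so the sketch builds without the L4Env headers. */
#define SKETCH_EINVAL      22   /* placeholder for L4_EINVAL */
#define SKETCH_MAX_THREADS 8

/* Only ACTIVE is named in the thread; the other states are invented. */
enum sketch_state { SKETCH_UNUSED = 0, SKETCH_ACTIVE, SKETCH_SHUTDOWN };

struct sketch_tcb {
    enum sketch_state state;
    int prio;
};

/* Stand-in for the l4th_tcbs table. */
struct sketch_tcb sketch_tcbs[SKETCH_MAX_THREADS];

/* Models the assumed behavior: the lookup fails unless the TCB is ACTIVE. */
int sketch_get_prio(int thread)
{
    if (thread < 0 || thread >= SKETCH_MAX_THREADS)
        return -SKETCH_EINVAL;
    if (sketch_tcbs[thread].state != SKETCH_ACTIVE)
        return -SKETCH_EINVAL;
    return sketch_tcbs[thread].prio;
}

/*
 * Diagnostic wrapper: when the lookup fails, record and print the raw TCB
 * state. *state_out is written only on failure (-1 for an out-of-range id).
 */
int sketch_checked_get_prio(int thread, int *state_out)
{
    int prio = sketch_get_prio(thread);

    if (prio == -SKETCH_EINVAL) {
        int state = -1;
        if (thread >= 0 && thread < SKETCH_MAX_THREADS)
            state = (int)sketch_tcbs[thread].state;
        if (state_out)
            *state_out = state;
        fprintf(stderr, "get_prio failed for thread %d, tcb state = %d\n",
                thread, state);
        /* In the real semaphore lib, an enter_kdebug() call could go here
         * to stop in jdb while the worker is still around. */
    }
    return prio;
}
```

Printing the state at the point of failure avoids having to catch a short-lived worker by hand in the debugger.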