On Mon, 27 Oct 2008 01:06:43 +0100, Adam Lackorzynski wrote:
On Thu Oct 23, 2008 at 21:54:08 +1300, Valery V. Sedletski wrote:
On Wed, 22 Oct 2008 19:12:59 +0200, Adam Lackorzynski wrote:
...skipped...
So, this is iret instruction somewhere on return from a syscall, if I understood correctly. Maybe, my Fiasco config is wrong?
Rather not. Can you tell me at which position the f.2 is in user-land? That would be interesting.
This is the result of executing "tf.2" in kernel debugger:
thread: f.02 <001e0801> prio: a0 mcp: ff mode: Con state: 001 ready %dl 001e0804 into wait for: polling: rcv descr: lcked by: ---.-- timeout : cpu time: 2.000 ms timeslice: 4000/10000 µs pager : f.00 cap: ---.-- utcb: ffe34200 preemptr: 0.00 not monitored ready lnk: ???.?? 5.00 prsnt lnk: f.04 f.00 EAX=00000000 ESI=001e0801 DS=0023 EBX=a0051454 EDI=00000000 ES=0023 iret ECX=0000000d EBP=afeffe40 GS=0043 lea 0x0(%esi),%esi EDX=afeffe58 ESP=c03c17ec SS=0010 trap 13 (General Protection), error 00000070, from kernel mode CS=0008 EIP=f003622b EFlags=00203246 001e0815 cli c03c17ec f003622b 00000008 00203246 [eacff102] 00000073 00203296 afeffe00 0000007b 001e0818 into
The [eacff102] dword was highlighted in the backtrace. As I understand it, this marks the current stack frame. The previous frame, eip=0xf003622b, cs=0x8, eflags=0x00203246, holds the eip, cs and eflags values taken off the stack by the previous iret instruction. The current iret must take off the next frame: eip=0xeacff102, cs=0x73, eflags=0x00203296. The selector cs=0x73 is ring 3 and eip is 0xeacff102.
Also, the disassembly at eip=0xeacff102:
eacff100 int $0x31
eacff102 ret
-- Is this the syscall entry?
<001e0801> -- What is it? If it is the address from which int 0x32 is called, then the instructions at this point are:
001e07fe dec %dl
001e0800 into
001e0801 cli
001e0802 dec %dl
001e0804 into
001e0805 cli
The address 001e0800 (into) is not in the code segment of the vmlinux executable (there is no such address in the objdump output). So what is it? A corrupt stack? Or is it not a return address but something different?
X is very demanding wrt permissions. More recent version should behave better though.
And the question: what does cli instruction do? I guess this must trigger an I/O page fault for I/O pages to be mapped? Am I right?
CLI disables interrupts. Allowing this in user-land is a major security breach.
I know. But in ioport.c you do CLI and then STI immediately. What does that do?
I guess not. Which address? Anyway, I guess it should ioremap those...
....
Some of those addresses are quite low but unfortunately I don't have an idea what they're used for.
The Cardbus controller uses these addresses, according to windows Device manager:
0xb8008000-0xb8008fff 0xfebff000-0xfebfffff 0xfabff000-0xfebfefff 0x000db000-0x000dbfff
and modem itself uses none, only i/o and irq
Also, I searched the logs and found this piece:
Oct 21 04:54:14 localhost kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Oct 21 04:54:14 localhost kernel: cs: memory probe 0xb8000000-0xb80fffff:__l4x_ioremap: Requested region at b8010000 [0x1000 Bytes]
Oct 21 04:54:14 localhost kernel: cs: memory probe 0x70000000-0x700fffff: excluding 0x70000000-0x700fffff
Oct 21 04:54:14 localhost kernel: cs: warning: no high memory space available!
(This was in the log at info level; there are also error and warning levels.) So, as I understand it, this is not fatal. There are also ioremaps here, so they are present.
WBR, valery
On Tue Oct 28, 2008 at 02:53:27 +1200, Valery V. Sedletski wrote:
On Mon, 27 Oct 2008 01:06:43 +0100, Adam Lackorzynski wrote:
On Thu Oct 23, 2008 at 21:54:08 +1300, Valery V. Sedletski wrote:
On Wed, 22 Oct 2008 19:12:59 +0200, Adam Lackorzynski wrote:
...skipped...
So, this is iret instruction somewhere on return from a syscall, if I understood correctly. Maybe, my Fiasco config is wrong?
Rather not. Can you tell me at which position the f.2 is in user-land? That would be interesting.
This is the result of executing "tf.2" in kernel debugger:
thread: f.02 <001e0801> prio: a0 mcp: ff mode: Con state: 001 ready %dl 001e0804 into wait for: polling: rcv descr: lcked by: ---.-- timeout : cpu time: 2.000 ms timeslice: 4000/10000 µs pager : f.00 cap: ---.-- utcb: ffe34200 preemptr: 0.00 not monitored ready lnk: ???.?? 5.00 prsnt lnk: f.04 f.00 EAX=00000000 ESI=001e0801 DS=0023 EBX=a0051454 EDI=00000000 ES=0023 iret ECX=0000000d EBP=afeffe40 GS=0043 lea 0x0(%esi),%esi EDX=afeffe58 ESP=c03c17ec SS=0010 trap 13 (General Protection), error 00000070, from kernel mode CS=0008 EIP=f003622b EFlags=00203246 001e0815 cli c03c17ec f003622b 00000008 00203246 [eacff102] 00000073 00203296 afeffe00 0000007b 001e0818 into
the [eacff102] dword was highlighted in backtrace. So, as I understood this, this marks current stack frame. Here we see that previous stack frame is ip=0xf003622b, cs=0x8, eflags=0x00203246 is respectively ip, cs and eflags taken off from the stack by previous iret instruction. Current iret must take off the next frame: eip=0xeacff102, cs=0x73, eflags=0x00203296. Here we can see that selector cs=0x73 is ring3 and eip is 0xeacff102.
Yep, ok, the cs value is bogus. Now we need to find out why; cs should be 1b. (8 is kernel code, so that is ok.) Maybe I will find the time to reproduce this, but on the other hand it does not happen when the IOPL3 option is off, right?
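As a cross-check of the selector values discussed here, the fields of an x86 segment selector, and of the selector-format error code pushed with a general protection fault, can be decoded by hand. The following is a minimal Python sketch; the function names are made up for illustration, and it only performs the bit arithmetic defined by the x86 architecture. It does not inspect Fiasco's actual GDT.

```python
def decode_selector(sel):
    """Split a 16-bit x86 segment selector into its architectural fields."""
    return {
        "index": sel >> 3,                       # descriptor index, bits 3..15
        "table": "LDT" if sel & 0x4 else "GDT",  # TI bit, bit 2
        "rpl": sel & 0x3,                        # requested privilege level, bits 0..1
    }


def decode_selector_error(code):
    """Split a selector-format exception error code (as pushed with #GP)."""
    return {
        "ext": bool(code & 0x1),        # set if an external event caused the fault
        "idt": bool(code & 0x2),        # set if the index refers to the IDT
        "ti": (code >> 2) & 0x1,        # if idt is clear: 0 = GDT, 1 = LDT
        "index": (code >> 3) & 0x1FFF,  # descriptor index
    }


if __name__ == "__main__":
    # Values taken from the "tf.2" dump quoted in this thread.
    for name, sel in (("bogus cs", 0x73), ("expected cs", 0x1B), ("kernel cs", 0x08)):
        print(name, hex(sel), decode_selector(sel))
    print("error code", hex(0x70), decode_selector_error(0x70))
```

With the values from the dump, cs=0x73 decodes to GDT index 14 with RPL 3, the expected 0x1b decodes to GDT index 3 with RPL 3, and the error code 0x70 also names GDT index 14, so the fault appears to refer to the same descriptor that the bogus cs selects.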
And also, unassembly at eip=0xeacff102:
eacff100 int $0x31 eacff102 ret
-- An entering the syscall?
Yes, this is from the syscall page (kernel entry code is there).
<001e0801> -- What is it? If it is an address from where int 0x32 is called then instructions at this
an address 001e0800 (into) is not in CS segment of vmlinux executable (no such address in objdump output) So, what is it? Corrupt stack? Or it is not a return address but something different?
It's the thread-ID, so nothing to disassemble :)
X is very demanding wrt permissions. More recent version should behave better though.
And the question: what does cli instruction do? I guess this must trigger an I/O page fault for I/O pages to be mapped? Am I right?
CLI disables interrupts. Allowing this in user-land is a major security breach.
I know. But in ioport.c you do CLI and then STI immediately. What it does?
It tries to find out the IOPL, which only works if the option in the kernel is enabled. That's why it was commented out.
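For reference, the IOPL lives in bits 12-13 of EFLAGS, and in protected mode CLI/STI execute without a fault only when the current privilege level is numerically no greater than the IOPL. Below is a small Python sketch of that check, using the EFLAGS value from the stack dump earlier in the thread. The helper names are hypothetical, and the VME/PVI special cases are ignored.

```python
IF_BIT = 1 << 9      # interrupt enable flag
IOPL_SHIFT = 12      # IOPL occupies bits 12..13
IOPL_MASK = 0x3


def decode_eflags(value):
    """Extract the fields relevant to CLI/STI from an EFLAGS value."""
    return {
        "iopl": (value >> IOPL_SHIFT) & IOPL_MASK,
        "interrupts_enabled": bool(value & IF_BIT),
    }


def cli_permitted(cpl, eflags):
    """CLI/STI do not fault when CPL <= IOPL (basic protected-mode rule)."""
    return cpl <= decode_eflags(eflags)["iopl"]


if __name__ == "__main__":
    user_eflags = 0x00203296  # EFLAGS of the ring-3 frame in the dump
    print(decode_eflags(user_eflags))
    print("CLI allowed at CPL 3:", cli_permitted(3, user_eflags))
```

The user-mode frame in the dump carries IOPL 3, which is consistent with the kernel's IOPL3 option being enabled: at that level a ring-3 task can execute CLI and STI directly, which is what the probe in ioport.c relies on.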
I guess not. Which address? Anyway, I guess it should ioremap those...
....
Some of those addresses are quite low but unfortunately I don't have an idea what they're used for.
The Cardbus controller uses these addresses, according to windows Device manager:
0xb8008000-0xb8008fff 0xfebff000-0xfebfffff 0xfabff000-0xfebfefff 0x000db000-0x000dbfff
and modem itself uses none, only i/o and irq
Also, I searched logs and found that piece:
Oct 21 04:54:14 localhost kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Oct 21 04:54:14 localhost kernel: cs: memory probe 0xb8000000-0xb80fffff:__l4x_ioremap: Requested region at b8010000 [0x1000 Bytes]
Oct 21 04:54:14 localhost kernel: cs: memory probe 0x70000000-0x700fffff: excluding 0x70000000-0x700fffff
Oct 21 04:54:14 localhost kernel: cs: warning: no high memory space available!
(This was in the log with info level (there are also error and warning levels)), so, as I understood, this is not fatal. Also here are ioremaps, so, they are present.
Looks like this stuff is doing something quite special, at least when looking at the code that is producing those log lines. I fear it's not easily doable to get this going.
Adam
On Tue, 28 Oct 2008 17:48:44 +0100, Adam Lackorzynski wrote:
Rather not. Can you tell me at which position the f.2 is in user-land? That would be interesting.
This is the result of executing "tf.2" in kernel debugger:
thread: f.02 <001e0801> prio: a0 mcp: ff mode: Con state: 001 ready %dl 001e0804 into wait for: polling: rcv descr: lcked by: ---.-- timeout : cpu time: 2.000 ms timeslice: 4000/10000 µs pager : f.00 cap: ---.-- utcb: ffe34200 preemptr: 0.00 not monitored ready lnk: ???.?? 5.00 prsnt lnk: f.04 f.00 EAX=00000000 ESI=001e0801 DS=0023 EBX=a0051454 EDI=00000000 ES=0023 iret ECX=0000000d EBP=afeffe40 GS=0043 lea 0x0(%esi),%esi EDX=afeffe58 ESP=c03c17ec SS=0010 trap 13 (General Protection), error 00000070, from kernel mode CS=0008 EIP=f003622b EFlags=00203246 001e0815 cli c03c17ec f003622b 00000008 00203246 [eacff102] 00000073 00203296 afeffe00 0000007b 001e0818 into
the [eacff102] dword was highlighted in backtrace. So, as I understood this, this marks current stack frame. Here we see that previous stack frame is ip=0xf003622b, cs=0x8, eflags=0x00203246 is respectively ip, cs and eflags taken off from the stack by previous iret instruction. Current iret must take off the next frame: eip=0xeacff102, cs=0x73, eflags=0x00203296. Here we can see that selector cs=0x73 is ring3 and eip is 0xeacff102.
Yep, ok, cs value is bogus. Now we need to find out why, cs should be 1b. (8 is kernel code, so ok.) Maybe I find the time to reproduce this but on the other side it does not happen when the IOPL3 option is off, right?
CS is incorrect? That's why it traps... I tried to set a breakpoint before TRAP d in Linux (in ioport.c, where it writes "Got %d out of %d I/O ports"). I set "-wait" on the Fiasco command line, and when it broke into the kernel debugger after Fiasco started, I tried to set the breakpoint:
bi addr=4054ec b+ bpn=1 br bpn=1 task=f
But after I typed "g" in the debugger, it didn't stop at the breakpoint; it only broke at TRAP d. What am I doing wrong? Maybe bi addr=4054ec sets the breakpoint at a physical address, not at a linear address of task f? How can I then set it at a linear address (or convert linear to physical)?
PS: If needed, I can make a bootable ISO image with my setup. Is it needed?
an address 001e0800 (into) is not in CS segment of vmlinux executable (no such address in objdump output) So, what is it? Corrupt stack? Or it is not a return address but something different?
It's the thread-ID, so nothing to disassemble :)
Yes, I thought of it, but was not sure :)
I guess not. Which address? Anyway, I guess it should ioremap those...
....
Some of those addresses are quite low but unfortunately I don't have an idea what they're used for.
The Cardbus controller uses these addresses, according to windows Device manager:
0xb8008000-0xb8008fff 0xfebff000-0xfebfffff 0xfabff000-0xfebfefff 0x000db000-0x000dbfff
and modem itself uses none, only i/o and irq
Also, I searched logs and found that piece:
Oct 21 04:54:14 localhost kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Oct 21 04:54:14 localhost kernel: cs: memory probe 0xb8000000-0xb80fffff:__l4x_ioremap: Requested region at b8010000 [0x1000 Bytes]
Oct 21 04:54:14 localhost kernel: cs: memory probe 0x70000000-0x700fffff: excluding 0x70000000-0x700fffff
Oct 21 04:54:14 localhost kernel: cs: warning: no high memory space available!
(This was in the log with info level (there are also error and warning levels)), so, as I understood, this is not fatal. Also here are ioremaps, so, they are present.
Looks like this stuff is doing something quite special, at least when looking at the code that is producing those log lines. I fear it's not easily doable to get this going.
Hmm... Sad. -- I wanted to experiment with it while the inet access was working, but it turns out that I can't. Interestingly, though, the bluetooth comport is working, and so is GPRS access to the internet through it. So I could try this through bluetooth and my phone, if not PCMCIA. Bad but not worst ;) Anyway, I can try the pcmcia modem again once TRAP d is resolved; the warning about high memory space may not be so important, and it may work. Or not?
PS: I searched the L4Linux sources for the string "no high memory space" and found nothing. Strange. Could you point me to where it is, please?
On Wed Oct 29, 2008 at 16:25:49 +1300, Valery V. Sedletski wrote:
On Tue, 28 Oct 2008 17:48:44 +0100, Adam Lackorzynski wrote:
Rather not. Can you tell me at which position the f.2 is in user-land? That would be interesting.
This is the result of executing "tf.2" in kernel debugger:
thread: f.02 <001e0801> prio: a0 mcp: ff mode: Con state: 001 ready %dl 001e0804 into wait for: polling: rcv descr: lcked by: ---.-- timeout : cpu time: 2.000 ms timeslice: 4000/10000 µs pager : f.00 cap: ---.-- utcb: ffe34200 preemptr: 0.00 not monitored ready lnk: ???.?? 5.00 prsnt lnk: f.04 f.00 EAX=00000000 ESI=001e0801 DS=0023 EBX=a0051454 EDI=00000000 ES=0023 iret ECX=0000000d EBP=afeffe40 GS=0043 lea 0x0(%esi),%esi EDX=afeffe58 ESP=c03c17ec SS=0010 trap 13 (General Protection), error 00000070, from kernel mode CS=0008 EIP=f003622b EFlags=00203246 001e0815 cli c03c17ec f003622b 00000008 00203246 [eacff102] 00000073 00203296 afeffe00 0000007b 001e0818 into
the [eacff102] dword was highlighted in backtrace. So, as I understood this, this marks current stack frame. Here we see that previous stack frame is ip=0xf003622b, cs=0x8, eflags=0x00203246 is respectively ip, cs and eflags taken off from the stack by previous iret instruction. Current iret must take off the next frame: eip=0xeacff102, cs=0x73, eflags=0x00203296. Here we can see that selector cs=0x73 is ring3 and eip is 0xeacff102.
Yep, ok, cs value is bogus. Now we need to find out why, cs should be 1b. (8 is kernel code, so ok.) Maybe I find the time to reproduce this but on the other side it does not happen when the IOPL3 option is off, right?
CS is incorrect? That's why it traps... I tried to set breakpoint before TRAP d in linux (in ioport.c when it writes "Got %d out of %d I/O ports") -- set "-wait" in fiasco command line and when it broke in kernel debugger after fiasco started, tried to set breakpoint:
bi addr=4054ec b+ bpn=1 br bpn=1 task=f
But after I said "g" in the debugger it didn't stopped at the breakpoint, only broke in TRAP d. What I do incorrectly? Maybe, bi addr=4054ec sets the breakpoint at phys address, not to linear of task f? How then I can set to linear (or convert linear to physical?).
Breakpoint addresses are virtual.
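To illustrate what "virtual" means for an address like 4054ec: on 32-bit x86 with plain 4 KiB pages (an assumption here; PAE and 4 MiB superpages split differently), the MMU divides a linear address into a page-directory index, a page-table index and a page offset. The Python sketch below only does that split. Finding the physical frame would still require reading task f's page tables, which this does not attempt, and the function name is made up.

```python
PAGE_SHIFT = 12        # 4 KiB pages
TABLE_BITS = 10        # 1024 entries per page table / page directory
OFFSET_MASK = (1 << PAGE_SHIFT) - 1
INDEX_MASK = (1 << TABLE_BITS) - 1


def split_linear(addr):
    """Return (directory index, table index, page offset) for a 32-bit linear address."""
    directory = (addr >> (PAGE_SHIFT + TABLE_BITS)) & INDEX_MASK
    table = (addr >> PAGE_SHIFT) & INDEX_MASK
    offset = addr & OFFSET_MASK
    return directory, table, offset


if __name__ == "__main__":
    # Breakpoint address used in the "bi addr=4054ec" command above.
    directory, table, offset = split_linear(0x4054EC)
    print(f"pde={directory} pte={table} offset={offset:#x}")
```

Because the split depends only on the address, the same virtual address 4054ec can map to different physical frames in different tasks, which is why the task the breakpoint is restricted to matters.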
Anyway, I found the reason. L4Linux had so many privileges that it was able to write MSRs, which allowed it to set up the Sysenter MSRs. Normally this is trapped and handled. The kernel allows writing MSRs for special-purpose stuff, and this was triggered here. The wrmsr's are in enable_sep_cpu in arch/x86/vdso/vdso32-setup.c; if you comment them out it should work.
I guess not. Which address? Anyway, I guess it should ioremap those...
....
Some of those addresses are quite low but unfortunately I don't have an idea what they're used for.
The Cardbus controller uses these addresses, according to windows Device manager:
0xb8008000-0xb8008fff 0xfebff000-0xfebfffff 0xfabff000-0xfebfefff 0x000db000-0x000dbfff
and modem itself uses none, only i/o and irq
Also, I searched logs and found that piece:
Oct 21 04:54:14 localhost kernel: Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
Oct 21 04:54:14 localhost kernel: cs: memory probe 0xb8000000-0xb80fffff:__l4x_ioremap: Requested region at b8010000 [0x1000 Bytes]
Oct 21 04:54:14 localhost kernel: cs: memory probe 0x70000000-0x700fffff: excluding 0x70000000-0x700fffff
Oct 21 04:54:14 localhost kernel: cs: warning: no high memory space available!
(This was in the log with info level (there are also error and warning levels)), so, as I understood, this is not fatal. Also here are ioremaps, so, they are present.
Looks like this stuff is doing something quite special, at least when looking at the code that is producing those log lines. I fear it's not easily doable to get this going.
Hmm... Sad. -- I wanted to experiment with it while the inet access working, but I turns that I can't. But interestingly, bluetooth comport is working and GPRS access to internet through it either. So, I could try this through bluetooth and my phone, if not PCMCIA. Bad but not worst ;) Anyway, I can try pcmcia modem again when TRAP d will be resolved, a warning with High memory access may be not so important and it will work. Or not?
I would guess not.
PS: I searched L4linux sources for string "no high memory space" and found nothing. Strange. Could you point me where it is, please?
It's in drivers/pcmcia/rsrc_nonstatic.c:434
Adam
bi addr=4054ec b+ bpn=1 br bpn=1 task=f
But after I said "g" in the debugger it didn't stopped at the breakpoint, only broke in TRAP d. What I do incorrectly? Maybe, bi addr=4054ec sets the breakpoint at phys address, not to linear of task f? How then I can set to linear (or convert linear to physical?).
Breakpoint addresses are virtual.
Anyway, I found the reason. L4Linux had so many privileges that it was able to write MSRs, which allowed it to set up the Sysenter MSRs. Normally this is trapped and handled. The kernel allows writing MSRs for special-purpose stuff, and this was triggered here. The wrmsr's are in enable_sep_cpu in arch/x86/vdso/vdso32-setup.c; if you comment them out it should work.
Good. Thank you very much, I am very glad to see it works now ;) I commented out that fragment and recompiled. Now Linux starts without problems. But another problem appeared: when I start X, the following message appears in the log:
l4lx | [F.1] semaphore/lib/src/semaphore.c:339:l4semaphore_thread():
l4lx | Error: L4semaphore: ignored request from other task (1B.00, I'm F.01
l4lx : )!
Linux stops with a black screen. :( The run program shows that the application with task Id = 0x1b is a Linux application (it says "owner: f.4"). It seems to be the X server. I start the X server from a drops-fp.rd ramdisk image from the TUD site. The problem seems to be that the X server runs as another task and Linux accepts requests only from its own threads.
WBR, valery
On Mon Nov 03, 2008 at 06:48:45 +1200, valerius wrote:
bi addr=4054ec b+ bpn=1 br bpn=1 task=f
But after I said "g" in the debugger it didn't stopped at the breakpoint, only broke in TRAP d. What I do incorrectly? Maybe, bi addr=4054ec sets the breakpoint at phys address, not to linear of task f? How then I can set to linear (or convert linear to physical?).
Breakpoint addresses are virtual.
Anyway, I found the reason. L4Linux had so many privileges that it was able to write MSRs, which allowed it to set up the Sysenter MSRs. Normally this is trapped and handled. The kernel allows writing MSRs for special-purpose stuff, and this was triggered here. The wrmsr's are in enable_sep_cpu in arch/x86/vdso/vdso32-setup.c; if you comment them out it should work.
Good. Thank you very much, I am very glad to see it now works ;) I commented out that fragment and recompiled. Now Linux starts without problems. But another problem appeared: when I start X, the following message appears in log:
l4lx | [F.1] semaphore/lib/src/semaphore.c:339:l4semaphore_thread(): l4lx | Error: L4semaphore: ignored request from other task (1B.00, I'm F.01 l4lx : )!
Linux stops with black screen. :( The run program shows that application with task Id = 0x1b is Linux application (it says, "owner: f.4"). It seems it is X server. I start X server from a drops-fp.rd ramdisk image from TUD site. The problem seems to be that X server runs as another task and Linux accepts only requests from its threads.
Do you mean drops-rd.rd? There's no X server in there. What did you use? The above message would mean that the X server is talking to the L4Linux-server which is a no-go except the special X driver is used.
Adam
On Tuesday 04 November 2008 09:44:43, Adam Lackorzynski wrote:
l4lx | [F.1] semaphore/lib/src/semaphore.c:339:l4semaphore_thread(): l4lx | Error: L4semaphore: ignored request from other task (1B.00, I'm F.01 l4lx : )!
Linux stops with black screen. :( The run program shows that application with task Id = 0x1b is Linux application (it says, "owner: f.4"). It seems it is X server. I start X server from a drops-fp.rd ramdisk image from TUD site. The problem seems to be that X server runs as another task and Linux accepts only requests from its threads.
Do you mean drops-rd.rd? There's no X server in there. What did you use?
as I checked this, it's the same as drops-x.rd (comparing two files with cmp gives no difference) from DROPS/TUD:OS demo CD (I downloaded it about 1 year ago).
The above message would mean that the X server is talking to the L4Linux-server which is a no-go except the special X driver is used.
I enabled X server stub in L4Linux configuration (a checkbox "Support for X Window System driver" is enabled). And nitovlwm server is started. What could I miss then?
WBR, valery
On Tue Nov 04, 2008 at 14:32:50 +1200, valerius wrote:
On Tuesday 04 November 2008 09:44:43, Adam Lackorzynski wrote:
l4lx | [F.1] semaphore/lib/src/semaphore.c:339:l4semaphore_thread(): l4lx | Error: L4semaphore: ignored request from other task (1B.00, I'm F.01 l4lx : )!
Linux stops with black screen. :( The run program shows that application with task Id = 0x1b is Linux application (it says, "owner: f.4"). It seems it is X server. I start X server from a drops-fp.rd ramdisk image from TUD site. The problem seems to be that X server runs as another task and Linux accepts only requests from its threads.
Do you mean drops-rd.rd? There's no X server in there. What did you use?
as I checked this, it's the same as drops-x.rd (comparing two files with cmp gives no difference) from DROPS/TUD:OS demo CD (I downloaded it about 1 year ago).
Ok.
The above message would mean that the X server is talking to the L4Linux-server which is a no-go except the special X driver is used.
I enabled X server stub in L4Linux configuration (a checkbox "Support for X Window System driver" is enabled). And nitovlwm server is started. What could I miss then?
This setup is not an easy one to set up; it takes quite an effort even for me to get it running. I recommend using the X 'fbdev' driver, which basically just works out of the box. It isn't as CPU friendly as it could be, but the handling is easy. Using a special X driver is better in several ways but requires some care.
Adam
On Wed, 5 Nov 2008 22:25:19 +0100, Adam Lackorzynski wrote:
The above message would mean that the X server is talking to the L4Linux-server which is a no-go except the special X driver is used.
I enabled X server stub in L4Linux configuration (a checkbox "Support for X Window System driver" is enabled). And nitovlwm server is started. What could I miss then?
This setup is not the easiest one and not easy to setup, it also requires quite an effort for me to get this running. I recommend using the X 'fbdev' driver which basically just works out of the box. It isn't as CPU friendly as it could be but the handling is easy. Using a special X driver is better in several ways but requires some care.
Yes, I know that this setup is not easy. For the overlay_wm driver to work it is necessary to apply patches to the X server sources, as I read. But I thought it must work with the Linux server and the ordinary l4con/dope framebuffer driver. So, it won't, ok. -- When I checked on a real machine (not with the ramdisk), L4Linux started and the X server starts OK.
But again there is a new problem -- with udev. The first time, for some reason, udev didn't start and the /sys filesystem was not mounted (there was some problem with the initial ramdisk). When I mounted /sys manually, created all device nodes manually and started udevd from the command line, almost everything worked -- bluetooth, usb, etc. But when I made the initrd properly, I got the error "Kernel panic: not syncing, Out of memory and no killable processes" when starting udev. I can't copy/paste the boot log, as nothing was saved in the logs. How can I report this problem? I also can't redirect the output to a file, as the system is not syncing (I booted with init=/bin/sh successfully, and then the panic message appears if I launch /sbin/start_udev from the command line).
I tried to upgrade udev to the latest version (I have ver. 1.14 and the newest one is 1.30), but with no effect. I also searched for this error message on Google and found a few references to mailing lists where Fedora and Slackware users mentioned a similar problem -- it was not with udev itself but with recent kernels: 2.6.24 was OK, but 2.6.25 and later had these problems. So I decided to try downgrading the L4Linux kernel. I tried many revisions from SVN, from 2.6.27 down to 2.6.22, with no success -- Linux was either panicking or hanging when starting udev.
Finally, I tried the latest Linux kernels (2.6.27 and 2.6.28) without L4 -- they boot successfully and don't panic when starting udev. So I concluded that this is an L4Linux-related problem, as with ordinary Linux kernels the problem seems to disappear.
So, should I report this as a bug? And what additional information can I give to help solve the problem? I have the Mandriva 2008.0 distribution installed with Linux kernel 2.6.22. It is installed on a notebook with 2 GB of memory and a 1.7 GHz Pentium M (with Centrino) processor. I tried giving the Linux server 800 to 1800 MB of memory and tried playing with the initrd size, but with no effect. My notebook has no serial port, so I can't save the log. But I can install Linux on my desktop machine, if needed. It has a serial port and a serial cable.
WBR, valery
Hello Valery,
On Tue, Nov 11, 2008 at 01:20:20AM +1200, Valery V. Sedletski wrote:
My notebook has no serial port, so I can't save the log. But I can install linux to my desktop machine, if needed. It has a serial port and serial cable.
What system is running on your desktop? You could use any other terminal program, e.g., HyperTerminal on Windows, store the log, and send it to the list.
Cheers
Sorry, I should have read your post with more attention. Please ignore my answer :(
On Mon, Nov 10, 2008 at 02:06:39PM +0100, Christian Helmuth wrote:
Hello Valery,
On Tue, Nov 11, 2008 at 01:20:20AM +1200, Valery V. Sedletski wrote:
My notebook has no serial port, so I can't save the log. But I can install linux to my desktop machine, if needed. It has a serial port and serial cable.
What system is running on your desktop? You could use any other terminal program, e.g., HyperTerminal on Windows, store the log, and send it to the list.
Cheers
Christian Helmuth Genode Labs
http://www.genode-labs.com/ · http://genode.org/
l4-hackers mailing list l4-hackers@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
On Mon, 10 Nov 2008 14:11:10 +0100, Christian Helmuth wrote:
Sorry, I should have read your post with more attention. Please ignore my answer :(
On Mon, Nov 10, 2008 at 02:06:39PM +0100, Christian Helmuth wrote:
Hello Valery,
On Tue, Nov 11, 2008 at 01:20:20AM +1200, Valery V. Sedletski wrote:
My notebook has no serial port, so I can't save the log. But I can install linux to my desktop machine, if needed. It has a serial port and serial cable.
What system is running on your desktop? You could use any other terminal program, e.g., HyperTerminal on Windows, store the log, and send it to the list.
Yes, I have no serial port on my notebook, but I have one on the desktop. So I decided to test on the desktop; I think I can reproduce it on a different machine. And yes, I have a second (third) machine with a terminal program, so this is not a problem.
On Tue Nov 11, 2008 at 01:20:20 +1200, Valery V. Sedletski wrote:
On Wed, 5 Nov 2008 22:25:19 +0100, Adam Lackorzynski wrote:
The above message would mean that the X server is talking to the L4Linux-server which is a no-go except the special X driver is used.
I enabled X server stub in L4Linux configuration (a checkbox "Support for X Window System driver" is enabled). And nitovlwm server is started. What could I miss then?
This setup is not the easiest one and not easy to setup, it also requires quite an effort for me to get this running. I recommend using the X 'fbdev' driver which basically just works out of the box. It isn't as CPU friendly as it could be but the handling is easy. Using a special X driver is better in several ways but requires some care.
Yes, I know that this setup is not easy. For overlay_wm driver to work it is necessary to apply patches to X Server sources, as I read.
It's a driver.
But I thought it must work with linux server with ordinary l4con/dope frambuffer driver. So, it won't, ok.
It's really supposed to work with the 'fbdev' X driver. At least it does for me.
-- When I checked on real machine (not with ramdisk), l4linux started and XServer starts OK.
But again the new problem -- with udev. First time for some reason udev didn't started and /sys filesystem did not mounted (it was some problem with initial ramdisk). When I mounted /sys manually, created all device nodes manually and started udevd from the command line, almost all worked -- bluetooth, usb, etc. But when I made initrd properly, I got an error "Kernel panic: not syncing, Out of memory and no killable processes" when starting udev. I can't copy/paste the boot log as nothing were saved in logs. How can I report this problem? I can't also redirect output to file as system is not syncing (I booted with init=/bin/sh successfully and then panic message appears if launch /sbin/start_udev from the command line).
It would be helpful if you could construct an image that shows the problem and that I could have a look at. Otherwise it's hard for me to tell what's going on in your setup.
I tried to upgrade udev to the latest version (I have ver. 1.14 and the newest one is 1.30) but no effect. Also I searched for this error message in google and found a few references to mailing lists where Fedora and Slackware users mentioned similar problem -- it was not with udev itself but with latest kernels -- with 2.6.24 this were OK, but with 2.6.25 and later were these problems. So, I decided to try downgrading l4linux kernel. I tried many revisions from SVN from 2.6.27 down to 2.6.22 with no success -- Linux was either panicing or handing when starting udev.
I think downgrading Linux is not the way to fix this. Knowing the real reason would be very helpful. Did you try strace, for example?
And the last, I tried latest linux kernels (2.6.27 and 2.6.28) without L4 -- they boot successfully and don't panic when starting udev. So, I figured out that this is L4Linux-related problem as in ordinary linux kernels the problem seems to disappear.
Might very well be the case.
So, must I report this as a bug? And what additional information I can give to solve the problem I have Mandriva 2008.0 distribution installed with linux kernel 2.6.22. It is installed on notebook with 2 Gb of memory and 1.7 Ghz Pentium M (with Centrino) processor. I tried to give linux server 800 to 1800 Mb of memory and tried to play with initrd size, but with no effect. My notebook has no serial port, so I can't save the log. But I can install linux to my desktop machine, if needed. It has a serial port and serial cable.
Can you build me a little example which shows the problem? I think that would be quite helpful.
Adam
On Wed, 12 Nov 2008 23:13:33 +0100, Adam Lackorzynski wrote:
On Tue Nov 11, 2008 at 01:20:20 +1200, Valery V. Sedletski wrote:
On Wed, 5 Nov 2008 22:25:19 +0100, Adam Lackorzynski wrote:
The above message would mean that the X server is talking to the L4Linux-server, which is a no-go unless the special X driver is used.
I enabled the X server stub in the L4Linux configuration (the checkbox "Support for X Window System driver" is enabled). And the nitovlwm server is started. What could I be missing then?
This setup is not the easiest one, and it also requires quite an effort for me to get it running. I recommend using the X 'fbdev' driver, which basically just works out of the box. It isn't as CPU-friendly as it could be, but the handling is easy. Using a special X driver is better in several ways but requires some care.
Yes, I know that this setup is not easy. For the overlay_wm driver to work it is necessary to apply patches to the X server sources, as I read.
It's a driver.
But I thought it must work with the Linux server with the ordinary l4con/dope framebuffer driver. So it won't, ok.
It's really supposed to work with the 'fbdev' X driver. At least it does for me.
I meant that a special driver inside the X server is needed (for seamless integration of X with the Nitpicker desktop). I just thought that the X version in the drops-x.rd ramdisk would work with any L4Linux setup, but it appears it does not (I don't yet understand why; maybe the X stub in different L4Linux versions talks to the seamless X server driver in a different way and needs to be recompiled each time? -- backward-incompatible protocols etc.)
-- When I checked on a real machine (not with the ramdisk), l4linux started and the X server started OK.
But again there is a new problem -- with udev. The first time, for some reason udev didn't start and the /sys filesystem was not mounted (there was some problem with the initial ramdisk). When I mounted /sys manually, created all device nodes manually and started udevd from the command line, almost everything worked -- bluetooth, usb, etc. But when I made the initrd properly, I got the error "Kernel panic: not syncing, Out of memory and no killable processes" when starting udev. I can't copy/paste the boot log as nothing was saved in the logs. How can I report this problem? I also can't redirect the output to a file as the system is not syncing (I booted with init=/bin/sh successfully, and the panic message appears when I launch /sbin/start_udev from the command line).
It would be helpful if you could construct an image that shows the problem and which I could have a look at. Otherwise it's hard for me to tell what's going on in your setup.
I'll try making a bootable ISO image. But maybe my problem is not easy to reproduce on a different machine -- I installed the same Linux distribution on my desktop machine and copied the same compiled L4Linux kernel there. Now L4Linux boots successfully without a panic, but displays errors with call traces (non-fatal -- no panic, but the console hangs). The errors appear while loading the agpgart and amd64-agp modules. (This machine is an AMD Athlon 64 with a VIA chipset, 1 GB of RAM and a Matrox G400 card in the AGP slot.) When I renamed these drivers (I have not yet found where in Linux they are switched off -- the 2.6 kernel is new to me, and I need time to understand the udev configs), it loads OK without any problems.
On my notebook I have an ATI Radeon X300 Mobility PCI-express card, so loading AGP drivers makes no sense. As with the desktop machine, I supposed that on the notebook the kernel panic occurs when loading some driver (maybe video-related -- I am not sure whether it makes sense to load them in l4linux), but when I renamed all suspect drivers, there were no more errors with call traces, yet the single kernel panic message remains, without any other messages that could tell what causes it.
As I said, the notebook has no COM ports (I only have a PCMCIA serial card and a USB-to-serial converter), but the desktop has them -- when I attached a null-modem cable to the desktop (and connected from the notebook with a terminal program using the USB-to-serial converter), I got debug messages from the log server, but found nothing suspicious (I could post the logs, but there are no call traces or other error messages in them -- maybe there is a way to tell Linux to redirect error messages and call traces to the log server?)
Maybe there is some way to redirect log server messages over the network or similar, not via a COM port? Or maybe it's possible to redirect Linux messages to the log server and then, on the same machine, view these logs using l4env servers rather than Linux itself (Linux is dead but l4env remains alive, so maybe I can get some post-mortem info, like logs or a task memory dump? Or get and save a trace buffer or similar?) (Just an idea -- what about including a simple HTTP server in l4env that serves the log server messages, so they can be viewed from any web browser? Maybe something like this already exists?)
I tried to upgrade udev to the latest version (I have ver. 1.14 and the newest one is 1.30) but with no effect. I also searched for this error message on Google and found a few references to mailing lists where Fedora and Slackware users mentioned a similar problem -- it was not with udev itself but with the latest kernels -- with 2.6.24 it was OK, but with 2.6.25 and later these problems appeared. So I decided to try downgrading the l4linux kernel. I tried many revisions from SVN, from 2.6.27 down to 2.6.22, with no success -- Linux was either panicking or hanging when starting udev.
I think downgrading Linux is not the way to fix this. Knowing the real reason would be very helpful. Did you try strace, for example?
Strace? What is it? Is this a Linux or Fiasco feature? (ok, silly question, I'll try googling)
Can you build me a little example which shows the problem? I think that would be quite helpful.
I'll make a bootable ISO image and upload it somewhere. Are rapidshare or similar servers OK?
WBR, valery,
valerius @ EFnet IRC #os2russian, #osFree www.osFree.org
On Fri, 14 Nov 2008 19:59:19 +1200 (MSK), Valery V. Sedletski wrote:
On Wed, 12 Nov 2008 23:13:33 +0100, Adam Lackorzynski wrote:
I think downgrading Linux is not the way to fix this. Knowing the real reason would be very helpful. Did you try strace, for example?
Strace? What is it? Is this a Linux or Fiasco feature? (ok, silly question, I'll try googling)
I tried booting with init=/bin/sh and executing "strace /sbin/start_udev" after manually mounting /sys and /proc (udevd exits when /sys is not mounted). It prints a syscall trace, but it scrolls by quickly, and at the end of the process I get a Kernel panic message with call trace messages which fill the whole screen, and I can't see the strace output as I can't scroll the screen buffer back with Shift-PgUp because Linux is dead. I also tried to redirect the strace messages to a file, but the filesystems are mounted read-only and I can't write files to disk. Also, when the panic occurs, Linux doesn't sync files with the cache. So, I'm in a spot. :( What else can I do?
WBR, valery
On Fri Nov 14, 2008 at 19:59:19 +1200, Valery V. Sedletski wrote:
On Wed, 12 Nov 2008 23:13:33 +0100, Adam Lackorzynski wrote:
On Tue Nov 11, 2008 at 01:20:20 +1200, Valery V. Sedletski wrote:
On Wed, 5 Nov 2008 22:25:19 +0100, Adam Lackorzynski wrote:
The above message would mean that the X server is talking to the L4Linux-server, which is a no-go unless the special X driver is used.
I enabled the X server stub in the L4Linux configuration (the checkbox "Support for X Window System driver" is enabled). And the nitovlwm server is started. What could I be missing then?
This setup is not the easiest one, and it also requires quite an effort for me to get it running. I recommend using the X 'fbdev' driver, which basically just works out of the box. It isn't as CPU-friendly as it could be, but the handling is easy. Using a special X driver is better in several ways but requires some care.
Yes, I know that this setup is not easy. For the overlay_wm driver to work it is necessary to apply patches to the X server sources, as I read.
It's a driver.
But I thought it must work with the Linux server with the ordinary l4con/dope framebuffer driver. So it won't, ok.
It's really supposed to work with the 'fbdev' X driver. At least it does for me.
I meant that a special driver inside the X server is needed (for seamless integration of X with the Nitpicker desktop). I just thought that the X version in the drops-x.rd ramdisk would work with any L4Linux setup, but it appears it does not (I don't yet understand why; maybe the X stub in different L4Linux versions talks to the seamless X server driver in a different way and needs to be recompiled each time? -- backward-incompatible protocols etc.)
Unfortunately I cannot tell what might be wrong without looking deeply into this.
-- When I checked on a real machine (not with the ramdisk), l4linux started and the X server started OK.
But again there is a new problem -- with udev. The first time, for some reason udev didn't start and the /sys filesystem was not mounted (there was some problem with the initial ramdisk). When I mounted /sys manually, created all device nodes manually and started udevd from the command line, almost everything worked -- bluetooth, usb, etc. But when I made the initrd properly, I got the error "Kernel panic: not syncing, Out of memory and no killable processes" when starting udev. I can't copy/paste the boot log as nothing was saved in the logs. How can I report this problem? I also can't redirect the output to a file as the system is not syncing (I booted with init=/bin/sh successfully, and the panic message appears when I launch /sbin/start_udev from the command line).
It would be helpful if you could construct an image that shows the problem and which I could have a look at. Otherwise it's hard for me to tell what's going on in your setup.
I'll try making a bootable ISO image. But maybe my problem is not easy to reproduce on a different machine -- I installed the same Linux distribution on my desktop machine and copied the same compiled L4Linux kernel there. Now L4Linux boots successfully without a panic, but displays errors with call traces (non-fatal -- no panic, but the console hangs). The errors appear while loading the agpgart and amd64-agp modules. (This machine is an AMD Athlon 64 with a VIA chipset, 1 GB of RAM and a Matrox G400 card in the AGP slot.) When I renamed these drivers (I have not yet found where in Linux they are switched off -- the 2.6 kernel is new to me, and I need time to understand the udev configs), it loads OK without any problems.
Those low-level things like GART are not unproblematic; best avoid them if they seem to cause problems (or properly analyse what's going on, which is much harder...).
On my notebook I have an ATI Radeon X300 Mobility PCI-express card, so loading AGP drivers makes no sense. As with the desktop machine, I supposed that on the notebook the kernel panic occurs when loading some driver (maybe video-related -- I am not sure whether it makes sense to load them in l4linux), but when I renamed all suspect drivers, there were no more errors with call traces, yet the single kernel panic message remains, without any other messages that could tell what causes it.
Tried earlyprintk=1 on the Linux kernel command line?
As I said, the notebook has no COM ports (I only have a PCMCIA serial card and a USB-to-serial converter), but the desktop has them -- when I attached a null-modem cable to the desktop (and connected from the notebook with a terminal program using the USB-to-serial converter), I got debug messages from the log server, but found nothing suspicious (I could post the logs, but there are no call traces or other error messages in them -- maybe there is a way to tell Linux to redirect error messages and call traces to the log server?)
There is. In the L4Linux configuration, in the Stub-drivers sub-menu, enable 'Serial driver' and then use console=ttyLv0 on the kernel command line. This should do the trick.
Maybe there is some way to redirect log server messages over the network or similar, not via a COM port? Or maybe it's possible to redirect Linux messages to the log server and then, on the same machine, view these logs using l4env servers rather than Linux itself (Linux is dead but l4env remains alive, so maybe I can get some post-mortem info, like logs or a task memory dump? Or get and save a trace buffer or similar?) (Just an idea -- what about including a simple HTTP server in l4env that serves the log server messages, so they can be viewed from any web browser? Maybe something like this already exists?)
There's the dmon application, which is a DOpE application and a log server, i.e. it displays log messages in a window. The other ideas are nice but do not exist.
I tried to upgrade udev to the latest version (I have ver. 1.14 and the newest one is 1.30) but with no effect. I also searched for this error message on Google and found a few references to mailing lists where Fedora and Slackware users mentioned a similar problem -- it was not with udev itself but with the latest kernels -- with 2.6.24 it was OK, but with 2.6.25 and later these problems appeared. So I decided to try downgrading the l4linux kernel. I tried many revisions from SVN, from 2.6.27 down to 2.6.22, with no success -- Linux was either panicking or hanging when starting udev.
I think downgrading Linux is not the way to fix this. Knowing the real reason would be very helpful. Did you try strace, for example?
Strace? What is it? Is this a Linux or Fiasco feature? (ok, silly question, I'll try googling)
Can you build me a little example which shows the problem? I think that would be quite helpful.
I'll make a bootable ISO image and upload it somewhere. Are rapidshare or similar servers OK?
Something where wget works is ok.
On Fri Nov 14, 2008 at 22:36:31 +1200, Valery V. Sedletski wrote:
On Fri, 14 Nov 2008 19:59:19 +1200 (MSK), Valery V. Sedletski wrote:
On Wed, 12 Nov 2008 23:13:33 +0100, Adam Lackorzynski wrote:
I think downgrading Linux is not the way to fix this. Knowing the real reason would be very helpful. Did you try strace, for example?
Strace? What is it? Is this a Linux or Fiasco feature? (ok, silly question, I'll try googling)
I tried booting with init=/bin/sh and executing "strace /sbin/start_udev" after manually mounting /sys and /proc (udevd exits when /sys is not mounted). It prints a syscall trace, but it scrolls by quickly, and at the end of the process I get a Kernel panic message with call trace messages which fill the whole screen, and I can't see the strace output as I can't scroll the screen buffer back with Shift-PgUp because Linux is dead. I also tried to redirect the strace messages to a file, but the filesystems are mounted read-only and I can't write files to disk. Also, when the panic occurs, Linux doesn't sync files with the cache. So, I'm in a spot. :( What else can I do?
I'd like to see the panic message, with the backtrace etc. to be better able to judge what's going on. A classical screen-shot would also be ok. Have you considered trying QEmu to play around? This makes all those issues go away because you get the serial output into your terminal etc...
Adam
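Once the serial output has been captured to a file (by whatever means: a terminal program's log, or an emulator's serial output redirected to a file), the panic report asked for above could be cut out of it with something like the following. This is only a minimal sketch and not part of any L4 or Linux tooling: the function name extract_panic, the default of 40 context lines, and the assumption that the report begins at a line containing "Kernel panic" (as in the message quoted in this thread) are all illustrative.

```python
import re

# Assumption: the panic report starts at a line containing "Kernel panic",
# which matches the message quoted in this thread.
PANIC_RE = re.compile(r"Kernel panic")


def extract_panic(log_text, context_before=40):
    """Return the first panic line, up to `context_before` lines before it,
    and everything after it. Returns "" if no panic line is found."""
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if PANIC_RE.search(line):
            # Keep some preceding lines so the last strace/boot output is visible.
            start = max(0, i - context_before)
            return "\n".join(lines[start:])
    return ""
```

It would be called on the contents of the capture file, for example extract_panic(open("serial.log").read()), where serial.log is a hypothetical file name. Keeping the lines before the panic matters here because the last few syscalls printed by strace are likely the interesting part.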
l4-hackers@os.inf.tu-dresden.de