Hi l4-hackers,
I am currently having some issues with dice-3.3:
First of all, the dice-generated code differs in header- and client file, meaning that the header defines functions with
int DICE_CV somefunction()
while the outcoming c-file is
int somefunction()
This leads to compile errors due to conflicting declarations...
But as soon as you remove DICE_CV, everything compiles fine.
The second problem I have regards IPC-calls:
I am using the latest otc-snapshot (2008-12-10 with l4linux-2.6.27) and have written a small IDL-interface as follows:
dummy.idl:
==
interface dummy {
    void hello(void);
}
==
The server (running under L4Linux) registers itself at the names-service as "dummy" and starts the server_loop... It implements the dummy_hello_component as follows:
==
void dummy_hello_component (CORBA_Object _dice_corba_obj,
                            CORBA_Server_Environment *_dice_corba_env)
{
    /* do nothing */
    printf("Hello\n");
}
==
Additionally, I have written 2 clients, one native L4 application and one based on L4Linux, which implement the client-stub as follows:
==
static l4_threadid_t threadId;
static CORBA_Environment env;
names_query_name("dummy", &threadId);
LOG_printf("Dummy found with threadId: %x\n",threadId);
dummy_hello_call(&threadId, &env);
==
The native client (started via bmodfs) receives a pagefault, cause it is trying to write?! to address 0:
==
dummytest| Dummy found with threadId: 11a0001
dummytest| L4RM: [PF] write at 0x00000000, ip 02000272, src 11.02
dummytest| [11.0] l4rm/lib/src/pagefault.c:81:__unknown_pf():
dummytest| unhandled page fault
--PANIC, 'g' for exit------------------------------------IP: 02005755 [dummy.rm] (11.00) jdb:
==
And the L4Linux-client crashes with a segfault:
== dummy_t[42]: segfault at 2 ip 08048312 sp bf4092ec error 6 in dummy_test[8048000+76000] ==
The strange thing is: If I compile the idl-file with dice-3.2.1, the dummy example works, BUT as soon as my interfaces get more complex, e.g., something like this:
int write ([in] long start, [in] long offset, [in] long bytes_to_write, [in, size_is(write_len), max_is(MAX_PAGE_SIZE)] char *write_buffer, [out] long *write_len);
IPC crashes with pagefaults / segfaults again even with dice-3.2.1...
My Fiasco-config is attached.
Do you have any ideas where the problem might be? Do I have to pay attention to specific configuration options, if the IPC-server is running in an L4Linux?
Thanks in advance and best regards,
Marcel Selhorst
On 18.03.2009 at 17:10, Marcel Selhorst wrote:
Hi l4-hackers,
Hi Marcel.
I am currently having some issues with dice-3.3:
First of all, the dice-generated code differs in header- and client file, meaning that the header defines functions with
int DICE_CV somefunction()
while the outcoming c-file is
int somefunction()
This leads to compile errors due to conflicting declarations...
But as soon as you remove DICE_CV, everything compiles fine.
It seems that some compilers are indeed not very happy about this. We are looking into it. For the time being please use your workaround (or maybe modify your compiler flags, if you pass additional warning-related flags).
Other comments regarding your IPC problem are below.
The second problem I have regards IPC-calls: [...] == static l4_threadid_t threadId; static CORBA_Environment env;
This is wrong. Please use the DICE_DECLARE_ENV macro to declare and properly initialize the CORBA environment:
DICE_DECLARE_ENV(env);
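To illustrate why an environment that is declared but not properly initialized can break a stub, here is a self-contained analogy in plain C. It is not DICE code: `demo_env_t`, `DEMO_DECLARE_ENV` and `demo_stub_call` are hypothetical names, and the idea that the environment carries allocator callbacks which must be set before the first call is an assumption made for illustration only. The actual layout of `CORBA_Environment` is described in the DICE user manual.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an environment type; NOT the real CORBA_Environment. */
typedef struct {
    int    major;                   /* exception code, 0 = no exception */
    void *(*alloc)(size_t size);    /* allocator the stub would use */
    void  (*release)(void *ptr);    /* matching deallocator */
} demo_env_t;

/* Hypothetical analogue of a declare-and-initialize macro. */
#define DEMO_DECLARE_ENV(name) \
    demo_env_t name = { 0, malloc, free }

/* Stub-like helper: returns -1 if the environment has no usable callbacks,
 * -2 if the allocation fails, and 0 on success. */
static int demo_stub_call(demo_env_t *env)
{
    assert(env != NULL);

    /* A zero-filled environment has NULL callbacks; calling through them
     * would crash, so this sketch checks first. */
    if (env->alloc == NULL || env->release == NULL)
        return -1;

    void *buf = env->alloc(16);
    if (buf == NULL)
        return -2;

    env->release(buf);
    return 0;
}
```

In this sketch a `static demo_env_t zeroed;` is rejected by `demo_stub_call()`, while an environment declared with `DEMO_DECLARE_ENV(ready);` is accepted. Generated stubs need not perform such a check, which is one plausible way a zero-filled environment could end in a fault.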
[...] The native client (started via bmodfs) receives a pagefault, cause it is trying to write?! to address 0: == dummytest| Dummy found with threadId: 11a0001 dummytest| L4RM: [PF] write at 0x00000000, ip 02000272, src 11.02 dummytest| [11.0] l4rm/lib/src/pagefault.c:81:__unknown_pf(): dummytest| unhandled page fault
--PANIC, 'g' for exit------------------------------------IP: 02005755 [dummy.rm] (11.00) jdb: ==
And the L4Linux-client crashes with a segfault:
== dummy_t[42]: segfault at 2 ip 08048312 sp bf4092ec error 6 in dummy_test[8048000+76000] ==
The kernel debugger is very useful to find out where exactly the program crashes. btt11.2 gives you the backtrace for thread 11.2.
The strange thing is: If I compile the idl-file with dice-3.2.1, the dummy example works, BUT as soon as my interfaces get more complex, e.g., something like this:
int write ([in] long start, [in] long offset, [in] long bytes_to_write, [in, size_is(write_len), max_is(MAX_PAGE_SIZE)] char *write_buffer, [out] long *write_len);
IPC crashes with pagefaults / segfaults again even with dice-3.2.1...
Maybe my hint from above fixes this, too. If not, more information is required (complete logs, backtraces, ...).
[...] Thanks in advance and best regards,
I hope this helps.
Carsten
On 18.03.2009 at 17:10, Marcel Selhorst wrote:
Hi l4-hackers,
Hi Marcel,
I am currently having some issues with dice-3.3:
First of all, the dice-generated code differs in header- and client file, meaning that the header defines functions with
int DICE_CV somefunction()
while the outcoming c-file is
int somefunction()
This leads to compile errors due to conflicting declarations...
I attached a patch for DICE. Could you check if it fixes this problem?
Btw.: With which flags do you call gcc (we are talking about C here, right?) and which compiler version are you using?
[...]
Carsten
Hi Carsten,
This leads to compile errors due to conflicting declarations...
I attached a patch for DICE. Could you check if it fixes this problem?
yes, thank you, the patch fixed the header problem.
The second problem I have regards IPC-calls:
static CORBA_Environment env;
This is wrong. Please use the DICE_DECLARE_ENV macro to declare and properly initialize the CORBA environment: DICE_DECLARE_ENV(env);
thanks again, this solved my pagefault :)
Now the IPCs are working, but I identified another problem, that I am not able to solve:
I have the following interface:
int register([in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name, [in] long name_len, [in] long in1, [out] long *out1, [out] long out2 );
long read ([in] long in1, [in] long in2, [in] long bytes_to_read, [out, max_is(MAX_PAGE_SIZE)] char read_buffer[] );
The server is implemented as an L4Linux-task. I have three clients:
1) Native L4-client
works :)
2) L4Linux application
works as well
3) L4Linux kernel module
fails... The kernel module registers itself during the __init phase via the "register" function and this function also works. But as soon as I try to execute a "read"-call, I get:
--Unset id on stack (c)----------------------------------IP: 00401e30 [l4lx.cpu0] (10.05) jdb: g 00000505.00000002 failed CLI: 00000010.00000005
--Unset id on stack (c)----------------------------------IP: 00401e30 [l4lx.cpu0] (10.05) jdb: g 0000050500000002 failed STI: 00000010.00000005
The strange thing is that the DICE environment is set (otherwise the names_query_name() and the register() method would fail). I even tried to put the env manually on the stack prior to calling via: { DICE_DECLARE_ENV(env); result = read(...); } But that didn't help either.
Do you have any idea, what I might be missing? I thought that this might have something to do with the buffer's memory allocation, I even tried the "prealloc_client" flag in the idl-file and I manually forced dice to use my own CORBA_alloc methods using Linux kernels memory allocation:
void *CORBA_alloc(unsigned long size) { return vmalloc(size); }
void CORBA_free(void *addr) { vfree(addr); }
Btw.: With which flags do you call gcc (we are talking about C here, right?) and which compiler version are you using?
gcc-4.1.2 gcc-params: -nostdlib -nostdinc -Wall -Werror -DL4API_l4v2 -I [alotofincludes] dice-3.3.0 dice-params: -fforce-corba-alloc -fforce-c-bindings -nostdinc -P-DL4API_l4v2 -P-I/include [...] and some more includes
Thanks again, Marcel
On 20.03.2009 at 00:58, Marcel Selhorst wrote:
Hi Carsten,
Hi!
[...] Now the IPCs are working, but I identified another problem, that I am not able to solve:
I have the following interface:
int register([in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name, [in] long name_len, [in] long in1, [out] long *out1, [out] long out2 );
long read ([in] long in1, [in] long in2, [in] long bytes_to_read, [out, max_is(MAX_PAGE_SIZE)] char read_buffer[] ); [...] 3) L4Linux kernel module
fails... The kernel module registers itself during the __init phase via the "register" function and this function also works. But as soon as I try to execute a "read"-call, I get:
--Unset id on stack (c)----------------------------------IP: 00401e30 [l4lx.cpu0] (10.05) jdb: g 00000505.00000002 failed CLI: 00000010.00000005
--Unset id on stack (c)----------------------------------IP: 00401e30 [l4lx.cpu0] (10.05) jdb: g 0000050500000002 failed STI: 00000010.00000005
The strange thing is that the DICE environment is set (otherwise the names_query_name() and the register() method would fail). I even tried to put the env manually on the stack prior to calling via: { DICE_DECLARE_ENV(env); result = read(...); } But that didn't help either.
Do you have any idea, what I might be missing?
Hard to tell without knowing any details about what you are doing in the L4Linux kernel. You cannot "just use" IDL server stubs in the L4Linux kernel, they have to run in interrupt threads (the ORe driver in drivers/net/l4ore.c might serve as an example). The error message you get might also indicate a stack overrun (maybe the generated DICE code uses too much stack ... ?).
[...]
Carsten
Hi Carsten,
- L4Linux kernel module
But as soon as I try to execute a "read"-call, I get:
--Unset id on stack (c)----------------------------------IP: 00401e30 [l4lx.cpu0] (10.05) jdb: g 00000505.00000002 failed CLI: 00000010.00000005
Hard to tell without knowing any details about what you are doing in the L4Linux kernel. You cannot "just use" IDL server stubs in the L4Linux kernel, they have to run in interrupt threads (the ORe driver in drivers/net/l4ore.c might serve as an example). The error message you get might also indicate a stack overrun (maybe the generated DICE code uses too much stack ... ?).
I think it is indeed a stack overrun. I have implemented CORBA_alloc() and CORBA_free() to force DICE to allocate the used buffers during IPC via the Linux kernel vmalloc, but it seems, that DICE is not using these methods at all:
My IDL-File looks like this:
long read (
    [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name,
    [in] unsigned long name_len,
    [in] unsigned long start,
    [in] unsigned long bytes_to_read,
    [out, prealloc_client, max_is(MAX_PAGE_SIZE)] unsigned char read_buffer[] );
long write (
    [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name,
    [in] unsigned long name_len,
    [in] unsigned long start,
    [in] unsigned long bytes_to_write,
    [in, prealloc_client, max_is(MAX_PAGE_SIZE)] unsigned char write_buffer[] );
My according CORBA_alloc/free implementation looks like this:
#include <linux/vmalloc.h>
void *CORBA_alloc(unsigned long size) { return vmalloc(size); }
void CORBA_free(void *addr) { vfree(addr); }
But the generated code doesn't even include a call for CORBA_free, instead it generates:
[...]
struct {
    l4_fpage_t _dice_rcv_fpage;
    l4_msgdope_t _dice_size_dope;
    l4_msgdope_t _dice_send_dope;
    long _dice_opcode;
    unsigned long name_len;
    unsigned long start;
    unsigned long bytes_to_write;
    unsigned char write_buffer[4096];
    char name[100];
} dummy_write_in;
[...]
What am I missing? ;)
Thanks, Marcel
On 26.03.2009 at 12:33, Marcel Selhorst wrote:
Hi Carsten,
Hello Marcel,
[...] The error message you get might also indicate a stack overrun (maybe the generated DICE code uses too much stack ... ?).
I think it is indeed a stack overrun. I have implemented CORBA_alloc() and CORBA_free() to force DICE to allocate the used buffers during IPC via the Linux kernel vmalloc, but it seems, that DICE is not using these methods at all:
My IDL-File looks like this: long read ( [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name, [in] unsigned long name_len, [in] unsigned long start, [in] unsigned long bytes_to_read, [out, prealloc_client, max_is(MAX_PAGE_SIZE)] unsigned char read_buffer[] );
long write ( [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name, [in] unsigned long name_len, [in] unsigned long start, [in] unsigned long bytes_to_write, [in, prealloc_client, max_is(MAX_PAGE_SIZE)] unsigned char write_buffer[] );
My according CORBA_alloc/free implementation looks like this:
#include <linux/vmalloc.h>
void *CORBA_alloc(unsigned long size) { return vmalloc(size); }
void CORBA_free(void *addr) { vfree(addr); }
But the generated code doesn't even include a call for CORBA_free, instead it generates:
[...] struct { l4_fpage_t _dice_rcv_fpage; l4_msgdope_t _dice_size_dope; l4_msgdope_t _dice_send_dope; long _dice_opcode; unsigned long name_len; unsigned long start; unsigned long bytes_to_write; unsigned char write_buffer[4096]; char name[100]; } dummy_write_in; [...]
Whoops, a message buffer with an array of 4096 bytes will indeed not fit onto a Linux kernel stack.
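The size argument can be checked with a few lines of plain C. The sketch below mirrors the quoted message buffer with stand-in types: the `l4_*` members are replaced by `unsigned long`, which is an assumption, and `demo_write_in_t` and `demo_fits_on_stack` are hypothetical names. The 4 KiB and 8 KiB figures are the kernel stack sizes commonly configured on 32-bit x86 Linux of that era; what a given L4Linux build uses is not stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the generated message buffer. The l4_* types are replaced
 * by unsigned long here (an assumption for illustration). */
typedef struct {
    unsigned long rcv_fpage;
    unsigned long size_dope;
    unsigned long send_dope;
    long          opcode;
    unsigned long name_len;
    unsigned long start;
    unsigned long bytes_to_write;
    unsigned char write_buffer[4096];
    char          name[100];
} demo_write_in_t;

/* Kernel stack sizes commonly found on 32-bit x86 Linux of that era. */
enum { DEMO_STACK_4K = 4096, DEMO_STACK_8K = 8192 };

/* Returns 1 if an object of object_size bytes fits on a stack of
 * stack_size bytes while leaving `reserve` bytes for everything else. */
static int demo_fits_on_stack(size_t object_size, size_t stack_size,
                              size_t reserve)
{
    assert(reserve <= stack_size);
    return object_size + reserve <= stack_size;
}
```

The 4096-byte array alone already fills a 4 KiB stack, so the whole structure cannot fit there. On an 8 KiB stack it fits only if less than half of the stack is needed for all other frames.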
What am I missing? ;)
The magic keywords here are 'prealloc_server' and 'ref'. They are described in the DICE user manual. I modified your interface as follows:
/* guessed from your generated code */
#define MAX_NAME_LEN 32
#define MAX_PAGE_SIZE 4096

interface dummy {
    long read (
        [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name,
        [in] unsigned long name_len,
        [in] unsigned long start,
        [in] unsigned long bytes_to_read,
        [out, ref, prealloc_server, prealloc_client, size_is(MAX_PAGE_SIZE), max_is(MAX_PAGE_SIZE)] unsigned char **read_buffer );

    long write (
        [in, size_is(name_len), max_is(MAX_NAME_LEN)] char *name,
        [in] unsigned long name_len,
        [in] unsigned long start,
        [in] unsigned long bytes_to_write,
        [in, ref, prealloc_client, size_is(MAX_PAGE_SIZE), max_is(MAX_PAGE_SIZE)] unsigned char *write_buffer );
};
This instructs DICE to generate code that uses indirect string IPC. This has the desired advantage that the IPC message buffer gets much smaller and your buffers are instead allocated using CORBA_alloc(). The downside is that it might actually be slower for small buffers (e.g., 4096 bytes as you seem to be transferring) due to additional mappings being used by Fiasco. Also, the receive buffer in the server used for your write() function is newly allocated and then freed for each call; doing some manual caching in the CORBA_{alloc,free}() functions might make this a non-issue. The function signatures also changed a bit (client and server side).
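The "manual caching" idea could look roughly like the following single-slot cache. This is a user-space sketch built on `malloc()`/`free()` so that it can be compiled anywhere; `demo_alloc`, `demo_free`, `DEMO_CACHE_BYTES` and `demo_backend_allocs` are hypothetical names and not part of DICE. In the kernel module the backend calls would presumably be `vmalloc()`/`vfree()`, and locking would be needed because this version is not safe against concurrent callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_CACHE_BYTES 4096       /* assumed upper bound of typical requests */

static void    *demo_cached_buf;    /* one parked buffer, or NULL */
static unsigned demo_backend_allocs; /* counts real backend allocations */

/* Hand out the parked buffer if it is large enough, else allocate.
 * Every backend allocation is at least DEMO_CACHE_BYTES, so a parked
 * buffer can always serve a request of up to that size. */
void *demo_alloc(size_t size)
{
    assert(size > 0);

    if (size <= DEMO_CACHE_BYTES && demo_cached_buf != NULL) {
        void *buf = demo_cached_buf;
        demo_cached_buf = NULL;
        return buf;
    }

    demo_backend_allocs++;
    return malloc(size < DEMO_CACHE_BYTES ? DEMO_CACHE_BYTES : size);
}

/* Park the buffer if the slot is empty, otherwise really free it. */
void demo_free(void *ptr)
{
    if (ptr == NULL)
        return;

    if (demo_cached_buf == NULL) {
        demo_cached_buf = ptr;
        return;
    }

    free(ptr);
}
```

With this scheme an allocate/free/allocate sequence for buffers of up to 4096 bytes reaches the backend allocator only once. The parked buffer stays allocated until it is handed out again.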
Carsten
Hi Carsten,
Whoops, a message buffer with an array of 4096 bytes will indeed not fit onto a Linux kernel stack. The magic keywords here are 'prealloc_server' and 'ref'.
yep, the new interface solved my problem, thank you! ;) Now everything works as expected, although the performance hit is pretty big (about 40% less throughput). Looks like I have to improve my code now ;-)
Another idea was to use shared memory or shared dataspaces. Can you point me to a good example?
Thanks! Marcel