Thank you for your prompt answers to my earlier issue. I don't know where I saw the underscore :).
Now I'm having another issue: my program gets a page fault because it is trying to access page 0.
It's a simple test program that creates an object whose constructor starts a thread that uses FLIPS for networking. In the Makefile I'm relocating the program, so I don't think the problem comes from there.
This is my log:
#######################################################
Welcome to Fiasco(ia32)!
L4(v2)/x86 microkernel (C) 1998-2005 TU Dresden
Rev: Wed Feb 2 12:05:33 2005 compiled with gcc 2.95 for Intel Pentium 4
Performance-critical config option(s) detected:
CONFIG_NDEBUG is off
CONFIG_NO_FRAME_PTR is off
Found VMware: Using (normal) fully nested PIC mode
Using the PIT (i8254) on IRQ 0 for scheduling
Absolute KIP Syscalls using: Sysenter
CPU: GenuineIntel (F:2:8:9) Model: Pentium 4 (Northwood/Prestonia) at 2793 MHz
128 Entry I TLB (4K or 4M pages)
64 Entry D TLB (4k or 4M pages)
12K µ-ops T Cache (8-way associative)
8 KB L1 D Cache (4-way associative, 64 bytes per line)
512 KB L2 U Cache (8-way associative, 64 bytes per line)
Freeing init code/data: 20480 bytes (5 pages)
SIGMA0: Hello!
Found Fiasco: KIP syscalls: yes.
Allocated 116kB for maintenance structures.
RMGR: Stage2 running on Fiasco
bootloader loaded 7 modules at 02073000-02889a65
total RAM size = 195134 KB (reported by bootloader)
received 178096 KB RAM from sigma0
828 KB reserved for RMGR
received no I/O ports
attached irqs = [ <!0> 1 <!2> 3 <!4> 5 6 7 8 9 a b c d e f ]
RMGR: Starting tasks.
#05: loading "(nd)/names"
     from 02073000-0207c174 to [ 00200000-00206f60 00207000-00210000 ]
     starting at entry 00200000 via trampoline page code 0006d150
#06: loading "(nd)/dm_phys"
     from 0207d000-02094174 to [ 01500000-015143a0 01515000-0151e000 ]
     starting at entry 01500000 via trampoline page code 0006e154
#07: loading "(nd)/l4io"
     from 02095000-0225e1a6 to [ 00b70000-00b92cf6 00b93000-00bc6000 ]
     starting at entry 00b70000 via trampoline page code 0006f150
     symbols at 0af55000-0af8c000 (220kB), lines at 0af3e000-0af55000 (92kB)
#08: loading "(nd)/name_server"
     from 0225f000-0230478a to [ 0181c000-01838fa3 01839000-01863e40 ]
     starting at entry 0181c000 via trampoline page code 00070158
     symbols at 0af38000-0af3e000 (24kB), lines at 0af23000-0af38000 (84kB)
#09: loading "(nd)/flips"
     from 02305000-027155c4 to [ 01900000-01981cd6 01982000-019a4000 ]
     starting at entry 01900000 via trampoline page code 00071150
     symbols at 0af15000-0af23000 (56kB), lines at 0aeb5000-0af15000 (384kB)
#0a: loading "(nd)/mini_ifconfig"
     from 02716000-027c3ed0 to [ 01a80000-01a9b6d4 01a9c000-01ac48c0 ]
     starting at entry 01a80000 via trampoline page code 00072158
     symbols at 0aeb0000-0aeb5000 (20kB), lines at 0ae9c000-0aeb0000 (80kB)
#0b: loading "(nd)/failuredetector"
     from 027c4000-02889a64 to [ 01d00000-01d1ca74 01d1d000-01d45780 ]
     starting at entry 01d00000 via trampoline page code 0007315c
     symbols at 0ae97000-0ae9c000 (20kB), lines at 0ae82000-0ae97000 (84kB)
names | Fiasco detected, registering thread names at kernel
*minifcfg| main(): ifconfig
*minifcfg| main(): usage:
*minifcfg| ifconfig <ifname> <inaddr> <inmask>
*minifcfg| main(): setting default value: lo 127.0.0.1 255.0.0.0
*io | OSKit support: using 1024KB at 0x00180000 as heap
*flips-0 | OSKit support: using 1024KB at 0x00180000 as heap
*flips-0 | Starting FLIPS server
*io | PCI: Using configuration type 1
*io | PCI: Probing PCI hardware
*io | PCI: Probing PCI hardware (bus 00)
*io | PCI: Cannot allocate resource region 4 of device 00:07.1
*io | Limiting direct PCI/PCI transfers.
*io | 00000000-ffffffff : PCI mem
*io | f4000000-f7ffffff : Intel Corp. 440BX/ZX/DX - 82443BX/ZX/DX Host br
*io : idge
*io | f8000000-f800001f : BusLogic BT-946C (BA80C30) [MultiMaster 10]
*io | f9000000-f9ffffff : PCI device 15ad:0405 (VMWare Inc)
*io | fa000000-faffffff : PCI device 15ad:0405 (VMWare Inc)
*io | 0000-ffff : PCI IO
*io | 0cf8-0cff : PCI conf1
*io | 1000-103f : Intel Corp. 82371AB/EB/MB PIIX4 ACPI
*io | 1040-105f : Intel Corp. 82371AB/EB/MB PIIX4 ACPI
*io | 1060-107f : BusLogic BT-946C (BA80C30) [MultiMaster 10]
*io | 1080-10ff : Advanced Micro Devices [AMD] 79c970 [PCnet32 LANCE]
*io | 1400-140f : PCI device 15ad:0405 (VMWare Inc)
*io | 1410-141f : Intel Corp. 82371AB/EB/MB PIIX4 IDE
*io | omega0/server/src/irq_threads.c:522:attach_irqs():
*io | available irqs=[ <!0> 1 <!2> 3 <!4> 5 6 7 8 9 a b c d e f ]
*flips-0 | l4dde_mm_init(): Using ...
*flips-0 | 640 kB at 0x00280000 (vmem)
*flips-0 | 1024 kB in 1 regions (kmem)
*flips-0 | Initializing RT netlink socket
*flips-0 | lo: I'm up now.
*flips-0 | NET4: Linux TCP/IP 1.0 for NET4.0
*flips-0 | IP Protocols: ICMP, UDP, TCP
*flips-0 | IP: routing cache hash table of 128 buckets, 3Kbytes
*flips-0 | si_meminfo(): called (1)
*flips-0 | TCP: Hash tables configured (established 128 bind 170)
*flips-0 | CORBA_alloc(): size=1024, 1 pointers used
*flips-0 | CORBA_alloc(): got block at 0x00180004
*flips-0 | CORBA_alloc(): size=1024, 2 pointers used
*flips-0 | CORBA_alloc(): got block at 0x0018040c
*flips-0 | flips_session_thread(): Flips(session_thread): new thread_id 9.8
*flips-0 |
*flips-0 | CORBA_alloc(): size=1024, 3 pointers used
*flips-0 | CORBA_alloc(): got block at 0x00180814
*flips-0 | CORBA_alloc(): size=1024, 4 pointers used
*flips-0 | CORBA_alloc(): got block at 0x00180c1c
*minifcfg| print_flags(): FLAGS for lo: LOOPBACK
*minifcfg| print_flags(): FLAGS for lo: UP LOOPBACK RUNNING
*minifcfg| print_flags(): FLAGS for lo: LOOPBACK
*minifcfg| print_flags(): FLAGS for lo: UP LOOPBACK RUNNING
*minifcfg| ifconfig(): lo: inet 127.0.0.1
*minifcfg|
*minifcfg| ifconfig(): lo: netmask 255.0.0.0
*minifcfg| main(): register at names
*minifcfg| main(): eternal sleep
*flips-0 | flips_session_thread(): Flips(session_thread): new thread_id 9.9
*flips-0 |
*flips-0 | CORBA_alloc(): size=1024, 5 pointers used
*flips-0 | CORBA_alloc(): got block at 0x00181024
*flips-0 | CORBA_alloc(): size=1024, 6 pointers used
*flips-0 | CORBA_alloc(): got block at 0x0018142c
*failured| L4RM: [PF] write at 0x00000000, eip 01d00249, src B.03
*failured| [B.0] l4rm/lib/src/pagefault.c:78:__unknown_pf():
*failured| unhandled page fault
--------------------------------------------------------EIP: 01d0c765
PANIC
#######################################################
here is my code:
#include "WOO_vector.h"
#include "FailureDetector.h"
#include <l4/lock/lock.h>
#include <l4/names/libnames.h>

int main(){
    vector<int> grupo;
    l4lock_t lock = L4LOCK_UNLOCKED;

    grupo.push_back(1);
    grupo.push_back(2);
    grupo.push_back(3);

    FailureDetector fd(1, 500000, &grupo, &lock);
}
How can I avoid this problem?
Thank you,
Tiago
On Tuesday 15 March 2005 17:08, Tiago Jorge wrote:
Thank you for your prompt answers to my earlier issue. I don't know where I saw the underscore :).
:-)
Now I'm having another issue: my program gets a page fault because it is trying to access page 0. It's a simple test program that creates an object whose constructor starts a thread that uses FLIPS for networking. In the Makefile I'm relocating the program, so I don't think the problem comes from there.
This is my log:
If you add the log server to your menu.lst, the ``*'' disappears.
I have no idea, but you could debug this issue yourself. At this point you are in the Fiasco kernel debugger. Type
utb<SPACE>01d00249
(this means: disassemble at task/address space 0xb at address 01d00249).
Then look at the output. Since you have loaded the symbols and lines, you should see the faulting source line in the disassembly output. Just scroll the output up a few lines with the Up-Arrow key (or Page-Up).
If you don't see the error, just post the output of the above command.
Frank
As you suggested, I've disassembled the program. The output around the error is this:
/home/tiago/l4/pkg/failuredetector/server/src/FailureDetector.cc:30
01d00244  add   $0xc,%esp
01d00247  push  $0x0
01d00249  push  $0x2
01d0024b  push  $0x2
01d0024d  call  0x1d01300 <socket>
01d00252  add   $0x10,%esp
01d00255  test  %eax,%eax
01d00257  mov   %eax,(%ebx)    <--- ERROR HERE!!!!!!
01d00259  js    0x1d00344
/home/tiago/l4/pkg/failuredetector/server/src/FailureDetector.cc:38
The code is this (I'll show the sequence of execution):
The program starts and invokes a function that starts an L4 thread using the short version. The failing lines are in the first function called by the thread (lines 30 to 38):
int flags=0;

printf("TOU ANTES DO SOCKET!!!!!!!\n");
/* Create socket from which to send */
if ((sock_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
    perror("open error on socket");
    exit(1);
    printf("ERRROOOOOOOOO!!!!!!!\n");
}
I thought it came from perror, so I commented out this block, but the same error appeared in the following lines.
Must I relocate the program's internal thread? If so, how can I do it?
thanks
Tiago
On Wednesday 16 March 2005 12:34, Tiago Jorge wrote:
As you suggested, I've disassembled the program. The output around the error is this:
/home/tiago/l4/pkg/failuredetector/server/src/FailureDetector.cc:30
01d00244  add   $0xc,%esp
01d00247  push  $0x0
01d00249  push  $0x2
01d0024b  push  $0x2
01d0024d  call  0x1d01300 <socket>
01d00252  add   $0x10,%esp
01d00255  test  %eax,%eax
01d00257  mov   %eax,(%ebx)    <--- ERROR HERE!!!!!!
01d00259  js    0x1d00344
/home/tiago/l4/pkg/failuredetector/server/src/FailureDetector.cc:38
That means that the return value of the socket() call is stored somewhere (in sock_fd). What is sock_fd: an object variable, a local variable or a global variable? If it is a global variable, I assume that the object was not constructed. Perhaps sock_fd is part of a static object?
Please post a little bit more, at least the definition of sock_fd or even the whole file.
Frank
That's true. sock_fd is an object variable. The definition of the class is the following:
#ifndef FAILUREDETECTOR_H
#define FAILUREDETECTOR_H

#include "WOO_vector.h"
#include "WOO_map.h"
#include <string>
#include <stdlib.h>
#include <string.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <netdb.h>
#include <stdio.h>

//TODO: stop using pthread and switch to l4thread
#include <l4/thread/thread.h>
#include <l4/lock/lock.h>

#define PORTNUMBER 45555

#define NULL 0

using namespace std;

struct heartbeat{
    int eid;
};

class FailureDetector{

private:
    int sock_fd;    // <<---- HERE IS SOCK_FD
    socklen_t addrLength_fd;
    struct sockaddr_in destinationAddr_fd;
    struct sockaddr_in myAddr_fd;
    struct sockaddr_in responseAddr_fd;
    int trueFlag_fd;
    map<int,heartbeat> suspected;

    l4lock_t lock_suspected;
    l4lock_t lock_stop;

    bool stop_flag;

    l4thread_t fdect;

    //params
    int eid;
    long sleep_time;
    vector<int> *group;

    l4lock_t *group_lock;

public:

private:
    void initialize_comm_fd();
public:
    //Constructor
    //TODO: l4 lock
    FailureDetector(int new_eid, int new_sleep_time, vector<int> *new_group, l4lock_t *new_group_lock);

    void dump_suspicious();
    map<int, heartbeat> get_Suspected();
    bool is_Suspected(int eid);
    bool to_stop();
    void stop();
    void failureDetector();
};

void helper_failure_detector(void *args);

#endif // FAILUREDETECTOR_H
thanks for the patience
Tiago
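Given the class definition above, sock_fd is the first data member and the class declares no virtual functions, so on common ABIs it would sit at offset 0 of the object. A write fault at address 0x00000000 on `mov %eax,(%ebx)` would then be consistent with the object pointer itself being null. A minimal sketch of that arithmetic, using a hypothetical stand-in struct `Detector` and hypothetical helpers `store_address` and `check_null_case` (none of them appear in the posted code):

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // std::uintptr_t

// Hypothetical stand-in for the first two members of FailureDetector.
struct Detector {
    int sock_fd;       // first data member, so it sits at offset 0
    int trueFlag_fd;
};

// Address that a store to obj->sock_fd would touch, computed without
// dereferencing obj (dereferencing a null pointer is undefined behaviour).
std::uintptr_t store_address(const Detector *obj) {
    return reinterpret_cast<std::uintptr_t>(obj) + offsetof(Detector, sock_fd);
}

// With a null object pointer the store would land on address 0
// (assuming the usual all-zero representation of a null pointer).
void check_null_case() {
    assert(store_address(nullptr) == 0);
}
```

If the faulting address had been a small non-zero value instead, the same reasoning would point at a member further down the class.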
l4-hackers mailing list l4-hackers@os.inf.tu-dresden.de http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
On Wednesday 16 March 2005 14:41, Tiago Jorge wrote:
That's true. sock_fd is an object variable. The definition of the class is the following:
Where is the related object initialized? Add a printf()-statement to the constructor of FailureDetector and see if the output of that statement occurs _before_ the error.
Frank
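The printf() check suggested above can be extended to print the object pointer as well, once in the constructor and once in the function the thread starts in, so the two values can be compared in the log. A minimal sketch with entirely hypothetical names (`Traced`, `traced_entry`, `entry_saw`); the entry function is called directly here, where a real program would hand it to its thread API:

```cpp
#include <cassert>
#include <cstdio>

// All names here are hypothetical; this is not the code from the thread.
static void *last_arg = nullptr;    // what the entry point last received

// Stand-in for a thread entry point: log the raw argument before using it.
void traced_entry(void *args) {
    last_arg = args;
    std::printf("entry: args=%p\n", args);
}

struct Traced {
    // pass_this selects between the two call styles being compared.
    explicit Traced(bool pass_this) {
        std::printf("ctor:  this=%p\n", static_cast<void *>(this));
        traced_entry(pass_this ? static_cast<void *>(this) : nullptr);
    }
};

// True when the entry point saw the same object the constructor ran on.
bool entry_saw(const Traced &obj) {
    return last_arg == static_cast<const void *>(&obj);
}

void demo() {
    Traced ok(true);
    assert(entry_saw(ok));      // pointers match
    Traced bad(false);
    assert(!entry_saw(bad));    // entry point received a null pointer
}
```

If the two printed pointers differ, the object was constructed correctly but the thread is working on something else.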
OK... I've added some debug output and you are right: the error is at the call to the socket function. The constructor is fine. Here is the code of the constructor:
FailureDetector::FailureDetector(int new_eid, int new_sleep_time,
                                 vector<int> *new_group,
                                 //pthread_mutex_t *new_group_lock){
                                 l4lock_t *new_group_lock){
    eid = new_eid;
    sleep_time = new_sleep_time;
    group = new_group;
    group_lock = new_group_lock;
    trueFlag_fd = 0x1;
    stop_flag = false;
    sock_fd=0;

    LOG("Before start thread\n");
    fdect = l4thread_create(helper_failure_detector, NULL, L4THREAD_CREATE_ASYNC);
    LOG("After start thread\n");
}
It runs fine...
Then it executes this code:
void FailureDetector::failureDetector(){
    heartbeat hb;
    bool recvfrom_error = false;
    map<int,heartbeat> received;
    map<int,heartbeat> pre_suspicious;

    /*initialization procedures*/
    //pthread_mutex_init(&lock_suspected, NULL);
    //pthread_mutex_init(&lock_stop, NULL);
    LOG("before initialize_comm_fd()\n");
    initialize_comm_fd();

[...]
and initialize_comm_fd() crashes. Here is the code:
void FailureDetector::initialize_comm_fd(){
    int flags=0, s=0;

    LOG("before socket\n");
    /* Create socket from which to send */
    if ((sock_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {    // <--- ERROR!!!!!
        LOG("socket error\n");
    }
    LOG("after socket\n");
[...]
You can see the s variable there. If I use s instead, the call works because it's a local variable, but then the code breaks on the next line. sock_fd and the variable used on the next line are private attributes of the class, as you can see here:
[...]
extern "C"{
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <fcntl.h>
#include <netdb.h>
#include <l4/lock/lock.h>
#include <l4/names/libnames.h>
#include <l4/log/l4log.h>
}

#define PORTNUMBER 45555

#define NULL 0

using namespace std;

struct heartbeat{
    int eid;
};

class FailureDetector{

private:
    int sock_fd;    // <--- here it is...
    socklen_t addrLength_fd;
    struct sockaddr_in destinationAddr_fd;
    struct sockaddr_in myAddr_fd;
    struct sockaddr_in responseAddr_fd;
    int trueFlag_fd;
    map<int,heartbeat> suspected;
[...]
So... what's wrong? :)
thanks
Tiago
On Wednesday 16 March 2005 16:33, Tiago Jorge wrote:
then he executes this code:
How is FailureDetector::failureDetector() called?
Frank
OK... with your tip I've located the problem. The problem is that Java is too easy and gives us all the errors and stuff, and C++ doesn't :):) (oh... and I'm dumb).
failureDetector() is a member function, so to start a thread I use this helper function:
void helper_failure_detector(void *args){
    FailureDetector *fdobj = (FailureDetector *)args;
    fdobj->failureDetector();
}
and I start the thread like this (as in the example):
-->fdect = l4thread_create(helper_failure_detector, NULL, L4THREAD_CREATE_ASYNC);
But since the signature differs from the pthread syntax, I screwed up. It should be:
-->fdect = l4thread_create(helper_failure_detector, (void *)this, L4THREAD_CREATE_ASYNC);
So it was doing NULL->failureDetector()... I don't know how that was allowed, but... divine intervention, I think...
In conclusion... I'm dumb :)
Thanks for all your patience, and sorry...
Tiago
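The fix described above follows a common pattern for C-style thread APIs that take a plain function pointer and one `void *` argument: pass `this` as the argument and cast it back in a helper. A minimal sketch, assuming a hypothetical `spawn()` that stands in for `l4thread_create()` and simply runs the function synchronously so the example needs no thread library; the `Detector` class, the placeholder `run()` and the null check are illustrative additions, not part of the posted code:

```cpp
#include <cassert>

// Hypothetical stand-in for a C-style thread API: a plain function pointer
// plus one opaque argument. It calls the function synchronously.
typedef void (*thread_fn)(void *);
void spawn(thread_fn fn, void *arg) { fn(arg); }

class Detector {
public:
    Detector() : sock_fd(-1) {
        // Pass `this` as the opaque argument; passing NULL here is the
        // mistake described above.
        spawn(&Detector::trampoline, this);
    }

    // Static member: it has no `this`, so it matches thread_fn.
    static void trampoline(void *arg) {
        Detector *self = static_cast<Detector *>(arg);
        if (self == nullptr) return;   // refuse a null object instead of faulting
        self->run();
    }

    int fd() const { return sock_fd; }

private:
    void run() { sock_fd = 3; }        // placeholder for the real socket() call
    int sock_fd;
};

void demo() {
    Detector d;                        // constructor "starts the thread"
    assert(d.fd() == 3);               // run() reached the real object
    Detector::trampoline(nullptr);     // null argument is rejected, no fault
}
```

As for why NULL->failureDetector() got as far as it did: calling a non-virtual member function through a null pointer is undefined behaviour in C++, but compilers typically emit an ordinary call with `this` set to 0, so nothing faults until the first member is read or written.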