multiple servers

Wed Jan 27 04:39:04 CET 1999

"Paul Phillips" <pphillips at ivue.com> writes:

> I believe I have found a couple of bugs in Fiasco which were causing the
> multiple server problem, and probably some other problems.  In
> /fiasco/src/time.h :
> the code for set( microseconds):

From the first impression, your fix looks right.  Thanks for tracking
this down!

I still wonder why I haven't ever seen this problem...  I think I
tried booting L4Linux and `hello' concurrently, but maybe the order in
which they're started makes a difference.

I almost can't believe that I wrote that routine... :-) It must have
been late at night.  I probably didn't invest much thought into it
because I was thinking I was going to rewrite it soon.  The timeout
handling also suffers from at least two other problems: It disables
interrupts while traversing a linear linked list of unknown length,
that is, potentially for an unacceptably long time; and it suffers
from the ``page fault may re-enable interrupts'' problem because it
potentially touches a TCB in a 4-MB kernel region for which a
reference to its shared page table hasn't yet been inserted into the
current task's page directory (this is done ``on demand'' in the page
fault handler).  This is just one more example of why doing kernel
synchronization by disabling interrupts is a bad idea.

[ from a different email ]
> >Why did you relocate it to 0x2400000?  It should be able to run (even
> >concurrently with L4Linux) at the memory location it is linked to
> >using the distributed version of l4/server/hello/Makefile, 0x200000.

> I thought perhaps GRUB was trying to load both 'hello' and L4Linux at the
> same physical address and that was the reasoning behind my relocating it. It
> makes no difference.

GRUB never tries to load boot modules to conflicting memory
addresses; the modules (in this case, ELF binaries) are not
interpreted at all---GRUB just places them linearly behind what has
been loaded as the ``kernel,'' which in our case is Rmgr, the resource
and boot manager.

Once it has been started by GRUB, Rmgr interprets the ELF binaries and 
loads them to their destination addresses.  Rmgr makes sure these
addresses don't conflict---otherwise you will see an ugly (and not
very informative, I'm afraid) assertion failure.
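
To illustrate the kind of check meant here, the following is a minimal
sketch of an overlap test on destination address ranges.  It is not
Rmgr's actual code: the names LoadRange, overlaps and any_conflict are
hypothetical, and the ranges are assumed to be half-open [start, end).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: destination range of one boot module, half-open [start, end).
struct LoadRange {
    std::uint32_t start;
    std::uint32_t end;
};

// Two half-open ranges overlap iff each one starts before the other ends.
inline bool overlaps(const LoadRange &a, const LoadRange &b) {
    return a.start < b.end && b.start < a.end;
}

// Returns true if any two modules would be loaded to overlapping addresses.
inline bool any_conflict(const std::vector<LoadRange> &mods) {
    for (std::size_t i = 0; i < mods.size(); ++i)
        for (std::size_t j = i + 1; j < mods.size(); ++j)
            if (overlaps(mods[i], mods[j]))
                return true;
    return false;
}
```

With this sketch, two modules linked to 0x200000 and 0x2400000 only
conflict if the first one is large enough to reach into the second;
the module sizes are not stated in this thread, so none are assumed
here.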

Thanks again!

Michael

[ code included for reference so that the bug tracking system sees it, too ]

> inline void
> timeout_t::set(unsigned long long abs_microsec)
> {
>   // XXX uses cli/sti
> 
>   _wakeup = abs_microsec;
>   _flags.set = 1;
> 
>   unsigned flags = get_eflags();
>   cli();
> 
>   if (!timer::first_timeout)
>     {
>       timer::first_timeout = this;
>       _prev = _next = 0;
>     }
>   else
>     {
>       timeout_t *ti = timer::first_timeout;
> 
>       for (;;)
>         {
>           if (ti->_wakeup >= _wakeup)
>             {
>               // insert before ti
>               _next = ti;
>               _prev = ti->_prev;
>               ti->_prev = this;
>               if (_prev)
>                 _prev->_next = this;
> /***** here you need to set timer::first_timeout since you just replaced it
> */
>               else
>                 timer::first_timeout = this;
> /**********/
>               goto done;
>             }
> 
>           if (ti->_next)
>             {
>               ti = ti->_next;
>               continue;
>             }
> /**** here you need to put a break in case ti->_wakeup < _wakeup
>        and ti->_next == 0.
>        Otherwise you loop here with interrupts disabled forever */
>           else
>             break;
> /********************/
>         }
> 
>       // insert as last item after ti
>       ti->_next = this;
>       _prev = ti;
>       _next = 0;
>     }
> 
> done:
>   set_eflags(flags);
> }
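
For experimenting outside the kernel, here is a minimal user-space
sketch of the same sorted insertion with both fixes applied.  It is
not the Fiasco code: Timeout and TimeoutList are hypothetical
stand-ins for timeout_t and timer, and the cli/sti protection is left
out, so it is only meaningful single-threaded.

```cpp
#include <vector>

// Hypothetical stand-in for timeout_t (no flags, no interrupt handling).
struct Timeout {
    unsigned long long wakeup = 0;
    Timeout *prev = nullptr;
    Timeout *next = nullptr;
};

// Hypothetical stand-in for the list headed by timer::first_timeout.
struct TimeoutList {
    Timeout *first = nullptr;

    // Insert t so that the list stays sorted by ascending wakeup time.
    void insert(Timeout *t, unsigned long long abs_microsec) {
        t->wakeup = abs_microsec;
        t->prev = t->next = nullptr;

        if (!first) {
            first = t;
            return;
        }

        Timeout *ti = first;
        for (;;) {
            if (ti->wakeup >= t->wakeup) {
                // insert before ti
                t->next = ti;
                t->prev = ti->prev;
                ti->prev = t;
                if (t->prev)
                    t->prev->next = t;
                else
                    first = t;      // fix 1: t is the new head of the list
                return;
            }
            if (!ti->next)
                break;              // fix 2: reached the tail, stop looping
            ti = ti->next;
        }

        // insert as last item after ti
        ti->next = t;
        t->prev = ti;
    }

    // Collect the wakeup times in list order, for inspection.
    std::vector<unsigned long long> wakeups() const {
        std::vector<unsigned long long> out;
        for (const Timeout *ti = first; ti; ti = ti->next)
            out.push_back(ti->wakeup);
        return out;
    }
};
```

Inserting a timeout that is earlier than the current head exercises
the first fix, and inserting one that is later than every queued
timeout exercises the second.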

-- 
hohmuth at innocent.com, hohmuth at sax.de
http://www.sax.de/~hohmuth/