On Monday, 15 May 2023 01:35:26 CEST Paul Boddie wrote:
What should have been a simple rebasing of a few patches, with some even eliminated due to the integration of various changes (including some I suggested several years ago), seems to have turned into a futile exercise with really no obvious indication as to why the code no longer works. To be honest, it is all rather frustrating and disappointing, but such is the way of the world, I guess.
So, I decided to establish the behaviour of the bootstrap code, which I think I have done to a reasonable extent, and to investigate whether it is the kernel that is actually causing the problems. Taking the recent bootstrap code, fixed up so that I can be sure that it is copying modules around correctly (with ERL not set), I took some old kernel versions from the Git repository.
This did not yield a working system, but it did produce output beyond the kernel starting point. Along with the kernel from Subversion r83, I found that the following kernel versions at least started and caused sigma0, moe and ned to be invoked:
commit 1f27c9ebc9651bcd114802ccc9d250866f3742c9
Author: Adam Lackorzynski <adam@l4re.org>
Date:   Mon Mar 5 00:00:00 2018 +0000

commit 68eb142cf237292055663ba7635c2c1f2b0ecb92
Author: Jean Wolter <jean.wolter@kernkonzept.com>
Date:   Mon Dec 17 00:00:00 2018 +0000
The output they produced was the following line endlessly repeated:
L4Re[rm]: unhandled read page fault at 0x7fff8fa1 pc=0x7fff8fa0
The last kernel in Git that appears to produce a semi-functional system is this one:
commit 9b71796d7fef5644474c94d911c71be65f44783a
Author: Frank Mehnert <frank.mehnert@kernkonzept.com>
Date:   Mon May 11 00:00:00 2020 +0000
This actually produces the following output which does not repeat endlessly:
Ned says: Hi World!
L4Re[rm]: unhandled read page fault at 0x7fff8fa1 pc=0x7fff8fa0
L4Re: rom/ned: Unhandled exception: PC=0x7fff8fa0 PFA=0x7fff8fa0 LdrFlgs=0x0
After that particular commit, I see that there was some work done on the UART functionality. Eventually, it settles down with the following commit:
commit 42c88cc84e183c8234127b4c1e88dc946ca33cb0
Author: Alexander Warg <alexander.warg@kernkonzept.com>
Date:   Mon Apr 6 00:00:00 2020 +0000

    Fix lib uart related compile issues

    * MIPS ci20
    * ARM IMX6 UL
    * ARM SA1100
This commit and any subsequent commits I have tested do not produce output and appear to hang.
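For anyone wanting to narrow this down further, the search over kernel versions can be sketched as a plain bisection. This is only an illustration in Python, not something I ran: the function name and the table of observations are hypothetical, the hashes are the abbreviated ones from this message, and it assumes that once output is lost it stays lost in all later commits.

```python
def first_failing(commits, produces_output):
    """Return the first commit for which produces_output is False, or None.

    Assumes commits are ordered oldest to newest and that the result of
    produces_output never flips back to True after the first False.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if produces_output(commits[mid]):
            lo = mid + 1  # still working here, so the break is later
        else:
            hi = mid      # already broken here, so the break is here or earlier
    return commits[lo] if lo < len(commits) else None


# Abbreviated hashes from this message, oldest first, with the unmodified
# kernel's behaviour as observed (True = some console output appeared).
COMMITS = ["1f27c9eb", "68eb142c", "9b71796d", "42c88cc8", "f5bfa79f"]
OBSERVED = {
    "1f27c9eb": True,
    "68eb142c": True,
    "9b71796d": True,
    "42c88cc8": False,
    "f5bfa79f": False,
}

if __name__ == "__main__":
    print(first_failing(COMMITS, OBSERVED.__getitem__))
```

With only the commits named here, this merely points back at the "Fix lib uart related compile issues" commit. Run over the full history between the last working and first silent commits, with a real boot test in place of the lookup table, it would identify where the output actually disappears.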
I noticed that the changes to the CI20 support introduce uart_16550_dw.o. All the other boards that gained this object file previously had some kind of wrapper code, which it apparently replaces, so I wonder whether it is actually appropriate for the CI20.
I also noticed that the UART peripheral was probably not initialised in the old kernel, with the F_skip_init flag being set in the UART driver. Recent code has eliminated this behaviour, so I decided to explicitly disable initialisation myself. The result was to restore the UART output and to yield the same kind of output as earlier kernel versions, producing the Ned-related error.
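To make the effect of such a flag concrete, here is a toy model in Python. It is not the kernel's driver code: only the flag name is taken from the driver, while the bit value, the divisor numbers and the idea that reinitialisation writes a setting unsuitable for the board are assumptions made purely for illustration. I have not established why initialisation breaks the output.

```python
from dataclasses import dataclass, field

F_SKIP_INIT = 0x1  # hypothetical bit value; only the flag's name comes from the driver


@dataclass
class ToyUart:
    """Stand-in for a UART that the boot loader has already set up."""
    divisor: int = 26  # hypothetical working baud divisor left by the boot loader
    sent: list = field(default_factory=list)

    def write(self, text):
        # Model assumption: a zero divisor means nothing reaches the console.
        if self.divisor != 0:
            self.sent.append(text)


def startup(uart, flags, driver_divisor=0):
    """Reprogram the UART unless the caller asked to skip initialisation."""
    if flags & F_SKIP_INIT:
        return  # keep the boot loader's configuration untouched
    # Model assumption: the driver's own settings do not suit this board.
    uart.divisor = driver_divisor


def boot(flags):
    """Run startup on a fresh UART, emit one message, return what got out."""
    uart = ToyUart()
    startup(uart, flags)
    uart.write("Ned says: Hi World!")
    return uart.sent


if __name__ == "__main__":
    print("init skipped:", boot(F_SKIP_INIT))
    print("init performed:", boot(0))
```

In this model, skipping initialisation leaves the inherited configuration alone and the message gets out, whereas performing it silences the console, which is the pattern I observed on the CI20.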
Finally, I updated to the latest changeset in my clone of the repository:
commit f5bfa79ff15cbbaadada806d35dbe0560c892b24
Author: Frank Mehnert <frank.mehnert@kernkonzept.com>
Date:   Mon Apr 17 00:00:00 2023 +0000
With UART initialisation suppressed, I found that the Ned-related error disappeared and that my program actually ran, which was something of a surprise.
I suppose, then, the conclusion is that the CI20 UART code got inadvertently broken when that functionality was reworked. Whether the UART initialisation should be suppressed or whether the UART can be cleanly reinitialised is something to investigate further, I imagine.
Paul