Hi Paul,
I believe I have an idea of what you are observing.
00 00 00 00 ce e3 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
opcode      ???         offset                  hot_spot
On a 32bit architecture, the opcode is 32bit, the ??? is 32bit padding before the offset, which is - also on a 32bit architecture - a 64bit value and needs to be aligned to 64bit. The client has to adhere to the platform specific alignment constraints of the members in the data structure, as does the server side.
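As a minimal sketch of that layout, the dump can be decoded in Python with the padding made explicit. The format string and the field names are assumptions taken from the annotation line above, not from the L4Re headers, and hot_spot is assumed to be a 64bit value because eight bytes remain in the quoted dump:

```python
import struct

# First 24 bytes of the quoted map request (x86, little-endian).
dump = bytes.fromhex(
    "00000000"            # opcode (32bit)
    "cee30801"            # the "???" bytes, treated here as padding
    "0000000000000000"    # offset (64bit)
    "0000000000000000"    # hot_spot (assumed 64bit)
)

# Assumed layout: 32bit opcode, 4 pad bytes, then two 64bit operands.
PADDED = struct.Struct("<I4xQQ")

# The "4x" pad bytes are skipped when unpacking, so their contents do not matter.
opcode, offset, hot_spot = PADDED.unpack(dump)
print(PADDED.size, opcode, offset, hot_spot)
```

Read this way, the message is 24 bytes long and all three fields are zero, whatever the padding bytes happen to contain.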
From what I have understood, you are implementing a dataspace provider and are observing the provider generating the error (-39/-L4_EBADPROTO)?
EBADPROTO normally means that the protocol value in the l4_msgtag_t is not supported by the server. In the case of an unsupported opcode, servers report L4_ENOSYS. Thus, I'm a bit confused about the EBADPROTO error code and the opcode/MR issue. What am I missing?
Cheers Philipp
On 9/28/22 18:36, Philipp Eppelt wrote:
Hi Paul,
thanks for the detailed explanation and sorry for the long wait. I waded through the templates and am wondering how this happens. I too am of the opinion that on a 32bit platform the opcode is a 32-bit number and it always is written to the first field in the MRs. (Currently, I'd say it's an int).
I need to build a test to reproduce this and figure out what happens within the templates, but I'm not sure I understood the details of your explanation, so let me put it in my own words:
- You observe the opcode of a Dataspace::map operation to be written into a 64bit field in the MRs, but only in the lower 32bit. The upper 32bit remain the old value.
- This happens only with IPC using the RPC framework. In IPC using code that is written by hand - like L4::Factory.create() or L4::Task.map() - the opcode-field is a 32bit MR; on 64-bit architectures this is a 64bit value and field respectively.
- As far as I understood you are writing the client-side of the Dataspace.map() request yourself and do not use the RPC framework? With that I mean writing the opcode to mr[0], the offset to mr[1], and so on.
  But the server side then reads the mr[0] and mr[1] together as one 64bit value and then replies with EBADPROTO?
On this last point I'm a bit lost on which side does what exactly. Can you maybe write a bit of pseudo code on what happens on client and what on server side to help me understand?
Cheers Philipp
On 9/18/22 23:15, Paul Boddie wrote:
On Monday, 29 August 2022 00:37:31 CEST Paul Boddie wrote:
On Sunday, 28 August 2022 23:07:38 CEST Adam Lackorzynski wrote:
On Sun Aug 21, 2022 at 00:18:57 +0200, Paul Boddie wrote:
I just spent quite some time seeing errors like this...
ext2svr | L4Re[rm]: mapping for page fault failed with error -39 at 0x1002fbc00 pc=0x10b7804
ext2svr | L4Re: rom/ext2_server: Unhandled exception: PC=0x10b7804 PFA=0x1002fbc00 LdrFlgs=0x0
-39 being -L4_EBADPROTO (unsupported protocol), of course.
[...]
Do you have any ideas as to why the first message register gets corrupted?
No, still not. Any chance I could see a small example of this?
I'll try and package up what I've been doing so that it can be more readily investigated. I was actually in the middle of this packaging process when I discovered the problem once again.
Following up, I decided to give my code a test in 32-bit x86 and MIPS virtual machines which caused the problem to be much more pronounced. This led me to review a few things where I had misread the definitions of certain types (in pkg/l4re-core/l4sys/include/l4int.h). However, I think that the nature of the problem is actually as follows.
When a map request is sent by the L4Re region mapper, the IPC framework pieces together the necessary message. What isn't entirely obvious is the nature of the opcode being used. I originally thought that it was of type l4_umword_t, and dumping the bytes in the message, it does appear that the opcode is actually a 32-bit value (compatible with l4_umword_t on a 32-bit platform) but that the first operand only appears after an initial 64-bit unit containing the opcode, even on a 32-bit platform.
This might not produce problems on 64-bit platforms, although my original report did concern such a platform, but problems are immediately evident on a 32-bit platform. For example, here is a map request on x86 that caused problems:
00 00 00 00 ce e3 08 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
opcode      ???         offset                  hot_spot
So, when the IPC framework writes the opcode, it fills the first 32-bit unit but apparently not the second half of the initial 64-bit unit. Consequently, the contents of the second half of the word appear to persist from whatever they were previously. Hence my annotation of "???" above. It looks like an address from the program, perhaps an earlier page fault address.
Often, zero bytes are involved, thus preserving the appropriate behaviour even on 64-bit platforms where the opcode could be interpreted as the first 64-bit (l4_umword_t) unit. Zeroing the first message register (as I noted in an earlier message) fixed any cases where the prior non-zero contents leaked into the message and corrupted the opcode when considered to be a 64-bit value.
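To illustrate the effect, here is a small Python sketch that reads the first eight bytes of the dump above both ways. Reusing the x86 bytes to stand in for a 64-bit receiver is purely an assumption for demonstration, and the variable names are made up:

```python
import struct

# First 64-bit unit of the dump: a 32-bit opcode of zero, then stale bytes.
first_unit = bytes.fromhex("00000000" "cee30801")

# Reading only the first 32 bits gives the intended opcode.
(opcode32,) = struct.unpack_from("<I", first_unit)

# Reading the whole unit as one 64-bit word pulls the stale bytes into the
# upper half of the value.
(opcode64,) = struct.unpack("<Q", first_unit)

# Zeroing the upper half before sending restores agreement.
zeroed_unit = first_unit[:4] + bytes(4)
(opcode64_zeroed,) = struct.unpack("<Q", zeroed_unit)

print(opcode32, hex(opcode64), opcode64_zeroed)
```

The 64-bit reading only matches the 32-bit one when the second half of the unit happens to be zero, which is consistent with the problem appearing only intermittently.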
The problem on 32-bit platforms is that the operands are displaced and are not found after the first 32-bit word. In the above example, this causes the offset operand to be misinterpreted, along with the values that follow it. Obviously, this brings programs needing dataspaces to a halt very quickly.
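A rough Python sketch of the displacement, assuming a receiver that expects a 64-bit offset immediately after the 32-bit opcode. That packed layout and the field names are assumptions chosen for illustration:

```python
import struct

# The 24 bytes shown in the x86 dump above.
dump = bytes.fromhex("00000000" "cee30801") + bytes(16)

# Assumed receiver layout: 32-bit opcode followed directly by the operands.
PACKED = struct.Struct("<IQQ")

# unpack_from reads only the first PACKED.size bytes of the buffer.
opcode, offset, hot_spot = PACKED.unpack_from(dump)
print(PACKED.size, opcode, hex(offset), hot_spot)
```

Under this reading the stale bytes land in the low half of the offset, so the server would see a non-zero offset where the sender presumably intended zero.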
The above behaviour contradicts the way the IPC messages are constructed by Lua, for example. What I see on amd64/x86-64 is different from on x86 for the same message payload. For example, an invocation of a factory create operation with an opcode of 6:
On amd64:
06 00 00 00 00 00 00 00 81 00 08 00 00 00 00 00 e8 03 00 00 00 00 00 00 ...
opcode                  type  size  unused      value (1000)
On x86:
06 00 00 00 81 00 04 00 e8 03 00 00 ...
opcode      type  size  value (1000)
Here, the opcode is dependent on the word size, and the Lua code is happy to use the word size for the operands with no padding or gaps being introduced. Other IPC messages also appear to use the word size for the opcode. For example, when attach operations are invoked on the region mapper I have implemented, the opcode is only a 32-bit value with no trailing data before the initial operands.
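The two factory create dumps can be checked with a short Python sketch that splits each message into machine words. The helper name is hypothetical, and reading the second word as a 16-bit type followed by a 16-bit size is an assumption based on the annotations above:

```python
import struct

amd64 = bytes.fromhex("0600000000000000" "8100080000000000" "e803000000000000")
x86 = bytes.fromhex("06000000" "81000400" "e8030000")


def words(message, word_format):
    """Split a little-endian message into words ("Q" = 64-bit, "I" = 32-bit)."""
    count = len(message) // struct.calcsize("<" + word_format)
    return struct.unpack(f"<{count}{word_format}", message)


for name, message, word_format in (("amd64", amd64, "Q"), ("x86", x86, "I")):
    opcode, tag, value = words(message, word_format)
    # Assumed tag split: low 16 bits are the type, the next 16 bits the size.
    print(name, opcode, hex(tag & 0xFFFF), (tag >> 16) & 0xFFFF, value)
```

Both messages decode to the same opcode (6), type (0x81) and value (1000), with the size field tracking the word size (8 and 4 respectively) and no gap between the opcode and the first operand.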
I imagine that none of this would manifest itself if I used precisely the same libraries and/or code as other L4Re components, but then that rather makes the system monolithic. There should be a degree of interoperability based on message specifications and interface descriptions, and there should be some consistency, too. What I have seen is that the dataspace IPC is not consistent with other IPC.
Paul