Does the size of a struct long depend on byte order?
I want to use struct.unpack() in Python 2.7 to get long values from byte strings, but I'm seeing some strange behavior. I'm wondering whether this is a bug, and what I can do to fix it.
Here is what happens:
>>> import struct
>>> struct.unpack("L","")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 8
This is expected behavior. It requires an 8-byte string to extract an 8-byte long value.
Now I change the byte order to network byte order and get the following:
>>> struct.unpack("!L","")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 4
An 8-byte long value suddenly only needs 4 bytes.
What’s going on and what can I do to fix this?
Solution
The problem here is that struct uses your machine's native long size in the first case, and struct's "standard" long size in the second.
In the first example, you did not specify a byte-order, size, and alignment character (one of @=<>!), so struct assumes '@', which uses your machine's native sizes, including its native long size. For consistency across platforms, struct defines standard sizes for each type (see https://docs.python.org/2/library/struct.html#format-characters), which may differ from your machine's native sizes. All prefix characters except '@' use these standard sizes.
So, in the case of long, struct's standard size is 4 bytes, which is why '!L' (or any of '[=<>!]L') requires 4 bytes. However, your machine's native long is evidently 8 bytes, which is why '@L' or 'L' requires 8 bytes. If you want to use the machine's native byte order but still stay compatible with struct's standard sizes, I recommend specifying all formats with '=' instead of letting Python default to '@'.
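As a rough sketch of the recommendation above, the following unpacks the same four bytes with '!L' and '=L'. The byte string is made up for illustration. The last line uses 'Q' (unsigned long long), whose standard size is 8 bytes. That is only a suggestion in case your data really holds 8-byte integers, since the question does not say what the data looks like.

```python
import struct

# Made-up sample data: four bytes encoding 1 in network (big-endian) order.
data = b"\x00\x00\x00\x01"

# '!L' reads a standard 4-byte unsigned long in network byte order.
(value_be,) = struct.unpack("!L", data)

# '=L' keeps the standard 4-byte size but uses the machine's native byte order,
# so the result depends on whether the machine is little- or big-endian.
(value_native,) = struct.unpack("=L", data)

# 'Q' has a standard size of 8 bytes; one option for genuine 8-byte integers.
(value_q,) = struct.unpack("!Q", b"\x00" * 7 + b"\x01")

print("%d %d %d" % (value_be, value_native, value_q))
```

With any explicit prefix other than '@', the required string length no longer depends on the platform.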
You can use struct.calcsize to check the expected size:
>>> struct.calcsize('@L'), struct.calcsize('=L')
(This returns (4, 4) on my 64-bit Windows 10 machine, but (8, 4) on my 64-bit Ubuntu 16.04 machine.)
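To see all five prefix characters side by side, a small loop like this should work (the `sizes` dictionary is just a name chosen for this sketch). Only the '@' entry is expected to vary between machines.

```python
import struct

# Record the size struct expects for 'L' under each prefix character.
sizes = {}
for prefix in ("@", "=", "<", ">", "!"):
    sizes[prefix] = struct.calcsize(prefix + "L")
    print("%sL -> %d bytes" % (prefix, sizes[prefix]))
```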
You can also check your machine's native type sizes directly by compiling a small C program that prints the value of sizeof(long), for example:
#include <stdio.h>

int main(void)
{
    /* sizeof yields a size_t; cast it so it matches the %lu specifier. */
    printf("%lu\n", (unsigned long)sizeof(long));
    return 0;
}
I followed this guide to compile the program on my Windows 10 machine.
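If compiling C is inconvenient, the standard ctypes module can probably tell you the same thing from within Python. This is an alternative I am suggesting rather than something the steps above rely on. It assumes ctypes is available in your Python build.

```python
import ctypes
import struct

# Size of the C compiler's 'long' for the platform this Python was built on.
native_long = ctypes.sizeof(ctypes.c_long)

print("sizeof(long)   = %d" % native_long)
print("calcsize('@L') = %d" % struct.calcsize("@L"))
```

The two numbers should match, since '@' tells struct to use the native C sizes.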