I am writing an ELF analyzer, but I'm having some trouble converting endianness properly. I have functions to determine the endianness of the analyzer and the endiannness of the object file.
Basically, there are four possible scenarios:
Is there a function I can use to explicitly swap byte order/change endianness, since ntohs/l() and htons/l() take the host's endianness into account and sometimes don't convert? Or do I need to find/write my own swap byte order function?
Again, endian-ness does not matter if you have a single byte. If you have one byte, it's the only data you read so there's only one way to interpret it (again, because computers agree on what a byte is).
So Endianness comes into picture when you are sending and receiving data across the network from one host to another host. If the sender and receiver computer have different Endianness, then there is a need to swap the Endianness so that it is compatible.
Broadly speaking, the endianness in use is determined by the CPU. Because there are a number of options, it is unsurprising that different semiconductor vendors have chosen different endianness for their CPUs.
Big-endian is an order in which the "big end" (most significant value in the sequence) is stored first, at the lowest storage address. Little-endian is an order in which the "little end" (least significant value in the sequence) is stored first.
In Linux there are several conversion functions in endian.h, which allow to convert between arbitrary endianness:
uint16_t htobe16(uint16_t host_16bits);
uint16_t htole16(uint16_t host_16bits);
uint16_t be16toh(uint16_t big_endian_16bits);
uint16_t le16toh(uint16_t little_endian_16bits);
uint32_t htobe32(uint32_t host_32bits);
uint32_t htole32(uint32_t host_32bits);
uint32_t be32toh(uint32_t big_endian_32bits);
uint32_t le32toh(uint32_t little_endian_32bits);
uint64_t htobe64(uint64_t host_64bits);
uint64_t htole64(uint64_t host_64bits);
uint64_t be64toh(uint64_t big_endian_64bits);
uint64_t le64toh(uint64_t little_endian_64bits);
Edited, less reliable solution. You can use union to access the bytes in any order. It's quite convenient:
union {
    short number;
    char bytes[sizeof(number)];
};
I think it's worth raising The Byte Order Fallacy article here, by Rob Pyke (one of Go's author).
If you do things right -- ie you do not assume anything about your platforms byte order -- then it will just work. All you need to care about is whether ELF format files are in Little Endian or Big Endian mode.
From the article:
Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):
i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
If it's big-endian, here's how to extract it:
i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);
And just let the compiler worry about optimizing the heck out of it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With