Handling endianness when reading from a byte array in C?

Tags: c, io, endianness

I have a byte array of data that should be consistent across platforms. Let's say I have a pointer, unsigned char* data, which points to some location inside my array, and I want to read 4 bytes into a variable. I would think that I could just do this:

uint32_t my_int = *data;

However, I realize that this method doesn't account for endianness. For example, if my data were big-endian, would I have to do this to read it consistently?

uint32_t my_int = ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) | ((uint32_t)data[2] << 8) | (uint32_t)data[3];

Likewise, do I have to make the same checks when writing this data with fwrite? For example, if I wrote that same data to a file with this code:

fwrite(&my_int, sizeof(my_int), 1, fh);

Would the resulting data have any known endianness? Or would it be dependent upon architecture? If so, what's the simplest way to do these reads and writes and enforce a particular endianness on all platforms?

Alexis King asked Dec 04 '25 23:12


2 Answers

You need to worry about endianness whenever you read or write binary data. You also need to worry about variable size, and possibly structure packing if you're trying to read or write entire structs. Some architectures also can't load integers from unaligned addresses, so you can't just grab an integer directly from a binary buffer using something like uint32_t myInteger = *(uint32_t*)bufferPtr++.
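
One way to sidestep the alignment problem is to assemble the value one byte at a time instead of casting the pointer. The sketch below assumes the buffer holds unsigned 32-bit values; read_u32_be and read_u32_le are hypothetical helper names, not part of any library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 4 bytes stored most significant byte first. */
static uint32_t read_u32_be(const unsigned char *p)
{
    assert(p != NULL);
    /* Cast before shifting so the shift happens in 32-bit unsigned math. */
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: read 4 bytes stored least significant byte first. */
static uint32_t read_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[3] << 24) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |
            (uint32_t)p[0];
}
```

Because the shifts operate on values rather than on memory layout, the result is the same on big-endian and little-endian hosts, and p may point at any offset in the buffer.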

There are all sorts of ways to make this work. In the old days, when speed and RAM usage were huge concerns, we would read a chunk of data directly from the file into a buffer and then fix the endianness in-place if needed, using pointers into the structure.

You can still do that today, although structure packing differences between compilers make it a hassle, so it may make more sense to write some simple I/O routines for specific types, such as

int write_integer_at_position( FILE *, size_t position, uint32_t );
int read_integer_from_position( FILE *, size_t position, uint32_t *outResult );
etc

Those routines would swap the bytes if needed, perhaps using htonl, after reading or before writing the data to disk. After you've done this 20 or 30 times you'll probably want to write a data description language of some sort to map between structures in RAM and files. Many people have done it, but I don't think any one in particular has really caught on.
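
As a rough sketch of what such routines might look like, here is one possible implementation of the two prototypes above. It assumes the file format is big-endian, that the stream was opened in binary mode, and that 0 means success and -1 means failure; none of those choices come from the original prototypes. It builds the bytes with shifts rather than htonl so it needs nothing beyond the standard library:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed convention: returns 0 on success, -1 on failure. */
int write_integer_at_position(FILE *fp, size_t position, uint32_t value)
{
    unsigned char bytes[4];

    assert(fp != NULL);
    /* Split the value into bytes, most significant first (big-endian). */
    bytes[0] = (unsigned char)(value >> 24);
    bytes[1] = (unsigned char)(value >> 16);
    bytes[2] = (unsigned char)(value >> 8);
    bytes[3] = (unsigned char)value;

    /* fseek takes a long, so very large positions are not supported here. */
    if (fseek(fp, (long)position, SEEK_SET) != 0)
        return -1;
    return fwrite(bytes, 1, sizeof bytes, fp) == sizeof bytes ? 0 : -1;
}

/* Assumed convention: returns 0 on success, -1 on failure or short read. */
int read_integer_from_position(FILE *fp, size_t position, uint32_t *outResult)
{
    unsigned char bytes[4];

    assert(fp != NULL && outResult != NULL);
    if (fseek(fp, (long)position, SEEK_SET) != 0)
        return -1;
    if (fread(bytes, 1, sizeof bytes, fp) != sizeof bytes)
        return -1;

    /* Reassemble in the same fixed order, whatever the host's endianness. */
    *outResult = ((uint32_t)bytes[0] << 24) |
                 ((uint32_t)bytes[1] << 16) |
                 ((uint32_t)bytes[2] << 8)  |
                  (uint32_t)bytes[3];
    return 0;
}
```

Writing the individual bytes, rather than passing &my_int to fwrite, is what pins down the byte order in the file: fwrite on the integer itself emits whatever layout the host uses in memory.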

EricS answered Dec 06 '25 13:12


If you are working with integers, there is a family of functions/macros for converting between host and network byte order. See ntohl, for example.
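
For instance, ntohl converts a 32-bit value from network byte order (big-endian) to the host's order, and htonl goes the other way. A minimal sketch, assuming a POSIX system where both are declared in <arpa/inet.h>; the load/store helper names are made up for illustration:

```c
#include <arpa/inet.h>  /* POSIX; on Windows these come from <winsock2.h> */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a big-endian 32-bit value from a byte buffer. */
static uint32_t load_u32_network_order(const unsigned char *p)
{
    uint32_t raw;

    assert(p != NULL);
    memcpy(&raw, p, sizeof raw);  /* copy instead of casting: no alignment issue */
    return ntohl(raw);            /* network (big-endian) order to host order */
}

/* Hypothetical helper: write a 32-bit value to a buffer in big-endian order. */
static void store_u32_network_order(unsigned char *p, uint32_t value)
{
    uint32_t raw = htonl(value);  /* host order to network (big-endian) order */

    assert(p != NULL);
    memcpy(p, &raw, sizeof raw);
}
```

These functions only cover big-endian data; for a little-endian format the bytes would have to be handled some other way.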

As to packing: just define a protocol that specifies where each field should be placed. Then write code to construct a character array with the various fields in the correct locations. This should mirror the code that retrieves those fields.
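
To illustrate, here is a sketch built around a made-up 7-byte record layout: a 1-byte type at offset 0, a 2-byte id at offset 1, and a 4-byte length at offset 3, all big-endian. The struct, the layout, and the function names are assumptions chosen for the example, not an existing protocol:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MESSAGE_WIRE_SIZE = 7 };  /* 1 + 2 + 4 bytes, no padding on the wire */

/* In-memory form; the compiler may pad this, which is why it is never
   written to the buffer directly. */
struct message {
    uint8_t  type;
    uint16_t id;
    uint32_t length;
};

/* Place each field at its fixed offset, most significant byte first. */
static void message_encode(const struct message *m,
                           unsigned char out[MESSAGE_WIRE_SIZE])
{
    assert(m != NULL && out != NULL);
    out[0] = m->type;
    out[1] = (unsigned char)(m->id >> 8);
    out[2] = (unsigned char)m->id;
    out[3] = (unsigned char)(m->length >> 24);
    out[4] = (unsigned char)(m->length >> 16);
    out[5] = (unsigned char)(m->length >> 8);
    out[6] = (unsigned char)m->length;
}

/* Mirror image of message_encode: same offsets, same byte order. */
static void message_decode(const unsigned char in[MESSAGE_WIRE_SIZE],
                           struct message *m)
{
    assert(in != NULL && m != NULL);
    m->type   = in[0];
    m->id     = (uint16_t)(((unsigned)in[1] << 8) | in[2]);
    m->length = ((uint32_t)in[3] << 24) |
                ((uint32_t)in[4] << 16) |
                ((uint32_t)in[5] << 8)  |
                 (uint32_t)in[6];
}
```

Keeping the encoder and decoder side by side makes it easy to check that they agree on every offset.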

Ed Heal answered Dec 06 '25 13:12


