I have a byte array of data, which should be consistent cross-platform. Let's say I have a pointer, unsigned char* data, which points to some location inside my array, and I want to read 4 bytes into a variable. I would think that I could just do this:
uint32_t my_int = *data;
However, I realize that method doesn't account for endianness. For example, if my data was in big endian, would I have to do this to read it consistently?
uint32_t my_int = ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16) | ((uint32_t)data[2] << 8) | (uint32_t)data[3];
Likewise, do I have to make the same checks when writing this data with fwrite? For example, if I wrote that same data to a file with this code:
fwrite(&my_int, sizeof(my_int), 1, fh);
Would the resulting data have any known endianness? Or would it be dependent upon architecture? If so, what's the simplest way to do these reads and writes and enforce a particular endianness on all platforms?
You need to worry about endianness whenever reading or writing binary data. You also need to worry about variable size, and possibly structure packing if you're trying to read/write entire structs. Some architectures also can't load integers from unaligned addresses, so you can't just grab an integer directly from a binary buffer using something like uint32_t myInteger = *(uint32_t*)bufferPtr++.
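A minimal sketch of reading a 4-byte value from a byte buffer without dereferencing a cast pointer, assuming the data is stored big-endian. The helper names read_u32_be and read_u32_native are made up for illustration. The first builds the value from individual bytes, so it gives the same result on any host; the second uses memcpy, which is safe for unaligned addresses but still yields a host-dependent value.

```c
#include <stdint.h>
#include <string.h>

/* Assemble a 32-bit value from 4 big-endian bytes.
   Works at any alignment and on any host byte order. */
static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Copy 4 bytes as-is. Safe for unaligned addresses, but the
   result depends on the byte order of the machine running it. */
static uint32_t read_u32_native(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

The casts to uint32_t matter: without them data[0] is promoted to int, and shifting a byte of 0x80 or more left by 24 overflows a 32-bit signed int.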
There are all sorts of ways to make this work. In the old days, when speed and RAM usage were huge concerns, we would read a chunk of data directly from the file into a buffer and then fix the endianness in-place if needed, using pointers into the structure.
You can still do that today, although structure packing differences between compilers make it a hassle, so it may make more sense to write some simple I/O routines for specific types, such as:
int write_integer_at_position( FILE *, size_t position, uint32_t );
int read_integer_from_position( FILE *, size_t position, uint32_t *outResult );
etc.
Those routines would swap the bytes if needed, perhaps using htonl before writing the data to disk and ntohl after reading it. After you've done this 20 or 30 times you'll probably want to write a data description language of some sort to map between structures in RAM and files. Many people have done it, but I don't think any one in particular has really caught on.
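One possible way to fill in the two routines declared above, as a sketch rather than a definitive implementation. It assumes the file format is big-endian and that the functions return 0 on success and -1 on failure (the original prototypes don't say). Instead of byte-swapping with htonl, it converts through an explicit byte array, which needs no platform check at all.

```c
#include <stdint.h>
#include <stdio.h>

/* Write value as 4 big-endian bytes at the given file offset. */
int write_integer_at_position(FILE *fh, size_t position, uint32_t value)
{
    unsigned char bytes[4];
    bytes[0] = (unsigned char)(value >> 24);
    bytes[1] = (unsigned char)(value >> 16);
    bytes[2] = (unsigned char)(value >> 8);
    bytes[3] = (unsigned char)value;

    if (fseek(fh, (long)position, SEEK_SET) != 0)
        return -1;
    return fwrite(bytes, 1, sizeof bytes, fh) == sizeof bytes ? 0 : -1;
}

/* Read 4 big-endian bytes at the given file offset into *outResult. */
int read_integer_from_position(FILE *fh, size_t position, uint32_t *outResult)
{
    unsigned char bytes[4];

    if (fseek(fh, (long)position, SEEK_SET) != 0)
        return -1;
    if (fread(bytes, 1, sizeof bytes, fh) != sizeof bytes)
        return -1;

    *outResult = ((uint32_t)bytes[0] << 24) |
                 ((uint32_t)bytes[1] << 16) |
                 ((uint32_t)bytes[2] << 8)  |
                  (uint32_t)bytes[3];
    return 0;
}
```

Because fseek takes a long, this sketch only handles offsets that fit in one. The file should be opened in binary mode ("rb", "wb" or "r+b") so no newline translation touches the bytes.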
If you are using integers, there is a family of functions/macros for converting between host and network (big-endian) byte order. See ntohl, for example.
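A short sketch of how htonl and ntohl might be used with a byte buffer. It assumes a POSIX system, where they are declared in <arpa/inet.h> (on Windows they come from <winsock2.h> instead). The wrapper names store_u32_network and load_u32_network are hypothetical. memcpy is used so the buffer does not need to be aligned.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX header; an assumption about the platform) */
#include <stdint.h>
#include <string.h>

/* Convert a host-order value to network (big-endian) order
   and store its 4 bytes at dst. */
static void store_u32_network(unsigned char *dst, uint32_t host_value)
{
    uint32_t net = htonl(host_value);
    memcpy(dst, &net, sizeof net);
}

/* Load 4 network-order bytes from src and convert to host order. */
static uint32_t load_u32_network(const unsigned char *src)
{
    uint32_t net;
    memcpy(&net, src, sizeof net);
    return ntohl(net);
}
```

These functions only cover 16-bit and 32-bit values (htons/ntohs and htonl/ntohl) and always target big-endian, so a little-endian file format needs a different approach.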
As to packing: just define a protocol that says where each field is placed. Then write code to construct a character array with the various fields in the correct locations. This should mirror the code that retrieves those fields.
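To illustrate the idea, here is a sketch with a made-up protocol. The record type, its fields, the 7-byte layout and the function names are all hypothetical, chosen only to show matching pack and unpack code. Since every field is written at a fixed offset, compiler padding inside the struct never reaches the output.

```c
#include <stdint.h>

/* Hypothetical wire layout (an assumption for illustration):
     offset 0: id    4 bytes, big-endian
     offset 4: kind  2 bytes, big-endian
     offset 6: flags 1 byte                 */
enum { RECORD_WIRE_SIZE = 7 };

struct record {
    uint32_t id;
    uint16_t kind;
    uint8_t  flags;
};

/* Place each field at its protocol-defined offset in out. */
static void record_pack(const struct record *r, unsigned char *out)
{
    out[0] = (unsigned char)(r->id >> 24);
    out[1] = (unsigned char)(r->id >> 16);
    out[2] = (unsigned char)(r->id >> 8);
    out[3] = (unsigned char)r->id;
    out[4] = (unsigned char)(r->kind >> 8);
    out[5] = (unsigned char)r->kind;
    out[6] = r->flags;
}

/* Mirror of record_pack: read each field back from the same offsets. */
static void record_unpack(struct record *r, const unsigned char *in)
{
    r->id = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
            ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    r->kind  = (uint16_t)((in[4] << 8) | in[5]);
    r->flags = in[6];
}
```

The packed buffer can then be written with fwrite(out, 1, RECORD_WIRE_SIZE, fh). Note that sizeof(struct record) is typically 8 because of padding, which is why the struct itself should not be written directly.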