I know the WAV file format uses signed integers for 16-bit samples. It also stores them in little-endian order, meaning the lowest 8 bits come first, then the next, etc. Is the special sign bit on the first byte, or is the special sign bit always on the most significant bit (highest value)?
Meaning:
Which one is the sign bit in the WAV format?
++---+---+---+---+---+---+---+---++---+---+---+---+---+---+---+---++
|| a | b | c | d | e | f | g | h || i | j | k | l | m | n | o | p ||
++---+---+---+---+---+---+---+---++---+---+---+---+---+---+---+---++
--------------------------- here -> ^ ------------- or here? -> ^
i or p?
Specifically, little-endian is when the least significant bytes are stored before the more significant bytes, and big-endian is when the most significant bytes are stored before the less significant bytes. When we write a number (in hex), i.e. 0x12345678 , we write it with the most significant byte first (the 12 part).
Although many processors use little-endian storage for all types of data (integer, floating point), there are a number of hardware architectures where floating-point numbers are represented in big-endian form while integers are represented in little-endian form.
A signed integer is a 32-bit datum that encodes an integer in the range [-2147483648 to 2147483647]. An unsigned integer is a 32-bit datum that encodes a nonnegative integer in the range [0 to 4294967295].
For example, if 4F is stored at storage address 1000, 52 will be at address 1001. In a little-endian system, it would be stored as 524F, with 52 at address 1000 and 4F at 1001.
signed int, little endian:
byte 1(lsb)       byte 2(msb)
---------------------------------
7|6|5|4|3|2|1|0 | 7|6|5|4|3|2|1|0|
----------------------------------
                  ^
                  | 
                 Sign bit
You only need to concern yourself with that when reading/writing a short int to some external media. Within your program, the sign bit is the most significant bit in the short, no matter if you're on a big or little endian platform.
The sign bit is the most significant bit on any two's-complement machine (like the x86), and thus will be in the last byte in a little-endian format
Just cause i didn't want to be the one not including ASCII art... :)
+---------------------------------------+---------------------------------------+
|              first byte               |              second byte              |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 | 13 | 14 | 15 |
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
   ^--- lsb                                               msb / sign bit -----^
Bits are basically represented "backwards" from how most people think about them, which is why the high byte is last. But it's all consistent; "bit 15" comes after "bit 0" just as addresses ought to work, and is still the most significant bit of the most significant byte of the word. You don't have to do any bit twiddling, because the hardware talks in terms of bytes at all but the lowest levels -- so when you read a byte, it looks exactly like you'd expect. Just look at the most significant bit of your word (or the last byte of it, if you're reading a byte at a time), and there's your sign bit.
Note, though, that two's complement doesn't exactly designate a particular bit as the "sign bit". That's just a very convenient side effect of how the numbers are represented. For 16-bit numbers, -x is equal to 65536-x rather than 32768+x (which would be the case if the upper bit were strictly the sign).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With