 

Endianness: Big Endian vs. Little Endian; Shifting and Casting Examples

Background

I ran into a bug recently where I was trying to extract the lower 32 bits through a 64-bit pointer that referenced two contiguous 32-bit variables in memory.  The two halves came out in the opposite order from what I expected, because I was confused about how the data is stored.  There is a fairly good writeup on Wikipedia, but it does not have many examples.  I also highly recommend reading this article about writing endian-independent code.
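To make the bug concrete, here is a minimal sketch of that situation. The helper name low_half_of_pair is hypothetical, and memcpy stands in for the 64-bit pointer so the example avoids alignment and aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: treat two adjacent 32-bit values as one 64-bit
   value and return the low 32 bits of that 64-bit value. */
uint32_t low_half_of_pair(const uint32_t pair[2])
{
    uint64_t combined;

    /* Copy the 8 bytes exactly as they sit in memory. */
    memcpy(&combined, pair, sizeof combined);

    /* The mask works on the value, so which array element ends up in
       the low half depends on the byte order of the host. */
    return (uint32_t)(combined & 0xFFFFFFFFu);
}
```

On a little endian host this returns pair[0], because the bytes at the lowest addresses are the least significant ones. On a big endian host it returns pair[1].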

In general, endianness only concerns the byte order of the CPU's built-in multi-byte types (int16, int32, int64, etc.).  The bits within each byte are treated the same way by most systems.  Bit endianness is much rarer.

Endian Detection

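From what I understand, a union is a safe way to detect endianness in C.  Reading the bytes through an unsigned char pointer is also permitted by the C standard; what is not portable is casting the address to a pointer of some other, wider type.  Note that in C++ reading a union member other than the one last written is undefined behavior.  Here is an example of endian detection: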

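#include <stdint.h>

//returns 1 on a little endian host, 0 otherwise
int IsLittleEndian(void)
{
  union
  {
    uint32_t i;
    uint8_t c[4];
  } endianCheck = {0x04030201};  //initializes i, the first member

  //c[0] is the byte at the lowest address of i
  return endianCheck.c[0] == 1;
}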

To explain, there is one 32 bit memory location that can be accessed as a 32 bit variable or an array of 4 characters.  If the memory address for the 32 bit integer (4 bytes) is X, the first char array byte c[0] is at X as well, with c[1] being at memory location X+1 byte, c[2] at X+2 bytes, and c[3] being at X+3.
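The same check can be sketched without a union by looking at the first byte through an unsigned char pointer, which the C aliasing rules allow. The function name is_little_endian_ptr is an assumption made for this illustration.

```c
#include <stdint.h>

/* Returns 1 on a little endian host, 0 otherwise. */
int is_little_endian_ptr(void)
{
    uint32_t value = 0x04030201u;

    /* Any object may be inspected through an unsigned char pointer. */
    const unsigned char *bytes = (const unsigned char *)&value;

    /* bytes[0] is the byte stored at address X. */
    return bytes[0] == 0x01;
}
```

Both versions answer the same question: which byte of the value sits at the lowest address.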

The key is that the order in which the bytes are stored depends on the endianness.  From the detection function, 0x04030201 is laid out in memory like this:

 
Memory Location   Little Endian   Big Endian
X                 0x01            0x04
X+1               0x02            0x03
X+2               0x03            0x02
X+3               0x04            0x01

From the table you can tell that the byte order is reversed and that for code to be portable you would have to take into account the individual byte positions if portions of a multi-byte primitive are to be modified.
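One common way to take the individual byte positions into account is to build and read byte buffers with shifts, so the layout is fixed by the code rather than by the host. The sketch below writes and reads a 32-bit value in big endian order; the names store_be32 and load_be32 are made up for this example.

```c
#include <stdint.h>

/* Write v into out[0..3], most significant byte first. */
void store_be32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Rebuild the value from four bytes stored most significant first. */
uint32_t load_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because only shifts on values are used, the buffer produced for 0x04030201 matches the Big Endian column of the table on any host.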

Effect on Shifting and Casting

Explicit shifting and casting to convert data is one of my favorites even though casting mistakes can be dangerous.  Shifts and casts operate on the value of a variable, not on its bytes in memory, so the same expression works regardless of endianness.  Only reading the bytes through a pointer or union requires a different version depending on the endianness.  Given the above 32 bits in the detection function set to 0x04030201:

uint32_t var = 0x04030201;
const uint8_t *bytes = (const uint8_t *)&var;  //var's bytes in memory order

//extract the most significant byte (0x04) from 0x04030201
//by value, any endianness:
uint8_t msb = (uint8_t)(var >> 24);
//from memory, little endian: bytes[3]
//from memory, big endian:    bytes[0]

//extract the least significant byte (0x01)
//by value, any endianness:
uint8_t lsb = (uint8_t)var;
//from memory, little endian: bytes[0]
//from memory, big endian:    bytes[3]

//extract the 2nd least significant byte (0x02)
//by value, any endianness:
uint8_t second = (uint8_t)(var >> 8);
//from memory, little endian: bytes[1]
//from memory, big endian:    bytes[2]

The cast to uint8_t keeps the low 8 bits of the value; it does not read a single byte at the address of the 32 bit variable.  Indexing into bytes does read memory, and from the table above you can tell how that differs depending on the endianness.  A brushup on bit shifting is located here.
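The difference between taking a byte by value and taking it by address can be sketched with two small helpers. Both names, byte_by_value and byte_by_address, are hypothetical and exist only for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* index 0 is the least significant byte. A shift works on the value,
   so the result is the same on every host. */
uint8_t byte_by_value(uint32_t v, unsigned index)
{
    return (uint8_t)(v >> (8u * index));
}

/* offset 0 is the byte at the lowest address. This reads the stored
   representation, so the result depends on the host's byte order. */
uint8_t byte_by_address(uint32_t v, unsigned offset)
{
    uint8_t bytes[sizeof v];

    memcpy(bytes, &v, sizeof v);
    return bytes[offset];
}
```

For 0x04030201, byte_by_value(v, 0) is 0x01 everywhere, while byte_by_address(v, 0) is 0x01 on little endian and 0x04 on big endian.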

Masking is not affected because the mask will be represented in the same byte order.
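As a small illustration of the masking point, the helper below (a hypothetical name, second_byte_masked) isolates the second least significant byte with a mask and a shift. Both the mask constant and the variable are ordinary values, so the host's byte order never enters into it.

```c
#include <stdint.h>

/* Keep bits 8..15 of v, then move them down to bits 0..7. */
uint8_t second_byte_masked(uint32_t v)
{
    return (uint8_t)((v & 0x0000FF00u) >> 8);
}
```

For 0x04030201 this yields 0x02 on any host.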

Conclusion

Endianness, although it can be confusing, is a simple concept: it describes the order in which the bytes of a multi-byte value are stored in memory.  It helps to understand the basics and to confirm code behavior with a good debugger.

One Comment

  1. Jay says:

    You should note that casting always gives the least significant bytes.

    Thus to get least significant byte from
    int var = 0x04030201;

    you can portably use
    uint8_t extractedByte = (uint8_t)var;

    In other words:
    On big endian
    (uint8_t)var;
    still gives 0x01, so no shift such as
    (uint8_t) ( var << 24 );
    is needed.

    Also:
    see here:
    http://stackoverflow.com/questions/2247736/little-endian-vs-big-endian
