Programmer's Python Data - Byte Manipulation

Written by Mike James

Monday, 05 June 2023

Bytes are the most primitive data type and hence universal, but how can you manipulate them? Find out how it all works in this extract from my new book Programmer's Python: Everything is Data.

Programmer's Python
Everything is Data

Is now available as a print book: Amazon

Python – A Lightning Tour
The Basic Data Type – Numbers
Extract: Bignum
Truthy & Falsey
Dates & Times
Extract Naive Dates
Sequences, Lists & Tuples
Extract Sequences
Strings
Extract Unicode Strings
Regular Expressions
Extract Simple Regular Expressions
The Dictionary
Extract The Dictionary
Iterables, Sets & Generators
Extract Iterables
Comprehensions
Extract Comprehensions
Data Structures & Collections
Extract Stacks, Queues and Deques
Extract Named Tuples and Counters
Bits & Bit Manipulation
Extract Bits and BigNum
Extract Bit Masks
Bytes
Extract Bytes And Strings
Extract Byte Manipulation
Binary Files
Extract Files and Paths
Text Files
Extract Text Files & CSV
Creating Custom Data Classes
Extract A Custom Data Class
Python and Native Code
Extract Native Code
Appendix I Python in Visual Studio Code
Appendix II C Programming Using Visual Studio Code

In chapter but not in this extract

Bytes
Bytes and Bytearray
Bytes As Strings
Decode Encode

Byte Manipulation

The need to perform bit manipulation on multiple bytes is a common requirement. There are two ways to approach this problem. We could convert the bytes to a single bignum representation, perform the bitwise operation and then convert back. Alternatively, we could process the sequence directly, using for loops, to produce a new sequence.
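As a minimal sketch of the second approach, the following processes two byte sequences element by element. The helper name and_bytes and the sample mask are assumptions made for illustration only:

```python
# Hypothetical helper: AND two equal-length byte sequences element by element.
def and_bytes(a, b):
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    # Each element of a bytes object is an int in the range 0..255,
    # so the usual bitwise operators apply directly.
    return bytes(x & y for x, y in zip(a, b))

data = bytes([0xFF, 0xAA, 0x55])
mask = bytes([0x0F, 0xF0, 0xFF])   # illustrative mask, not from the text
result = and_bytes(data, mask)
print(result.hex())                # 0fa055
```

Working on the sequence directly avoids any question of byte order for operations such as AND, OR and XOR, because each byte is treated independently.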

If you want to convert a byte sequence to a bignum you can use the from_bytes class method:

int.from_bytes(bytes, byteorder, signed=False)

where bytes is a bytes or bytearray object and byteorder determines the order in which the bytes are used to create the integer and can be set to 'big' or 'little'. From Python 3.11, byteorder can be omitted, in which case it defaults to 'big'.

This matter of order is something we have been able to ignore up to this point, but no longer. The problem is, where is the most significant byte – at the start of the sequence or at the end? This is the well known “endian” problem and it is a fundamental choice in computer architecture. Bytes, or groupings of bytes, are generally stored in separate memory locations, but to make use of them you have to assemble them into a single bit pattern and there are two ways of doing this – big first or little first. For example, consider:

myBytes=bytes([0xAA,0x55])

as a possible representation of a two-byte integer. Our two choices are to take the first element as the most significant byte:

(myBytes[0] << 8) | myBytes[1] == 0xAA55

This is big endian. Alternatively, we could take the last element as the most significant byte:

(myBytes[1] << 8) | myBytes[0] == 0x55AA

which is little endian. You can see that the selection of big or little endian produces two very different integer values and two very different bit patterns.
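To make the two choices concrete, here is a small sketch that assembles the bytes by hand using shifts. The function name assemble is hypothetical and exists only to show what the choice of byte order amounts to:

```python
myBytes = bytes([0xAA, 0x55])

# Big endian: the first byte is the most significant.
big = (myBytes[0] << 8) | myBytes[1]
# Little endian: the last byte is the most significant.
little = (myBytes[1] << 8) | myBytes[0]
print(hex(big), big)        # 0xaa55 43605
print(hex(little), little)  # 0x55aa 21930

# Hypothetical general version for a sequence of any length.
def assemble(seq, byteorder):
    if byteorder == "little":
        seq = reversed(seq)
    value = 0
    for b in seq:
        # Make room for the next byte, then drop it into the low 8 bits.
        value = (value << 8) | b
    return value

print(hex(assemble(bytes([0xFF, 0xAA, 0x55]), "big")))     # 0xffaa55
print(hex(assemble(bytes([0xFF, 0xAA, 0x55]), "little")))  # 0x55aaff
```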

The endian problem occurs whenever you have to put a sequence of bytes, or other discrete bit patterns, together to form a larger bit pattern. For example:

myBytes = bytes([0xFF,0xAA,0x55])
bits = int.from_bytes(myBytes,byteorder = 'big')
print(hex(bits))

displays:

0xffaa55

and changing to byteorder = 'little' displays:

0x55aaff

If you want to use the byte order that the current machine uses for its memory access, import the sys module and specify byteorder = sys.byteorder.
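A short sketch of using the native byte order follows. The output depends on the machine running the code, so no particular result is shown:

```python
import sys

myBytes = bytes([0xFF, 0xAA, 0x55])

# sys.byteorder is the string 'little' or 'big', depending on the platform.
native = int.from_bytes(myBytes, byteorder=sys.byteorder)
print(sys.byteorder)
print(hex(native))
```

On a little endian machine, which includes most desktop hardware, this should give the same result as specifying byteorder = 'little'.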

To convert the bignum back to a bytes object you can use the int method to_bytes:

int.to_bytes(length, byteorder, signed=False)

Again you have to specify the byteorder and the number of bytes in the resulting bytes object. For example:

myBytes=bits.to_bytes(3,byteorder='big')
print(myBytes)

displays:

b'\xff\xaaU'

The need to specify the number of bytes is irritating because, if you get it wrong and the integer cannot be represented in that many bytes, an OverflowError exception is raised. To generate as many bytes as needed you can use the int method bit_length, which returns the number of bits needed to represent the bignum. To convert this into the number of bytes needed to accommodate this number of bits we can use:

(bit_length()+7)//8

Using this we can rewrite the previous example as:

myBytes = bits.to_bytes((bits.bit_length()+7)//8,
                        byteorder = 'big')
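Putting the pieces together, the following sketch shows the first approach in full: convert to a bignum, operate on it, and convert back. The helper int_to_min_bytes is a hypothetical name, and it assumes a non-negative value. It also forces a length of at least one byte, because bit_length returns 0 for the value 0, which would otherwise produce an empty bytes object:

```python
# Hypothetical helper: convert a non-negative int to the shortest bytes object.
def int_to_min_bytes(value, byteorder="big"):
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    # bit_length() is 0 for the value 0, so ask for at least one byte.
    length = max(1, (value.bit_length() + 7) // 8)
    return value.to_bytes(length, byteorder=byteorder)

bits = int.from_bytes(bytes([0xFF, 0xAA, 0x55]), byteorder="big")
shifted = bits << 4                      # now needs 28 bits, i.e. 4 bytes
print(int_to_min_bytes(shifted).hex())   # 0ffaa550
print(int_to_min_bytes(0).hex())         # 00

# Asking for too few bytes raises OverflowError.
try:
    shifted.to_bytes(3, byteorder="big")
except OverflowError as err:
    print("length too small:", err)
```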

Finally we have to deal with the problem of negative values. In most cases you can ignore this because you are only interested in working with bit patterns, and bit patterns are usually extended using zero bits. The only time this is not the case is if the bit pattern really is an integer value in two's complement form.

When converting from bytes to bignums, setting the signed parameter to True causes the bytes to be interpreted as a two's complement value. If the most significant bit is 1, the result is a negative integer, which is displayed as a minus sign followed by its magnitude. As a side effect it will also appear to remove any leading ones from the value as these are treated as negative sign bits. For example:

myBytes=bytes([0xFF,0xAA,0x55])
bits=int.from_bytes(myBytes,byteorder='big',signed=True)
print(hex(bits))

displays:

-0x55ab

which, in two's complement, is equivalent to:

FFFF AA55

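with as many leading ones as required by the operation. Notice that the low-order bits of the pattern are not changed when it is stored in the bignum. Python's bitwise operators simply treat a negative bignum as a two's complement value with an unlimited run of leading ones.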
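The relationship between the signed and unsigned interpretations can be checked with a short sketch. The mask widths of 24 and 32 bits are chosen purely for illustration:

```python
myBytes = bytes([0xFF, 0xAA, 0x55])

unsigned = int.from_bytes(myBytes, byteorder="big")
signed = int.from_bytes(myBytes, byteorder="big", signed=True)
print(unsigned, signed)            # 16755285 -21931

# With 3 bytes (24 bits), the signed value is the unsigned value minus 2**24.
print(signed == unsigned - (1 << 24))   # True

# Masking the negative value recovers the two's complement bit pattern
# at whatever width you choose.
print(hex(signed & 0xFFFFFF))      # 0xffaa55
print(hex(signed & 0xFFFFFFFF))    # 0xffffaa55
```

The last line shows the leading ones appearing as the pattern is extended to 32 bits.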

Going the other way, from an integer to a bytes object, works in much the same way, but if you try to convert a negative integer without signed = True an OverflowError exception occurs because negative integers have to be treated as two's complement. For example:

bits=-1
myBytes=bits.to_bytes(1,byteorder='big',signed=True)
print(myBytes.hex())

displays ff, as -1 is ff in 8-bit two's complement.
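The following sketch shows both the exception and a signed round trip. The 4-byte and 3-byte lengths are arbitrary choices for the example:

```python
bits = -1

# Without signed=True a negative value cannot be converted.
try:
    bits.to_bytes(1, byteorder="big")
except OverflowError as err:
    print("unsigned conversion failed:", err)

# With signed=True the value is sign-extended to fill the requested length.
wide = bits.to_bytes(4, byteorder="big", signed=True)
print(wide.hex())                  # ffffffff

# Reading the bytes back as signed restores the original value.
restored = int.from_bytes(wide, byteorder="big", signed=True)
print(restored)                    # -1

# The earlier negative value converts back to its original three bytes.
print((-0x55ab).to_bytes(3, byteorder="big", signed=True).hex())  # ffaa55
```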

In most cases when doing byte manipulation you can ignore problems with negative numbers because you can treat everything as positive integers.


Last Updated ( Monday, 05 June 2023 )
