Background
Sometimes an application needs to work at the byte level. I feel there are not enough Python examples showing how to do this, and it is easy to over-complicate the solution.
This Example
This example covers several aspects of working with bytes: the struct module, the bytearray built-in, and the ctypes module.
The Code
from ctypes import c_int, Structure
from struct import pack_into, unpack_from

# Create two buffers, one smaller than the other, for
# demonstration purposes.
buff1 = bytearray(64)
buff2 = bytearray(32)

# First we'll use the struct module to initialize the arrays.
# Initialize buff1 with 16 unsigned integers (4 bytes each, 64 bytes total).
pack_into('I' * 16, buff1, 0, *range(0, 16))
print('Buffer 1:', unpack_from('I' * 16, buff1, 0))

# For the sake of demonstration, we'll work with buff2.
# Copy part of buff1 into buff2. Since we're using
# bytearrays, this should be equivalent to a memcpy.
buff2[:] = buff1[:32]

# Test it out: did we copy 32 bytes from buff1 into buff2?
print('Buffer 2:', unpack_from('I' * 8, buff2, 0), end='\n\n')

# We can also use the ctypes module to access the buffers.
# We can access them piecemeal like this. Note that this
# copies the buffer; if we didn't want a copy, from_buffer
# is the function we would use.
x = c_int.from_buffer_copy(buff2, 8)
y = c_int.from_buffer_copy(buff2, 12)
print(f'x, y as 2 standalone c_ints: {x.value}, {y.value}', end='\n\n')

# You can also create C structures to access the data.
# Define a simple ctypes structure.
class Point(Structure):
    _fields_ = [
        ('x', c_int),
        ('y', c_int)
    ]

p1 = Point.from_buffer_copy(buff2, 8)
print(f'x, y as elements of a ctype structure (copied): {p1.x}, {p1.y}')

# Note that since this is a copy, any manipulation doesn't
# affect the buffer.
p1.x, p1.y = 50, 51
print(f'p1.x, p1.y set to: {p1.x}, {p1.y}')
print('Show buff2 is unchanged:', unpack_from('I' * 8, buff2, 0), end='\n\n')

# So, if we wanted to directly manipulate the buffer using the structure:
p2 = Point.from_buffer(buff2, 8)
print(f'x, y as elements of a ctype structure (not copied): {p2.x}, {p2.y}')
p2.x, p2.y = 100, 101
print(f'p2.x, p2.y set to: {p2.x}, {p2.y}')

# See that the 3rd and 4th elements have now been changed.
print('Show buff2 is changed:', unpack_from('I' * 8, buff2, 0), end='\n\n')

# Finally, note that the original buffer is unchanged.
print('Show buff1 unchanged:', unpack_from('I' * 16, buff1, 0))
Output
Buffer 1: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
Buffer 2: (0, 1, 2, 3, 4, 5, 6, 7)

x, y as 2 standalone c_ints: 2, 3

x, y as elements of a ctype structure (copied): 2, 3
p1.x, p1.y set to: 50, 51
Show buff2 is unchanged: (0, 1, 2, 3, 4, 5, 6, 7)

x, y as elements of a ctype structure (not copied): 2, 3
p2.x, p2.y set to: 100, 101
Show buff2 is changed: (0, 1, 100, 101, 4, 5, 6, 7)

Show buff1 unchanged: (0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
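The byte offsets 8 and 12 used in the code assume that each 'I' value occupies 4 bytes, which is the case on most platforms but is not guaranteed by the native format. A minimal sketch of how you might check that assumption and compute the offsets instead of hard-coding them (the variable names here are only for illustration):

```python
from ctypes import c_int, sizeof
from struct import calcsize, pack_into, unpack_from

# Size in bytes of one native unsigned int as struct sees it.
int_size = calcsize('I')

# A buffer holding 8 integers, filled with 0..7.
buff = bytearray(int_size * 8)
pack_into('I' * 8, buff, 0, *range(8))

# The byte offset of element n is n * int_size, so the third
# and fourth integers start at 2 * int_size and 3 * int_size.
third = unpack_from('I', buff, 2 * int_size)[0]
fourth = unpack_from('I', buff, 3 * int_size)[0]

print('struct int size:', int_size)
print('ctypes int size:', sizeof(c_int))
print('third, fourth:', third, fourth)
```

If the two sizes match, a ctypes c_int read at the same offset lines up with the value struct wrote there.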
A Note on Copying
Be careful when slicing a Python bytearray, bytes or array.array. Slicing creates a copy, which can hurt the performance of your application when the buffers are large or sliced often. There is a better way: memoryview. A memoryview works with anything that implements the Python buffer protocol and makes slicing very efficient. Slicing a memoryview produces another memoryview over the same memory, not a copy of the bytes it represents.
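Here is a small sketch of the difference, reusing pack_into and unpack_from from the example above. The buffer contents and variable names are made up for illustration. A write through the memoryview slice shows up in the original buffer, while the plain slice remains an independent copy.

```python
from struct import calcsize, pack_into, unpack_from

int_size = calcsize('I')

# A buffer of 4 integers: 0, 1, 2, 3.
buff = bytearray(int_size * 4)
pack_into('I' * 4, buff, 0, 0, 1, 2, 3)

# Slicing the bytearray copies the bytes of the 2nd and 3rd integers.
copied = buff[int_size:3 * int_size]

# Slicing a memoryview gives another memoryview over the same memory.
view = memoryview(buff)[int_size:3 * int_size]

# Write through the view. Offset 0 in the view is the 2nd integer in buff.
pack_into('I', view, 0, 99)

print('buff:  ', unpack_from('I' * 4, buff, 0))    # (0, 99, 2, 3)
print('copied:', unpack_from('I' * 2, copied, 0))  # (1, 2)
print('view:  ', unpack_from('I' * 2, view, 0))    # (99, 2)
```

One thing to keep in mind: while a memoryview of a bytearray exists, the bytearray cannot be resized. Call view.release() or let the view go out of scope when you are done with it.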