
Reading a binary file to find sequences of little-endian ints (permutations)

Tags:

python

I am trying to read a binary file (firmware) that contains sequences like

\x01\x00\x00\x00\x03\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00

i.e. the little-endian 32-bit integers 1, 3, 2, 4.

Attempt:

with open("firm.bin", 'rb') as f:
  s = f.read()


N = 16

allowed = set(range(4))
for val in allowed:
    val = bytes(val)+b'\x00\x00\x00'
for index, b in enumerate(s):
    print(b)
    i = b.hex()
    b= b'\x00\x00\x00'+bytes(bytes.fromhex(f'{i:x}'))
    if b in allowed and set(s[index:index + N]) == allowed:
        print(f'Found sequence {s[index:index + N]} at offset {index}')

The above does not work and fails with the error:

ValueError: Unknown format code 'x' for object of type 'str'

Why?

Problem I am trying to solve:

How can I find sequences like this in a binary file: 16 little-endian ints, each with a value from 0 to 15, i.e.

[0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15,0-15]

Update 1:

I tried the proposed answer, but it gives no results where it should:

import numpy as np
import sys


# Synthesize firmware with 100 integers, all equal to 1
#firmware = np.full(100, 1, dtype=np.uint32)
#firmware = np.fromfile('firm.ori', dtype='uint32')
a1D = np.array([1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0],dtype='uint32')
print(a1D)

r = np.convolve(a1D, [1]*16, mode='same')[8:-8]
np.set_printoptions(threshold=sys.maxsize)
print(r)
r = np.where(r < (16*15))
print(r)
print(a1D[r])

Ideally it should report offset 0, but printing the values would also be fine, i.e.

[ 1  2  3  4  6  5  7  8 10  9 11 13 12 14 15  0]

Now it outputs:

[ 1  2  3  4  6  5  7  8 10  9 11 13 12 14 15  0]
[]
(array([], dtype=int64),)
[]
asked Nov 03 '25 07:11 by dev


1 Answer

You refer to the values in the firmware as 32-bit integers, so I've assumed that the file can be converted to integers. I've used the Python struct module to do this.
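As a minimal sketch of that conversion, the sample bytes from the question can be decoded with struct.unpack, where '<' selects little-endian and 'L' an unsigned 32-bit integer. The same snippet also reproduces the ValueError from the question: the format code 'x' only applies to integers, so it fails when handed a str. The variable names here are illustrative only.

```python
import struct

# The 16 sample bytes from the question
sample = b'\x01\x00\x00\x00\x03\x00\x00\x00\x02\x00\x00\x00\x04\x00\x00\x00'

# '<' = little-endian, 'L' = unsigned 32-bit, 4 of them cover the 16 bytes
values = struct.unpack('<4L', sample)
print(values)  # (1, 3, 2, 4)

# Indexing (or iterating over) a bytes object yields plain ints,
# and an int accepts the 'x' format code directly
first = sample[0]
print(f'{first:02x}')  # 01

# Applying 'x' to a str raises the error quoted in the question
try:
    format('01', 'x')
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```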

I've also understood that you want to find a sequence of 16 unique integers in the range 0 to 15.
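Put differently, a window qualifies when it is a permutation of 0 to 15. A small sketch of that check, using a hypothetical helper name is_permutation that is not part of the test below:

```python
def is_permutation(window, n=16):
    """Return True if window holds each of 0..n-1 exactly once."""
    # Same length plus same set of values rules out duplicates and out-of-range values
    return len(window) == n and set(window) == set(range(n))


# The example sequence from the question's update
print(is_permutation([1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0]))  # True
# Right length, but not 16 distinct values
print(is_permutation([0] * 16))  # False
```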

My test below iterates over the integers in the firmware file, looking ahead each time and converting that window of 16 integers to a set to check that its length is still 16. It then checks that all values in the window are below 16.

Here is the test I ran:

from secrets import token_bytes
import struct

# Create test data
firmware_ints = 200_000
int_len = 4
data = token_bytes(firmware_ints * int_len)
to_find = struct.pack('<16L', *range(16))
print(f"To find [{len(to_find)}]: {to_find}\n")
hide_idx = 20 * int_len * -1  # hide the sequence 20 ints from the end
data = b''.join([data[:hide_idx], to_find, data[hide_idx:]])
# End of creating test data

search_max = 16
search_len = 16

# Convert firmware to integers
words = [x[0] for x in struct.iter_unpack('<L', data)]
# Iterate through to find sequence
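# +1 so the final window at the end of the data is also checked
for idx in range(len(words) - search_len + 1):
    this_seq = words[idx:idx + search_len]
    # 16 distinct values, all below 16, means a permutation of 0..15
    if len(set(this_seq)) == search_len:
        if all(x < search_max for x in this_seq):
            print(f'Found sequence {this_seq} at offset {idx}')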

which gave the output of:

To find [64]: b'\x00\x00\x00\x00\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00\r\x00\x00\x00\x0e\x00\x00\x00\x0f\x00\x00\x00'

Found sequence [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] at offset 199980
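Note that the offset reported above is counted in 32-bit words, and the test only looks at word-aligned positions. If the table in a real firmware image might start at any byte, a variation is to slide one byte at a time with struct.unpack_from. The following is a sketch under that assumption, with a hypothetical function name find_permutations; it is slower than the aligned scan because it checks four times as many positions.

```python
import struct


def find_permutations(data, n=16):
    """Yield (byte_offset, values) for each run of n little-endian unsigned
    32-bit words in data that is a permutation of 0..n-1."""
    wanted = set(range(n))
    fmt = f'<{n}L'
    span = struct.calcsize(fmt)  # n * 4 bytes
    # Step one byte at a time so unaligned tables are found as well
    for offset in range(len(data) - span + 1):
        values = struct.unpack_from(fmt, data, offset)
        # n values whose set equals {0..n-1} must each occur exactly once
        if set(values) == wanted:
            yield offset, list(values)


# Synthetic example: the sequence from the question's update, placed after
# 3 filler bytes so that it is not aligned to a 4-byte boundary
payload = struct.pack('<16L', 1, 2, 3, 4, 6, 5, 7, 8, 10, 9, 11, 13, 12, 14, 15, 0)
blob = b'\xff\xff\xff' + payload + b'\xff' * 5
hits = list(find_permutations(blob))
for offset, values in hits:
    print(f'Found sequence {values} at byte offset {offset}')
```

For an actual file, data would be the result of open("firm.bin", "rb").read(), as in the question.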
answered Nov 04 '25 21:11 by ukBaz