
How to capture all characters in a binary string without Python interpreting it

Here is how I reproduce the problem:

Create a log file called 'temp.log' and paste this line into it:

DEBUG: packetReceived '\x61\x62\x63'

I want a script that reads the line from the log file and decodes the binary string part ('\x61\x62\x63'). For the decoding I am using struct, so:

struct.unpack('BBB', '\x61\x62\x63')

should give me:

(97, 98, 99)

Here is the script I am using:

import re
import struct
import sys

# compile the pattern once, outside the loop
pattern = re.compile(r"packetReceived '(.*)'")

with open(sys.argv[1], 'r') as f:
    for line in f:
        print line
        packet = pattern.search(line).group(1)

        # packet is the string r'\x61\x62\x63' (12 characters)
        assert len(packet) == 12

        # this works ok (returns (97, 98, 99))
        struct.unpack('BBB', '\x61\x62\x63')

        # this fails because packet is the 12-character string
        # r'\x61\x62\x63' (i.e. '\\x61\\x62\\x63'), not the 3-byte string 'abc'
        struct.unpack('BBB', packet)

I run the script with temp.log as its argument.

Hopefully the comments highlight my problem. How can I get the variable packet to be interpreted as '\x61\x62\x63'?

ASIDE: In the first edit of this question, I assumed that reading the line from the file was the same as this: line = "DEBUG: packetReceived '\x61\x62\x63'", which made packet == 'abc'.

However, it is actually the same as this (using a raw string): line = r"DEBUG: packetReceived '\x61\x62\x63'"
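
The difference between the two literals can be checked directly. This is only a minimal sketch, written so that it also runs on Python 3; the names cooked and raw are made up for illustration.

```python
# An ordinary literal: Python turns each \xNN escape into one character.
cooked = '\x61\x62\x63'

# A raw literal: the backslashes are kept as-is, which matches what
# reading the text from a file produces.
raw = r'\x61\x62\x63'

print(cooked == 'abc')           # True
print(len(cooked))               # 3
print(len(raw))                  # 12
print(raw == '\\x61\\x62\\x63')  # True
```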

Philip Clarke asked Feb 01 '26 12:02


1 Answer

Python doesn't interpret strings that you pass to regular expressions. The escape sequences were most likely interpreted earlier, when you defined the variable line. For example, this works correctly:

line = r"DEBUG: packetReceived '\x61\x62\x63'"
print re.compile(r"packetReceived '(.*)'").search(line).group(1)

It prints \x61\x62\x63.
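
That explains where the backslashes come from. To actually turn the captured text into bytes that struct can unpack, one possible approach is to parse each hex pair and build the byte string by hand. The sketch below assumes the captured text contains only \xNN escapes; the helper name escaped_to_bytes is hypothetical and not part of any library.

```python
import re
import struct


def escaped_to_bytes(text):
    # Hypothetical helper: turn the literal text r'\x61\x62\x63' into the
    # bytes b'abc'. Only \xNN escapes are handled; anything else is ignored.
    hex_pairs = re.findall(r'\\x([0-9a-fA-F]{2})', text)
    return bytes(bytearray(int(pair, 16) for pair in hex_pairs))


line = r"DEBUG: packetReceived '\x61\x62\x63'"
packet = re.search(r"packetReceived '(.*)'", line).group(1)

data = escaped_to_bytes(packet)
values = struct.unpack('BBB', data)
print(values)  # (97, 98, 99)
```

On Python 2, packet.decode('string_escape') is a shorter route for the same input. On Python 3, codecs.decode(packet, 'unicode_escape').encode('latin-1') is a commonly used equivalent. Both also interpret escapes other than \xNN, which may or may not be what you want for log data.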