Reading logs using regular expression

Question

I have a .txt file containing request logs in the following format:

time_namelookup: 0,121668 
time_connect: 0,460643 
time_pretransfer: 0,460755 
time_redirect: 0,000000 
time_starttransfer: 0,811697 
time_total: 0,811813 
-------------
time_namelookup: 0,121665 
time_connect: 0,460643 
time_pretransfer: 0,460355 
time_redirect: 0,000000 
time_starttransfer: 0,813697 
time_total: 0,811853 
-------------
time_namelookup: 0,121558 
time_connect: 0,463243 
time_pretransfer: 0,460755 
time_redirect: 0,000000 
time_starttransfer: 0,911697 
time_total: 0,811413

I want to create a list of values for each category, so I thought a regular expression could be relevant in this case.

import re

'''
In this example, I save only the 'time_namelookup' parameter.
The same logic applies to the other parameters.
'''

namelookup = []
with open('shaghai_if_config_test.txt', 'r') as fh:
     for line in fh.readlines():
         number_match = re.match('([+-]?([0-9]*[,])?[0-9]+)',line)
         namelookup_match = re.match('^time_namelookup:', line)
         if namelookup_match and number_match:
             num = number_match.group(0)
             namelookup.append(num)
             continue

I find this logic quite complicated, as I have to execute two regex matches. Moreover, number_match never matches the line, while ^time_namelookup: ([+-]?([0-9]*[,])?[0-9]+) works fine.

I am looking for experienced advice on the described procedure. Any advice is appreciated.

Emma · Accepted Answer

My guess is that you have designed a fine expression. We could modify it slightly so that it matches every field and captures the field name and the value in separate groups:

(time_(?:namelookup|connect|pretransfer|redirect|starttransfer|total))\s*:\s*([+-]?(?:\d*,)?\d+)
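On the second point in the question: `re.match` only matches at the start of the string, so a bare number pattern cannot match a line that begins with `time_namelookup:`. Below is a minimal sketch of the single-pattern approach, with one capture group instead of two separate matches. The `lines` list is made-up sample data standing in for the file, and the variable names are only illustrative.

```python
import re

# Sample lines standing in for the log file (assumption: same format as in the question).
lines = [
    "time_namelookup: 0,121668 ",
    "time_connect: 0,460643 ",
    "-------------",
    "time_namelookup: 0,121665 ",
]

number = r"[+-]?(?:\d*,)?\d+"

# re.match anchors at position 0, where the line has "time_...", not a digit or sign.
assert re.match(number, lines[0]) is None

# One pattern: the label is matched literally and only the number is captured.
pattern = re.compile(r"time_namelookup:\s*(" + number + r")")

namelookup = []
for line in lines:
    m = pattern.match(line)
    if m:
        namelookup.append(m.group(1))

print(namelookup)  # ['0,121668', '0,121665']
```

`re.search` with the bare number pattern would also find the value, but anchoring on the label keeps the match tied to the right field.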

Test with `re.findall`:

import re

regex = r"(time_(?:namelookup|connect|pretransfer|redirect|starttransfer|total))\s*:\s*([+-]?(?:\d*,)?\d+)"

test_str = ("time_namelookup: 0,121668 \n"
    "time_connect: 0,460643 \n"
    "time_pretransfer: 0,460755 \n"
    "time_redirect: 0,000000 \n"
    "time_starttransfer: 0,811697 \n"
    "time_total: 0,811813 \n")

print(re.findall(regex, test_str))

Output

[('time_namelookup', '0,121668'), ('time_connect', '0,460643'), ('time_pretransfer', '0,460755'), ('time_redirect', '0,000000'), ('time_starttransfer', '0,811697'), ('time_total', '0,811813')]
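To get from these tuples to the per-category lists the question asks for, the pairs can be grouped by field name. The following is a sketch only: it assumes the comma is a decimal separator and converts each value to `float`, and `log_text` is shortened sample data rather than the real file.

```python
import re
from collections import defaultdict

regex = r"(time_(?:namelookup|connect|pretransfer|redirect|starttransfer|total))\s*:\s*([+-]?(?:\d*,)?\d+)"

# Two records in the question's format (sample data, shortened to two fields each).
log_text = (
    "time_namelookup: 0,121668 \n"
    "time_total: 0,811813 \n"
    "-------------\n"
    "time_namelookup: 0,121665 \n"
    "time_total: 0,811853 \n"
)

values = defaultdict(list)
for name, raw in re.findall(regex, log_text):
    # Assumption: "0,121668" means 0.121668, so swap the comma for a dot.
    values[name].append(float(raw.replace(",", ".")))

print(dict(values))
# {'time_namelookup': [0.121668, 0.121665], 'time_total': [0.811813, 0.811853]}
```

For a real file, `log_text` could be replaced with the result of `fh.read()` inside a `with open(...)` block.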

Test with `re.finditer`:

import re

regex = r"(time_(?:namelookup|connect|pretransfer|redirect|starttransfer|total))\s*:\s*([+-]?(?:\d*,)?\d+)"

test_str = ("time_namelookup: 0,121668 \n"
    "time_connect: 0,460643 \n"
    "time_pretransfer: 0,460755 \n"
    "time_redirect: 0,000000 \n"
    "time_starttransfer: 0,811697 \n"
    "time_total: 0,811813 \n"
    "-------------\n"
    "time_namelookup: 0,121665 \n"
    "time_connect: 0,460643 \n"
    "time_pretransfer: 0,460355 \n"
    "time_redirect: 0,000000 \n"
    "time_starttransfer: 0,813697 \n"
    "time_total: 0,811853 \n"
    "-------------\n"
    "time_namelookup: 0,121558 \n"
    "time_connect: 0,463243 \n"
    "time_pretransfer: 0,460755 \n"
    "time_redirect: 0,000000 \n"
    "time_starttransfer: 0,911697 \n"
    "time_total: 0,811413 ")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
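If the goal is one record per request rather than one flat stream of matches, the text can first be split on the dashed separator lines. This is a hedged sketch, assuming every separator is a line made only of dashes; `log_text` and `records` are illustrative names and the data is shortened.

```python
import re

regex = re.compile(
    r"(time_(?:namelookup|connect|pretransfer|redirect|starttransfer|total))"
    r"\s*:\s*([+-]?(?:\d*,)?\d+)"
)

# Two records in the question's format (sample data, shortened to two fields each).
log_text = (
    "time_namelookup: 0,121668 \n"
    "time_total: 0,811813 \n"
    "-------------\n"
    "time_namelookup: 0,121665 \n"
    "time_total: 0,811853 \n"
)

# Split wherever a whole line consists only of dashes (assumed separator format).
blocks = re.split(r"^-+\s*$", log_text, flags=re.MULTILINE)

# Each block becomes a dict of field name -> raw value string; empty blocks are dropped.
records = [dict(regex.findall(block)) for block in blocks]
records = [record for record in records if record]

for record in records:
    print(record)
```

Keeping the values as strings here leaves the decision about decimal conversion to the caller.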

The expression is explained on the top right panel of this demo if you wish to explore/simplify/modify it.

RegEx Circuit

jex.im visualizes regular expressions. [Image: diagram of the expression above]

Tags: python, regex, logging

Daniel Chepenko
