I am quite new to Python and I'm working on a task where I'm supposed to keep building on a regex, but I have hit a dead end.
For some reason, when I add the later parts, some of the regex breaks down and stops matching a few strings that were previously matched.
I am supposed to run the regex on a string that looks like this:
Sep 15 04:34:02 li146-252 sshd[12130]: Failed password for invalid user ronda from 212.58.111.170
The code:
#!/usr/bin/python
import re

with open('livehack.txt', 'r') as file:
    for line in file:
        dateString = re.findall('^(?:[A-z][a-z]{2}[ ][0-9]{1,2}[ ][\d]{2}[:][\d]{2}[:][\d]{2}) | li146-252 | ?:[0-9]{5} | Failed password for invalid', line)
        print dateString
The result of the code:
['Sep 17 06:40:28 ', ' Failed password for invalid']
As you can see, there are a few things that should be caught but are missing, and I have no idea why.
Thanks in advance.
Regular expressions are always difficult to read. Try an online regex tester: it will probably give you more information about what is wrong, and you can experiment with different inputs and expressions.
In your case, I think you have added some extra space characters to the regex that should not be there. A space also counts as a character that needs to match.
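To illustrate the point about spaces, here is a small sketch using two shortened alternatives and the sample line from the question. The variable names (line, spaced, tight) are only for illustration, and print() is written in Python 3 style rather than the Python 2 print statement used in the question.

```python
import re

# Sample log line from the question.
line = ("Sep 15 04:34:02 li146-252 sshd[12130]: "
        "Failed password for invalid user ronda from 212.58.111.170")

# The spaces around | are literal, so the two alternatives are
# '04:34:02 ' (trailing space) and ' li146-252 ' (space on both sides).
spaced = re.findall(r'04:34:02 | li146-252 ', line)

# The same alternatives with the stray spaces removed.
tight = re.findall(r'04:34:02|li146-252', line)

print(spaced)  # ['04:34:02 ']
print(tight)   # ['04:34:02', 'li146-252']
```

In the spaced version the first alternative consumes the space after the time, so the second alternative, which requires a leading space, has nothing left to match at the host name.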
I would also add parentheses around the expressions that are separated by |. It is sometimes hard to see which parts belong to each alternative once a | character is inserted.
Like this:
'(?:^(?:[A-z][a-z]{2}[ ][0-9]{1,2}[ ][\d]{2}[:][\d]{2}[:][\d]{2}))|(?:li146-252)|(?:[0-9]{5})|(?:Failed password for invalid)'
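As a quick check, the grouped pattern above can be run against the sample line from the question. This is only a sketch: the names line, pattern and matches are made up for illustration, the pattern is split across several raw strings purely for readability, and print() is Python 3 syntax.

```python
import re

# Sample log line from the question.
line = ("Sep 15 04:34:02 li146-252 sshd[12130]: "
        "Failed password for invalid user ronda from 212.58.111.170")

# The grouped pattern, one alternative per line, with no spaces around |.
pattern = re.compile(
    r'(?:^(?:[A-z][a-z]{2}[ ][0-9]{1,2}[ ][\d]{2}[:][\d]{2}[:][\d]{2}))'
    r'|(?:li146-252)'
    r'|(?:[0-9]{5})'
    r'|(?:Failed password for invalid)'
)

# With only non-capturing groups, findall returns the full text of each match.
matches = pattern.findall(line)
print(matches)
# ['Sep 15 04:34:02', 'li146-252', '12130', 'Failed password for invalid']
```

The IP address at the end is not picked up because it never contains five consecutive digits.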