
Matching Multiple Alternations with Regex

Regex will be the death of me. I am parsing logs from an enterprise password manager. This is what a handful of the logs look like:

date_time=2017-01-27 23:17:39 user=John Doe (86) ip_address=10.10.44.131 origin=web action=export password=CSDEV - SQL Account #20 (496) project=Applications (2)
date_time=2017-01-30 18:21:49 user=John Doe (86) ip_address=10.10.44.131 origin=web action=view_passwords_list additional=Active Passwords
date_time=2017-01-27 23:29:06 user=John Doe (86) ip_address=10.10.44.131 origin=web action=add_password password=Non-ACS Devices (1099) project=Infrastructure & Operations (31) additional=Import

Every single line in the log starts with five tags: date_time, user, ip_address, origin, and action. Afterwards, though, there can be up to three additional tags: "password", "project", and "additional".

These extra tags are what are doing me in. I need to be able to capture all that are available. Right now I have:

date_time=(.+) user=(.+) ip_address=(.+) origin=(.+) action=(.+) (password=(.+)|project=(.+)|additional=(.+))+

Based on regex101 this is close but doesn't quite get there.

https://regex101.com/r/eA2eE1/4

My guess is the final leap has to do with greedy vs lazy but I've hit the ends of my regex knowledge for the moment.
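
For anyone reproducing this without the regex101 link, here is a minimal sketch in Python (an assumption, since the logs are not tied to any language) that runs the pattern above against the first sample line. The variable names are illustrative only.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(
    r"date_time=(.+) user=(.+) ip_address=(.+) origin=(.+) action=(.+) "
    r"(password=(.+)|project=(.+)|additional=(.+))+"
)

# First sample log line from the question.
line = (
    "date_time=2017-01-27 23:17:39 user=John Doe (86) "
    "ip_address=10.10.44.131 origin=web action=export "
    "password=CSDEV - SQL Account #20 (496) project=Applications (2)"
)

m = ORIGINAL.search(line)
action = m.group(5)    # greedy .+ runs past the password tag
password = m.group(7)  # the password= alternative never participates
project = m.group(8)

print(repr(action))    # 'export password=CSDEV - SQL Account #20 (496)'
print(repr(password))  # None
print(repr(project))   # 'Applications (2)'
```

The greedy `(.+)` after `action=` backtracks only as far as the last place where one of the three alternatives can start, so the password tag ends up inside the action capture.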

Thanks for any help you can provide!

Tchotchke asked Jan 31 '26 14:01


1 Answer

You may use

^date_time=([\d-]+ [\d:]+) user=(.+?) ip_address=([\d.]+) origin=(.+?) action=(.+?)(?: password=((?:(?!\w+=).)*))?(?: project=((?:(?!\w+=).)*))?(?: additional=(.+?))?$

See the regex demo.
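
As a rough sketch of how the pattern could be applied, assuming Python and its `re` module (the question does not name a language), the snippet below parses the three sample lines into dictionaries. `LOG_LINE`, `FIELDS`, and `parse_line` are hypothetical names chosen for illustration.

```python
import re

# The pattern from this answer, split across lines for readability only.
LOG_LINE = re.compile(
    r"^date_time=([\d-]+ [\d:]+) user=(.+?) ip_address=([\d.]+) "
    r"origin=(.+?) action=(.+?)"
    r"(?: password=((?:(?!\w+=).)*))?"
    r"(?: project=((?:(?!\w+=).)*))?"
    r"(?: additional=(.+?))?$"
)

# Group order in the pattern: five mandatory tags, then three optional ones.
FIELDS = ("date_time", "user", "ip_address", "origin", "action",
          "password", "project", "additional")

def parse_line(line):
    """Return a dict of tag -> value (None for absent tags), or None if no match."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return dict(zip(FIELDS, m.groups()))

SAMPLES = [
    "date_time=2017-01-27 23:17:39 user=John Doe (86) ip_address=10.10.44.131 "
    "origin=web action=export password=CSDEV - SQL Account #20 (496) "
    "project=Applications (2)",
    "date_time=2017-01-30 18:21:49 user=John Doe (86) ip_address=10.10.44.131 "
    "origin=web action=view_passwords_list additional=Active Passwords",
    "date_time=2017-01-27 23:29:06 user=John Doe (86) ip_address=10.10.44.131 "
    "origin=web action=add_password password=Non-ACS Devices (1099) "
    "project=Infrastructure & Operations (31) additional=Import",
]

parsed = [parse_line(s) for s in SAMPLES]
for record in parsed:
    print(record)
```

Tags that are missing from a line come back as `None`, so the caller can tell "absent" apart from "empty".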

Details:

  • ^ - start of string
  • date_time= - literal char sequence
  • ([\d-]+ [\d:]+) - Group 1: one or more digits or -, space, and 1+ digits or :
  • user= - literal char sequence
  • (.+?) - Group 2: any 1+ chars as few as possible
  • ip_address= - literal char sequence
  • ([\d.]+) - Group 3: one or more digits or .
  • origin= - literal char sequence
  • (.+?) - Group 4: any 1+ chars as few as possible
  • action= - literal char sequence
  • (.+?) - Group 5: any 1+ chars as few as possible
  • (?: password=((?:(?!\w+=).)*))? - an optional non-capturing group matching a sequence of:
    • a space and password= - literal char sequence
    • ((?:(?!\w+=).)*) - Group 6: a tempered greedy token matching 0 or more chars, each of which does not start a sequence of 1+ word chars followed by =, so the match stops before the next tag
  • (?: project=((?:(?!\w+=).)*))? - same as above for the project tag (Group 7)
  • (?: additional=(.+?))? - same as above for the additional tag (Group 8), except that the tempered greedy token is replaced with .+? to match any 1+ chars, as few as possible; because $ follows, this runs to the end of the string
  • $ - end of string.
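
To see the tempered greedy token in isolation, here is a small Python sketch (Python and the short test string are assumptions made for illustration) comparing it with a plain greedy `.+`:

```python
import re

text = "password=A B project=C"

# Plain greedy .+ runs to the end of the string and swallows the next tag.
greedy = re.search(r"password=(.+)", text).group(1)

# The tempered token stops right before "project=", but on its own
# it keeps the space that precedes the tag.
tempered = re.search(r"password=((?:(?!\w+=).)*)", text).group(1)

# Followed by " project=", the token backtracks one char and gives up the space.
both = re.search(r"password=((?:(?!\w+=).)*) project=(.+)", text).groups()

print(repr(greedy))    # 'A B project=C'
print(repr(tempered))  # 'A B '
print(both)            # ('A B', 'C')
```

One consequence of this design: a value that itself contains word chars followed by = would be cut short at that point, which is fine for the sample logs but worth keeping in mind for other data.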

Wiktor Stribiżew answered Feb 03 '26 05:02