
RegEx pattern to parse working hours string

I am writing a Python library that parses various working-hours strings and produces a standard format for the hours. I am stuck on the following case:

For Mon - Fri 7am - 5pm Sat 9am - 3pm my regex should return the groups ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm'], but if there is a comma between the first and second group it should return [].

A comma may appear anywhere else in the string; it just must not sit between two weekday-and-duration groups. E.g. Mon - Fri 7am - 5pm Sat 9am - 3pm and available upon email, phone call should return ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm'].

This is what I have tried:

import re
pattern = r"""(
    (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs) # Start weekday
    \s*[-|to]+\s* # Separator
    (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?  # End weekday
    \s*[from]*\s* # Separator
    (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Start hour
    \s*[-|to]+\s* # Separator
    (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Close hour
)"""

regEx = re.compile(pattern, re.IGNORECASE|re.VERBOSE)

print(re.findall(regEx, "Mon - Fri 7am - 5pm Sat 9am - 3pm"))
# output ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm Sat - Sun 9am - 3pm"))
# output ['Mon - Fri 7am - 5pm ', 'Sat - Sun 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm, Sat 9am - 3pm"))
# expected output []
# but I get ['Mon - Fri 7am - 5pm,', 'Sat 9am - 3pm']
print(re.findall(regEx, "Mon - Fri 7am - 5pm , Sat 9am - 3pm"))
# expected output []
# but I get ['Mon - Fri 7am - 5pm ', 'Sat 9am - 3pm']

I also tried adding a negative lookahead to my regex:

pattern = r"""(
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs)
\s*[-|to]+\s*
(?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?
\s*[from]*\s*
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
\s*[-|to]+\s*
(?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?)
(?![^,])
)"""

But I didn't get the expected result. Should I write explicit code to check this condition, or is there a way to get it just by changing my regex?

Another approach I would like to try is to insert a comma between two weekday-duration groups when one doesn't already exist, and then change my regex to split/group by comma: "Mon - Fri 7am - 5pm Sat 9am - 3pm" => "Mon - Fri 7am - 5pm, Sat 9am - 3pm"

Garfield asked Jun 07 '26 10:06


1 Answer

I think you can do it simply by anchoring the pattern so that it must match the whole string; a comma (or any other stray character) between the groups is then not allowed:

pattern = r"""^(
(
    (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|m|w|f|thurs) # Start weekday
    \s*[-|to]+\s* # Separator
    (?:mon|tue|wed|thu|fri|sat|sun|mo|tu|we|th|fr|sa|su|^(?![ap])m|w|f|thurs)?  # End weekday
    \s*[from]*\s* # Separator
    (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Start hour
    \s*[-|to]+\s* # Separator
    (?:\d{1,2}(?:[:]\d{1,2})?)\s*(?:[ap][.]?m.?) # Close hour
)
)+$"""

For the four test strings from the question, this will output:

[('Sat 9am - 3pm', 'Sat 9am - 3pm')]
[('Sat - Sun 9am - 3pm', 'Sat - Sun 9am - 3pm')]
[]
[]
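
Two caveats with the anchored pattern: a repeated capture group only keeps its last repetition, which is why only the Sat part shows up in the output, and `$` also rejects harmless trailing text such as "and available upon email, phone call". If explicit checking is acceptable, here is a minimal sketch (Python 3) that finds each chunk separately and returns [] when a comma sits between two adjacent chunks. The simplified DAY/TIME/SEP pieces and the parse_hours name are my own assumptions for illustration, not the original pattern, and they only cover three-letter day names plus "thurs":

```python
import re

# Simplified building blocks (assumptions, not the asker's full pattern).
DAY = r"(?:mon|tue|wed|thu(?:rs)?|fri|sat|sun)"
TIME = r"\d{1,2}(?::\d{2})?\s*[ap]\.?m\.?"
SEP = r"\s*(?:-|to)\s*"

# One "weekday(s) + hours" chunk, e.g. "Mon - Fri 7am - 5pm" or "Sat 9am - 3pm".
CHUNK = re.compile(
    DAY + "(?:" + SEP + DAY + ")?" + r"\s*(?:from\s*)?" + TIME + SEP + TIME,
    re.IGNORECASE,
)


def parse_hours(text):
    """Return all chunks, or [] if a comma separates two adjacent chunks."""
    matches = list(CHUNK.finditer(text))
    # Inspect only the text between consecutive chunks; commas elsewhere are fine.
    for prev, nxt in zip(matches, matches[1:]):
        if "," in text[prev.end():nxt.start()]:
            return []
    return [m.group(0) for m in matches]


print(parse_hours("Mon - Fri 7am - 5pm Sat 9am - 3pm"))
# ['Mon - Fri 7am - 5pm', 'Sat 9am - 3pm']
print(parse_hours("Mon - Fri 7am - 5pm, Sat 9am - 3pm"))
# []
print(parse_hours("Mon - Fri 7am - 5pm Sat 9am - 3pm and available upon email, phone call"))
# ['Mon - Fri 7am - 5pm', 'Sat 9am - 3pm']
```

Unlike the original pattern, the chunks here come back without trailing whitespace, because the time part no longer ends in `.?`.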

Hope it helps.

Y__ answered Jun 09 '26 00:06