
Ignore later patterns in a Python regex if previous patterns are not found

How do I match a series of groups so that a later group can match only if all previous groups matched?

Example:

Match any leading part of "abc" (a prefix), with each character in its own group.

Valid: a, ab, abc
Invalid: b, c, bc, ac

The following works, but I'm not sure if there's a better way:

^(a)?(?(1)(b)?)(?(2)(c)?)$

That says that each group is optional, but depends on the group before it having matched. That is, 'c' cannot match unless 'b' matches, which in turn cannot happen unless 'a' matches.
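As a quick check of the behaviour described above, here is a minimal sketch that runs the conditional pattern against the valid and invalid examples from the question. The variable names are only illustrative.

```python
import re

# Conditional pattern from the question: (?(N)...) only tries its body
# when group N took part in the match.
pattern = re.compile(r'^(a)?(?(1)(b)?)(?(2)(c)?)$')

valid = ['a', 'ab', 'abc']
invalid = ['b', 'c', 'bc', 'ac']

for s in valid + invalid:
    m = pattern.match(s)
    # Print the captured groups, or None when the string is rejected.
    print(repr(s), m.groups() if m else None)
```

Because the first group is itself optional, this pattern also accepts the empty string, which may or may not be what you want.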

To help anyone googling this later: I'm parsing a DICOM DateTime (DT) value, which has the following format.

 YYYYMMDDHHMMSS.FFFFFF&ZZXX     # &ZZXX is an optional timezone offset

I built the regex in code, piece by piece, rather than typing it out as a single literal:

import re

# Each field after the year is optional, but is only attempted
# when the field before it has matched.
dicom_dt_parser = re.compile(
    r'^' +
    r'(?P<year>\d{4})' +
    r'(?(year)(?P<month>\d{2})?)' +
    r'(?(month)(?P<day>\d{2})?)' +
    r'(?(day)(?P<hour>\d{2})?)' +
    r'(?(hour)(?P<min>\d{2})?)' +
    r'(?(min)(?P<sec>\d{2})?)' +
    r'(?(sec)(?P<frac>\.\d{1,6})?)' +
    r'(?P<tz>[\+\-]\d{4})?' +
    r'$'
)

dicom_dt_parser.match(datetime_string).groupdict() will return a dictionary with all of the fields. Missing fields will have values of None.
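For illustration, a small sketch of that call on a truncated value. The sample string is made up, and the pattern is repeated here only so the snippet runs on its own.

```python
import re

# Same pattern as in the question, repeated so this snippet is self-contained.
dicom_dt_parser = re.compile(
    r'^'
    r'(?P<year>\d{4})'
    r'(?(year)(?P<month>\d{2})?)'
    r'(?(month)(?P<day>\d{2})?)'
    r'(?(day)(?P<hour>\d{2})?)'
    r'(?(hour)(?P<min>\d{2})?)'
    r'(?(min)(?P<sec>\d{2})?)'
    r'(?(sec)(?P<frac>\.\d{1,6})?)'
    r'(?P<tz>[\+\-]\d{4})?'
    r'$'
)

# Made-up sample: date plus hour, no minutes or seconds, with an offset.
match = dicom_dt_parser.match('2014123008-0500')
fields = match.groupdict() if match else None
print(fields)

# A value that does not fit the format gives no match at all, so check
# for None before calling groupdict().
print(dicom_dt_parser.match('20141'))
```

Here `min`, `sec` and `frac` come back as None, while `tz` is still captured because the offset group does not depend on the earlier fields.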

Harvey asked Dec 06 '25 05:12



1 Answer

What you are doing is perfectly fine, readable and straightforward. A shorter way to build the same kind of regex is to use nested optional groups:

^a(b(c)?)?$

If you also need to accept empty input strings, append |^$ to the above regex.
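A minimal sketch of the nested pattern and its empty-string variant, checked against a few inputs. The variable names are only illustrative.

```python
import re

nested = re.compile(r'^a(b(c)?)?$')
nested_or_empty = re.compile(r'^a(b(c)?)?$|^$')

for s in ['a', 'ab', 'abc', '', 'b', 'ac']:
    # Show whether each variant accepts the string.
    print(repr(s), bool(nested.match(s)), bool(nested_or_empty.match(s)))

# The capturing groups are nested, so group 1 spans everything after 'a'.
print(nested.match('abc').groups())
```

Note that with plain nested groups the captures overlap: for 'abc', group 1 holds 'bc' rather than 'b'. Wrapping the nesting in non-capturing groups, as in `^(a)(?:(b)(c)?)?$`, keeps one character per group.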

The regex for "abcdef" would be:

^a(b(c(d(e(f)?)?)?)?)?$
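Since the nesting gets hard to type by hand as the chain grows, here is a sketch of generating it in code. `nest_optional` is a hypothetical helper written for this example, not part of the `re` module; it uses non-capturing groups for the nesting so each sub-pattern keeps its own capture.

```python
import re

def nest_optional(parts):
    """Hypothetical helper: chain sub-patterns so that the first is required
    and each later one is optional but only reachable after the earlier ones."""
    tail = ''
    # Build from the innermost (last) part outwards.
    for part in reversed(parts[1:]):
        tail = '(?:' + part + tail + ')?'
    return '^' + parts[0] + tail + '$'

abcdef = nest_optional(['(a)', '(b)', '(c)', '(d)', '(e)', '(f)'])
print(abcdef)

rx = re.compile(abcdef)
for s in ['a', 'abcd', 'abcdef', 'acd', 'bcdef']:
    m = rx.match(s)
    print(repr(s), m.groups() if m else None)
```

The same helper could take the named date and time sub-patterns from the question in place of the single letters.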

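Your regex rewritten with this approach, using non-capturing groups for the nesting so that the named groups do not overlap. The timezone group stays outside the nesting so that, as in your original, an offset can follow a value truncated at any field:

^(?P<year>\d{4})(?:(?P<month>\d{2})(?:(?P<day>\d{2})(?:(?P<hour>\d{2})(?:(?P<min>\d{2})(?:(?P<sec>\d{2})(?P<frac>\.\d{1,6})?)?)?)?)?)?(?P<tz>[+-]\d{4})?$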

Your own regex:

^(?P<year>\d{4})(?(year)(?P<month>\d{2})?)(?(month)(?P<day>\d{2})?)(?(day)(?P<hour>\d{2})?)(?(hour)(?P<min>\d{2})?)(?(min)(?P<sec>\d{2})?)(?(sec)(?P<frac>\.\d{1,6})?)(?P<tz>[\+\-]\d{4})?$
revo answered Dec 08 '25 21:12



