When you're iterating over hundreds of lines in a file, what is the most (and least) efficient way to run regular expressions in Python?
Specifically, is the following bad form?
for line in file:
    data = re.search(r'(\d+\.\d+)\|(-\d+\.\d+)\|(.*?)\|(.*?)\|(\d+:\d+\s+\w+)\sTO\s(.*?)', line)
    one = data.group(1)
    two = data.group(2)
    three = data.group(3)
    four = data.group(4)
    five = data.group(5)
    six = data.group(6)
    # do the magic...
If you're just using the same regex over and over again, you don't need to compile it explicitly; see http://docs.python.org/release/2.6.5/library/re.html#re.compile:
The compiled versions of the most recent patterns passed to re.match(), re.search() or re.compile() are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.
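A rough sketch of what that means in practice (the sample line below is invented for illustration): both forms end up using the same compiled pattern, and the only overhead of the module-level function is the lookup in re's internal cache.

import re
import timeit

line = '12.34|-56.78|foo|bar|10:30 AM TO somewhere'   # made-up sample line
pattern = r'(\d+\.\d+)\|(-\d+\.\d+)\|(.*?)\|(.*?)\|(\d+:\d+\s+\w+)\sTO\s(.*?)'

# Module-level function: the pattern string is looked up in re's internal cache on every call.
t_module = timeit.timeit(lambda: re.search(pattern, line), number=100000)

# Pre-compiled pattern object: skips that cache lookup entirely.
rx = re.compile(pattern)
t_compiled = timeit.timeit(lambda: rx.search(line), number=100000)

print(t_module, t_compiled)   # the pre-compiled version is usually a bit faster, but both are fine here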
However, I would very much recommend against the six separate assignments shown above. Try something like this instead:
for line in file:
    data = re.search(r'(\d+\.\d+)\|(-\d+\.\d+)\|(.*?)\|(.*?)\|(\d+:\d+\s+\w+)\sTO\s(.*?)', line)
    groups = data.groups()
    # do the magic...
MatchObject.groups() returns a tuple of all the subgroups of the match; groups that did not participate in the match are assigned the value passed to groups(), which defaults to None.
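A minimal illustration of that behaviour (the pattern here is invented for the example):

import re

m = re.search(r'(\d+)\|(\d+)?', '42|')
print(m.groups())     # ('42', None) -- the optional second group did not participate
print(m.groups(''))   # ('42', '')   -- missing groups take the default passed to groups()

# With the regex from the question, all six values can be unpacked in one step:
# one, two, three, four, five, six = data.groups()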
Compile the regex once, before the loop.
rx = re.compile(r'(\d+\.\d+)\|(-\d+\.\d+)\|(.*?)\|(.*?)\|(\d+:\d+\s+\w+)\sTO\s(.*?)')
for line in file:
    data = rx.search(line)
    one = data.group(1)
    two = data.group(2)
    three = data.group(3)
    four = data.group(4)
    five = data.group(5)
    six = data.group(6)
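Putting it together, a minimal sketch (the filename is a placeholder, and lines the pattern does not match are skipped rather than letting data.group() raise AttributeError on a None result):

import re

rx = re.compile(r'(\d+\.\d+)\|(-\d+\.\d+)\|(.*?)\|(.*?)\|(\d+:\d+\s+\w+)\sTO\s(.*?)')

with open('input.txt') as file:   # placeholder filename
    for line in file:
        data = rx.search(line)
        if data is None:          # the pattern did not match this line; skip it
            continue
        one, two, three, four, five, six = data.groups()
        # do the magic...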