 

Regex Performance Optimization Tips and Tricks [closed]

After reading a pretty good article on regex optimization in Java, I was wondering: what are some other good tips for creating fast and efficient regular expressions?


2 Answers

  1. Use a non-capturing group (?:pattern) when you need to repeat a grouping but don't need the captured value that a traditional (capturing) group would store.
  2. Use an atomic group (?>pattern), also called a non-backtracking subexpression, where applicable.
  3. Avoid catastrophic backtracking like the plague by designing your regular expressions to fail early on non-matches.
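To make the first two tips concrete, here is a minimal sketch in Python (not the flavor the answer targets, so treat it as an illustration only). The sample text and patterns are made up for the example. Python added native (?>...) support in 3.11; to avoid depending on that, the atomic group is emulated here with a lookahead plus a backreference, which is a substitute technique and not the native syntax.

```python
import re

text = "host 192.168.0.1 up"

# Capturing group: repeated three times, and the last repetition is stored.
capturing = re.search(r"(\d{1,3}\.){3}\d{1,3}", text)
# Non-capturing group: same repetition, nothing is stored.
non_capturing = re.search(r"(?:\d{1,3}\.){3}\d{1,3}", text)

print(capturing.group(0), capturing.groups())          # 192.168.0.1 ('0.',)
print(non_capturing.group(0), non_capturing.groups())  # 192.168.0.1 ()

# Atomic-group behaviour, emulated (assumption: stand-in for a(?>bc|b)c).
# The lookahead picks "bc" once and the engine cannot go back and try "b".
plain = re.search(r"a(?:bc|b)c", "abc")             # backtracks to "b", then matches
atomic_like = re.search(r"a(?=(bc|b))\1c", "abc")   # keeps "bc", so the final "c" fails

print(plain is not None, atomic_like is not None)   # True False
```

The last line shows the trade-off: an atomic group gives up alternatives on purpose, so it can also reject input that the plain pattern would accept.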

I created a video demonstrating these techniques. I started with the very poorly written regular expression from the catastrophic backtracking article, (x+x+)+y, and then made it 3 million times faster through a series of optimizations, benchmarking after every change. The video is specific to .NET, but many of these techniques apply to most other regex flavors as well:

.NET Regex Lesson: #5: Optimization
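For readers who want to see the problem without watching the video, here is a rough Python sketch. It is not the series of optimizations from the video; it only compares (x+x+)+y with xx+y, which accepts exactly the same strings (two or more x's followed by y) but has only one way to split a run of x's. The input length of 18 and the helper name `time_match` are arbitrary choices for the example, and the timings will vary by machine.

```python
import re
import time

slow = re.compile(r"(x+x+)+y")  # nested quantifiers: many ways to split the x's
fast = re.compile(r"xx+y")      # same set of matches, a single way to split

def time_match(pattern, text):
    """Return (match object or None, elapsed seconds) for an anchored match."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start

matching = "x" * 18 + "y"
non_matching = "x" * 18  # no trailing y: the slow pattern must try every split

for name, pattern in (("slow", slow), ("fast", fast)):
    for label, text in (("match", matching), ("non-match", non_matching)):
        result, elapsed = time_match(pattern, text)
        print(f"{name:4} {label:9} matched={result is not None} {elapsed:.6f}s")
```

The gap shows up on the non-matching input, and it grows quickly as you add x's, so increase the length with care.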

Steve Wortham answered Sep 10 '25 09:09



Use the dot (any-character) operator sparingly. If you can express the match any other way, such as with a specific or negated character class, do it; the dot will always bite you...
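A small Python sketch of what "any other way" can look like, assuming the goal is to pull double-quoted strings out of a line (the sample text is made up). A negated character class says exactly what may appear between the quotes, so the match cannot run past the closing quote.

```python
import re

text = 'say "hi" and "bye"'

# The dot happily matches quote characters too, so it overshoots.
dot_greedy = re.search(r'".*"', text).group(0)
# A negated class stops at the first closing quote.
negated = re.findall(r'"[^"]*"', text)

print(dot_greedy)  # "hi" and "bye"
print(negated)     # ['"hi"', '"bye"']
```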

I'm not sure whether PCRE is an NFA engine, and PCRE is the only flavor I'm familiar with, but + and * are usually greedy by default: they match as much as possible. To turn this around, use +? and *? to match as little as possible. Bear both of these points in mind while writing your regexp.
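The greedy/lazy difference is easy to see in a quick sketch. This uses Python's re module, not PCRE, but the +, *, +? and *? quantifiers behave the same way in this respect; the sample strings are made up.

```python
import re

html = "<b>bold</b>"

greedy = re.search(r"<.+>", html).group(0)  # grabs up to the last ">"
lazy = re.search(r"<.+?>", html).group(0)   # stops at the first ">"

print(greedy)  # <b>bold</b>
print(lazy)    # <b>

digits_greedy = re.search(r"\d+", "12345").group(0)
digits_lazy = re.search(r"\d+?", "12345").group(0)

print(digits_greedy)  # 12345
print(digits_lazy)    # 1
```

Lazy quantifiers change what is matched, not necessarily how fast it is matched, so it is still worth benchmarking both forms on your own input.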

Question Mark answered Sep 10 '25 09:09
