After reading a pretty good article on regex optimization in Java, I was wondering: what are some other good tips for creating fast and efficient regular expressions?
Use a non-capturing group, (?:pattern), when you need to repeat a grouping but don't need to use the captured value that comes from a traditional (capturing) group.
Use an atomic group, (?>pattern), where it applies.
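To illustrate the capturing versus non-capturing distinction, here is a minimal Python sketch. The sample date string and the variable names are my own assumptions, not something from the answer above. It only shows that the non-capturing form matches the same text while storing no groups; how much time that saves depends on the engine.

```python
import re

text = "2024-01-15"  # hypothetical sample input

# Capturing groups: every (...) stores the substring it matched.
capturing = re.compile(r"(\d+)(-\d+)+")

# Non-capturing group: (?:...) only groups for repetition, nothing is stored.
non_capturing = re.compile(r"\d+(?:-\d+)+")

m_cap = capturing.fullmatch(text)
m_non = non_capturing.fullmatch(text)

print(m_cap.group(), m_cap.groups())  # same overall match, two stored groups
print(m_non.group(), m_non.groups())  # same overall match, no stored groups
```

Both patterns accept exactly the same strings; the second just skips the bookkeeping for groups you never read.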
I created a video demonstrating these techniques. I started with the very poorly written regular expression from the catastrophic backtracking article, (x+x+)+y, and then made it 3 million times faster through a series of optimizations, benchmarking after every change. The video is specific to .NET, but many of these things apply to most other regex flavors as well:
.NET Regex Lesson: #5: Optimization
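As a rough sketch of why (x+x+)+y is so slow, and not necessarily the steps taken in the video, here is a Python comparison against xx+y, which accepts the same strings (two or more x followed by y) without the nested quantifiers. The helper name timed_search and the input length of 18 are assumptions chosen so the slow case still finishes quickly; timings will vary by machine and engine.

```python
import re
import time

# Nested quantifiers: on a failing input the engine tries an exponential
# number of ways to split the run of x's between the inner x+ parts.
slow = re.compile(r"(x+x+)+y")

# Same language (two or more x, then y) with no nested repetition.
fast = re.compile(r"xx+y")

def timed_search(pattern, text):
    """Return (match, elapsed_seconds) for a single search. Hypothetical helper."""
    start = time.perf_counter()
    match = pattern.search(text)
    return match, time.perf_counter() - start

no_match_input = "x" * 18  # no trailing 'y', so both patterns must fail

slow_match, slow_time = timed_search(slow, no_match_input)
fast_match, fast_time = timed_search(fast, no_match_input)

print(f"nested: {slow_time:.6f}s  flat: {fast_time:.6f}s")
```

Adding a few more x characters to the input makes the gap grow very quickly for the nested version, which is the point of benchmarking after every change.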
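Use the any-character (dot) operator sparingly. If you can express the match any other way, such as with a specific character class, do it; the dot matches far more than you usually intend and will come back to bite you.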
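A small Python sketch of one way to avoid the dot: a negated character class. The sample line and variable names are made up for illustration.

```python
import re

line = 'name="alice" role="admin"'  # hypothetical sample input

# Dot version: .* runs to the end of the line and backtracks to the LAST quote,
# so the two quoted values are swallowed as one.
dot_matches = re.findall(r'"(.*)"', line)

# Negated class: [^"]* cannot cross a quote, so each value is matched
# on its own and the engine has nothing to backtrack over.
negated_matches = re.findall(r'"([^"]*)"', line)

print(dot_matches)
print(negated_matches)
```

The negated class says exactly what is allowed between the quotes, which tends to be both more correct and cheaper than letting the dot overshoot and backtrack.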
I'm not sure whether PCRE is an NFA engine, and I'm only familiar with PCRE, but + and * are usually greedy by default: they will match as much as possible. To turn this around, use +? and *? to match as little as possible. Bear both behaviors in mind while writing your regexp.
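The greedy versus lazy difference is easy to see in a short Python sketch (Python's re module uses the same +? and *? syntax; the sample string is an assumption):

```python
import re

html = "<b>bold</b> and <i>italic</i>"  # hypothetical sample input

# Greedy: .+ grabs as much as possible, then backs off to the last '>'.
greedy = re.search(r"<.+>", html).group()

# Lazy: .+? grabs as little as possible, stopping at the first '>'.
lazy = re.search(r"<.+?>", html).group()

# With the lazy quantifier each tag is found separately.
tags = re.findall(r"<.+?>", html)

print(greedy)
print(lazy)
print(tags)
```

Neither form is automatically faster: the greedy one overshoots and backtracks, while the lazy one checks for the closing character after every step, so it is worth measuring on your own input.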