Is there a way to back reference in the regular expression pattern?
Example input string:
Here is "some quoted" text.
To pull out the quoted text, I could use the following expression:
"([^"]+)"
This regular expression would match "some quoted" and capture some quoted in group 1.
To also support single quotes, I could change the expression to:
["']([^"']+)["']
But what if the input string has a mixture of quotes, say: Here is 'some quoted" text. I would not want the regex to match, but the regex in the second example still does.
What I would like is this: if the opening quote is a double quote, the closing quote must also be a double quote, and if the opening quote is a single quote, the closing quote must also be a single quote.
Can I use a back reference to achieve this?
My other related question: Getting text between quotes using regular expression
Back-references are regular expression constructs that refer to a previous part of the matched regular expression. A back-reference is written as a backslash followed by a single digit (e.g. '\1'). The part of the regular expression it refers to is called a subexpression, and is delimited with parentheses.
\\ : inserts a literal backslash into the pattern (write it as a double backslash)
( : opens a "subexpression"
) : terminates the subexpression
* : zero or more of the character or expression to its left
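As a rough illustration of the syntax described above, here is a minimal Python sketch. Python is used only for illustration, and the names DOUBLED_WORD, LITERAL_BACKSLASH and find_doubled_word are made up for this example. It shows a parenthesised subexpression, a \1 back-reference to it, and a doubled backslash matching a literal backslash.

```python
import re

# (\w+) is subexpression 1; \1 must then match the exact same text again.
DOUBLED_WORD = re.compile(r"\b(\w+) \1\b")

# A doubled backslash in the pattern matches one literal backslash.
LITERAL_BACKSLASH = re.compile(r"\\")


def find_doubled_word(text):
    """Return the first word that is immediately repeated, or None."""
    m = DOUBLED_WORD.search(text)
    return m.group(1) if m else None


if __name__ == "__main__":
    print(find_doubled_word("this is is a test"))
    print(bool(LITERAL_BACKSLASH.search("C:\\temp")))
```

The raw string prefix (r"...") keeps Python itself from interpreting the backslashes, so the regex engine sees \1 and \\ exactly as written.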
Backtracking occurs when a regular expression pattern contains optional quantifiers or alternation constructs, and the regular expression engine returns to a previous saved state to continue its search for a match.
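To make backtracking a little more concrete, here is a small sketch using Python's re module, which is a backtracking engine. The patterns and variable names are illustrative only and do not come from the answers here.

```python
import re

# Optional quantifier: 'a?' first consumes the 'a'. The literal 'a' that
# follows then fails against 'b', so the engine backtracks and lets 'a?'
# match nothing, after which 'ab' matches.
optional = re.fullmatch(r"a?ab", "ab")

# Greedy quantifier: '.*' first runs to the end of the string, then gives
# characters back one at a time until the closing '"' can match.
greedy = re.search(r'".*"', 'say "hi" and "bye" now')

if __name__ == "__main__":
    print(optional is not None)
    print(greedy.group(0))
```

The second pattern ends up spanning from the first quote to the last one, which is one reason the patterns in this thread use a negated character class rather than .* between the quotes.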
You can make use of the regex:
(["'])[^"']+\1
() : used for grouping
[..] : a char class, so ["'] matches either " or ', equivalent to "|'
[^..] : a char class with negation; it matches any char not listed after the ^
+ : quantifier for one or more
\1 : backreference to the first group, which is (["'])
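The behaviour of this pattern can be checked against the inputs from the question. The following is a minimal Python sketch rather than a definitive implementation: the pattern is the one given above, while QUOTED and find_quoted are names invented for this example.

```python
import re

# Group 1 captures the opening quote; \1 requires the same character to close.
QUOTED = re.compile(r"""(["'])[^"']+\1""")


def find_quoted(text):
    """Return the first quoted span, including its quotes, or None."""
    m = QUOTED.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(find_quoted('Here is "some quoted" text.'))
    print(find_quoted("Here is 'some quoted' text."))
    print(find_quoted("Here is 'some quoted\" text."))
```

Note that because the negated class excludes both quote characters, a string such as "it's here" would not match. That is a property of this particular pattern, not of back-references in general.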
In PHP you'd use this as:
preg_match('#(["\'])[^"\']+\1#',$str)
preg_match('/(["\'])([^"\']+)\1/', 'Here is \'quoted text" some quoted text.');
Explanation: in (["'])([^"']+)\1, I placed the first quote in parentheses. Because this is the first group, its back reference number is 1. Then, where the closing quote would be, I placed \1, which means whichever character was matched in group 1.
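With the second pair of parentheses in place, the text between the quotes is available as group 2. Here is a hedged Python sketch of the same idea; QUOTED_TEXT and extract_quoted are hypothetical names used only for this example.

```python
import re

# Group 1: the opening quote. Group 2: the text between the quotes.
# \1 forces the closing quote to be the same character as group 1.
QUOTED_TEXT = re.compile(r"""(["'])([^"']+)\1""")


def extract_quoted(text):
    """Return (quote_char, inner_text) for the first match, or None."""
    m = QUOTED_TEXT.search(text)
    return (m.group(1), m.group(2)) if m else None


if __name__ == "__main__":
    print(extract_quoted('Here is "some quoted" text.'))
    print(extract_quoted("Here is 'quoted text\" some quoted text."))
```

The second call uses the mixed-quote string from the PHP example above and finds no match, since neither the single nor the double quote is ever closed by the same character.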