I am using a regex to extract all data from a website's JavaScript data source that is enclosed in the following character pattern:
[[]]);
The code I am using is this:
regex = r'\[\[.*?\]]);'
match2 = re.findall(regex, response.body, re.S)
print match2
This raises the following error:
raise error, v # invalid expression
sre_constants.error: unbalanced parenthesis
I think I am fairly safe in assuming that this is being caused by the closing bracket within my regex. How can I define the regex that I want without getting this error?
Thanks
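You need to escape the closing parenthesis as well; an unescaped ) is read as the end of a group, and since no group was opened the pattern is rejected as unbalanced. Closing square brackets outside a character class do not have to be escaped: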
regex = r'\[\[.*?]]\);'
                   ^
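To see the difference concretely, here is a minimal sketch that compiles both patterns and runs the corrected one. The `compiles` helper and the `sample` string are made up for illustration and are not taken from the question's actual page data.

```python
import re

BROKEN = r'\[\[.*?\]]);'   # closing parenthesis left unescaped
FIXED = r'\[\[.*?]]\);'    # closing parenthesis escaped


def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Hypothetical input resembling a JavaScript data source.
sample = 'var data = [[1, 2], [3, 4]]);'

broken_ok = compiles(BROKEN)   # the unbalanced parenthesis is rejected
fixed_ok = compiles(FIXED)
found = re.findall(FIXED, sample, re.S)
print(broken_ok, fixed_ok, found)
```

Without a capturing group, re.findall returns the whole matched text, delimiters included.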
If you are trying to obtain the content between the square brackets, use a capturing group here.
>>> import re
>>> s = 'foo [[bar]]); baz [[quz]]); not [[foobar]]'
>>> matches = re.findall(r'\[\[(.*?)]]\);', s, re.S)
>>> matches
['bar', 'quz']
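If escaping the delimiters by hand feels error-prone, one alternative (not part of the answer above, just a sketch) is to let re.escape build the literal parts of the pattern. The `between` helper name is hypothetical.

```python
import re


def between(text, opener, closer):
    """Return the text found between literal opener and closer delimiters."""
    # re.escape backslash-escapes any regex metacharacters in the delimiters,
    # so they are matched literally.
    pattern = re.escape(opener) + r'(.*?)' + re.escape(closer)
    return re.findall(pattern, text, re.S)


s = 'foo [[bar]]); baz [[quz]]); not [[foobar]]'
matches = between(s, '[[', ']]);')
print(matches)
```

This should give the same result as the hand-written pattern, since only the delimiters are escaped and the lazy capturing group is unchanged.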