I am quite new to Python and I researched as much as I could before I decided I should ask a question here. So here is the problem:
I am not sure what I am doing wrong with my regex. I wanted to try out re.findall(), so I wrote a small script that finds phone numbers on web pages. Here is the code I have right now:
import re, urllib
inurl = raw_input("Input a URL: ")
web = urllib.urlopen(inurl)
web.readlines()
numbers = re.findall("/\d{3}.\d{3}.\d{4}/g", web)
for itm in numbers:
    print itm
I am not sure what is happening. I keep getting the error "expected string or buffer" on this line:
numbers = re.findall(".....", web)
Thanks in advance.
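/\d{3}.\d{3}.\d{4}/g - The /.../ delimiters mark a regex literal in other languages such as JavaScript or Ruby, and g is a flag (global match in JavaScript). Neither applies to Python, where the pattern is an ordinary string and re.findall() already returns every match. Remove them and use just \d{3}.\d{3}.\d{4}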
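To make the point about the delimiters concrete, here is a minimal sketch (written for Python 3, while the question uses Python 2) that runs both patterns against a made-up sample string. The sample text and variable names are assumptions for illustration only.

```python
import re

# Hypothetical sample text; not taken from any real page.
sample = "Call 555-123-4567 or 555.987.6543 today."

# JavaScript-style literal pasted into Python: the slashes and the "g"
# are treated as literal characters that must appear in the text.
js_style = re.findall(r"/\d{3}.\d{3}.\d{4}/g", sample)

# Plain pattern: findall already returns every non-overlapping match.
plain = re.findall(r"\d{3}.\d{3}.\d{4}", sample)

print(js_style)  # []
print(plain)     # ['555-123-4567', '555.987.6543']
```

The first call finds nothing because the sample contains no literal "/" characters, which is the same reason the original pattern would miss phone numbers even once the type error is fixed.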
Also, re.findall() needs the page contents as a string, not web itself, which is the file-like object returned by urllib.urlopen(). That is why you are seeing "expected string or buffer". You should also remove the line that just calls web.readlines(): its result is discarded, and it consumes the response, so a later web.read() would return nothing.
So you probably want something like this:
numbers = re.findall(r"\d{3}.\d{3}.\d{4}", web.read())
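Putting the pieces together, the following is a rough sketch of the whole script, written for Python 3 rather than the Python 2 used in the question. The helper names find_phone_numbers and fetch_text, the UTF-8 decoding, and the [-. ] separator class are assumptions of this sketch, not part of the original answer. The separator class is used because an unescaped "." matches any character, which may be looser than intended.

```python
import re
from urllib.request import urlopen

# Assumption: separators are a dash, a dot, or a space.
PHONE_RE = re.compile(r"\d{3}[-. ]\d{3}[-. ]\d{4}")


def find_phone_numbers(text):
    """Return every substring of text shaped like 123-456-7890."""
    return PHONE_RE.findall(text)


def fetch_text(url):
    """Download a page and decode it to a string (UTF-8 is assumed)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Demonstration on a hypothetical snippet, so no network access is needed.
sample_html = "<p>Office: 555-123-4567, fax: 555.987.6543</p>"
print(find_phone_numbers(sample_html))  # ['555-123-4567', '555.987.6543']
```

For a real page, the call would be find_phone_numbers(fetch_text(url)) with a URL of your choice. Keeping the download separate from the matching makes the regex easy to test on plain strings.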