RegEx in Python, not sure what I am doing wrong

I am quite new to Python, and I researched as much as I could before deciding to ask here. So here is the problem:

I am not sure what I am doing wrong with my RegEx. I wanted to try out re.findall(), so I thought I would write a small script that finds phone numbers on webpages. Here is the code I have right now:

    import re, urllib
    inurl = raw_input("Input a URL: ")
    web = urllib.urlopen(inurl)
    web.readlines()

    numbers = re.findall("/\d{3}.\d{3}.\d{4}/g", web)
    for itm in numbers:
        print itm

I am not sure what is happening. I keep getting the error "expected string or buffer" on this line:

    numbers = re.findall(".....", web)

Thanks in advance.

asked Feb 01 '26 20:02 by inoobdotcom


1 Answer

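/\d{3}.\d{3}.\d{4}/g - The /.../ delimiters mark a regex literal in other languages such as Ruby, and g is a flag in that syntax. Neither applies in Python, where the pattern is an ordinary string and re.findall() already returns every match. Remove them and use just \d{3}.\d{3}.\d{4}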
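
To illustrate the point, here is a small sketch. The sample string is made up for illustration and does not come from any real page. In Python the slashes and the trailing g are matched as literal characters, so the original pattern finds nothing in ordinary text:

```python
import re

# Hypothetical sample text, not taken from any real page.
sample = "Call 555-123-4567 or 555.987.6543 today."

# With the /.../g wrapper, Python looks for a literal "/" before
# the digits and a literal "/g" after them, so nothing matches.
with_delimiters = re.findall(r"/\d{3}.\d{3}.\d{4}/g", sample)

# Without the wrapper the pattern matches. findall already returns
# every non-overlapping match, so no "g" flag is needed.
without_delimiters = re.findall(r"\d{3}.\d{3}.\d{4}", sample)
```

Here with_delimiters is an empty list, while without_delimiters holds both numbers.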

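Also, re.findall() needs a string as its second argument, but web is the file-like object returned by urllib.urlopen(), which is why you are seeing "expected string or buffer". Pass the response body instead. You should also remove the line that just calls web.readlines(): its result is discarded, and it consumes the response, so a later web.read() would return an empty string.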

So what you may want to do will be something like this:

    numbers = re.findall("\d{3}.\d{3}.\d{4}", web.read())
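
Putting both fixes together, a minimal sketch might look like the following. It uses io.StringIO as a stand-in for the object returned by urllib.urlopen() (an assumption, so the example runs without a network connection), and find_numbers and the page content are hypothetical names and data chosen for illustration:

```python
import io
import re

PHONE_PATTERN = r"\d{3}.\d{3}.\d{4}"

def find_numbers(response):
    """Return all phone-number-like strings in a file-like response."""
    # read() returns the whole body as one string, which is what
    # re.findall expects as its second argument.
    return re.findall(PHONE_PATTERN, response.read())

# Stand-in for a fetched page (hypothetical content).
page = io.StringIO(u"<p>Sales: 555-123-4567</p>\n<p>Support: 555-987-6543</p>")
numbers = find_numbers(page)

# Passing the file-like object itself raises TypeError, which is
# the "expected string or buffer" error from the question.
try:
    re.findall(PHONE_PATTERN, io.StringIO(u"555-123-4567"))
    raised = False
except TypeError:
    raised = True

# A stream can only be consumed once: after readlines(), read()
# has nothing left to return.
consumed = io.StringIO(u"555-123-4567")
consumed.readlines()
leftover = consumed.read()
```

In the original script, the same idea would mean calling web.read() once and handing the resulting string to re.findall(). Note that on Python 3 the body returned by urllib.request.urlopen(...).read() is bytes, so it would need decoding (or a bytes pattern) before matching.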

answered Feb 04 '26 08:02 by manojlds