Get around a 404 with mechanize

Question

I'm writing a Python script that reads a file of URLs, but I know not all of them will work. I want the script to move on to the next line of the file instead of raising the error I have posted below. I know I need some kind of if statement, but I can't quite figure it out.

from mechanize import Browser
from BeautifulSoup import BeautifulSoup
import csv

me = open('C:\Python27\myfile.csv')
reader = csv.reader(me)
mech = Browser()

for url in me:
    response = mech.open(url)
    html = response.read()
    soup = BeautifulSoup(html)
    table = soup.find("table", border=3)

for row in table.findAll('tr')[2:]:
    col = row.findAll('td')
    BusinessName = col[0].string
    Phone = col[1].string
    Address = col[2].string
    City = col[3].string
    State = col[4].string
    Zip = col[5].string
    Restaurantinfo = (BusinessName, Phone, Address, City, State)
    print "|".join(Restaurantinfo)

When I run that block of code it raises this error:

httperror_seek_wrapper: HTTP Error 404: Not Found

Basically what I am asking for is how to make Python ignore that and try the next URL.

NicoFromFrance · Accepted Answer

If your file contains only URLs, it may be simpler to put one URL per line and use code like this:

from mechanize import Browser
from BeautifulSoup import BeautifulSoup


me = open('C:\Python27\myfile.csv')
mech = Browser()

for url in me.readlines():
    ...

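If you want to keep your code, iterate over the csv reader rather than the file object. Each row comes back as a list, so (assuming the URL is in the first column) take its first field:

for row in reader:
    url = row[0]
    ...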
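
To actually skip a dead URL, the usual approach is a try/except around the open call rather than an if statement. Here is a minimal sketch, not tested against your site. It assumes mechanize's httperror_seek_wrapper is a subclass of the standard library's HTTPError (urllib2.HTTPError on Python 2), which I believe is the case. The names read_working_pages and open_url are made up for this example.

```python
try:
    from urllib2 import HTTPError        # Python 2, as in the question
except ImportError:
    from urllib.error import HTTPError   # Python 3


def read_working_pages(lines, open_url):
    """Return (url, html) pairs for every URL that opens; skip HTTP errors.

    `open_url` is a hypothetical parameter: any callable that takes a URL
    and returns a response object with a .read() method.
    """
    pages = []
    for line in lines:
        url = line.strip()          # drop the trailing newline from the file
        if not url:
            continue                # ignore blank lines
        try:
            response = open_url(url)
        except HTTPError as error:
            # A 404 (or any other HTTP error status) lands here.
            print("Skipping %s: HTTP %s" % (url, error.code))
            continue                # move on to the next URL
        pages.append((url, response.read()))
    return pages
```

With mechanize you would pass the browser's open method, for example read_working_pages(open(r'C:\Python27\myfile.csv'), Browser().open), and then feed each html string to BeautifulSoup as before. Catching only HTTPError keeps other problems, such as a typo in your own code, visible instead of silently swallowed.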

Tags: python, csv, beautifulsoup, mechanize

Trook2007
