I'm creating a Python script that reads a file of URLs, but I know not all of them will work. I'm trying to figure out how to get around this and make the script move on to the next line of the file instead of raising the error I have posted below. I know I need some kind of try/except or if statement, but I can't quite figure it out.
from mechanize import Browser
from BeautifulSoup import BeautifulSoup
import csv

me = open(r'C:\Python27\myfile.csv')
reader = csv.reader(me)
mech = Browser()

for url in me:
    response = mech.open(url)
    html = response.read()
    soup = BeautifulSoup(html)
    table = soup.find("table", border=3)
    for row in table.findAll('tr')[2:]:
        col = row.findAll('td')
        BusinessName = col[0].string
        Phone = col[1].string
        Address = col[2].string
        City = col[3].string
        State = col[4].string
        Zip = col[5].string
        Restaurantinfo = (BusinessName, Phone, Address, City, State)
        print "|".join(Restaurantinfo)
When I run that block of code it raises this error:
httperror_seek_wrapper: HTTP Error 404: Not Found
Basically what I am asking for is how to make Python ignore that and try the next URL.
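The usual way to skip a failing URL is to wrap the request in try/except and `continue` on error. A minimal sketch of the pattern, with a stand-in `fetch` function and made-up URLs in place of `mech.open` (in the real script you would catch the HTTP error that mechanize raises):

```python
def fetch(url):
    # Stand-in for mech.open(url): pretend one URL returns a 404.
    if "missing" in url:
        raise IOError("HTTP Error 404: Not Found")
    return "<html>ok</html>"

urls = ["http://example.com/a", "http://example.com/missing", "http://example.com/b"]
fetched = []
for url in urls:
    try:
        html = fetch(url)
    except IOError:
        continue  # ignore this URL and try the next one
    fetched.append(url)

print(fetched)
```

The loop keeps going past the bad URL, so only the reachable ones are processed.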
If your file contains only URLs, it may be simpler to write one URL per line and use code like this:
from mechanize import Browser
from BeautifulSoup import BeautifulSoup

me = open(r'C:\Python27\myfile.csv')
mech = Browser()

for url in me.readlines():
    ...
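One caveat with `readlines()` (an assumption about the file, not something the answer states): each line keeps its trailing newline, which would end up inside the URL passed to `mech.open`. Stripping it first avoids that; a small sketch with simulated file lines:

```python
# Simulated file contents: readlines() keeps the '\n' on each line.
lines = ["http://example.com/a\n", "http://example.com/b\n", "\n"]

urls = []
for line in lines:
    url = line.strip()  # drop the newline before handing the URL to the browser
    if url:             # skip blank lines
        urls.append(url)

print(urls)
```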
If you want to keep your code, you have to iterate over the reader instead of the file object:

for url in reader:
    ...
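Note that `csv.reader` yields a list of fields for each row, not a string, so (assuming the URL sits in the first column) you would index into the row. `csv.reader` accepts any iterable of lines, which the sketch below uses to simulate the file:

```python
import csv

# Simulated CSV rows; in the real script this would be csv.reader(me).
data = ["http://example.com/a,Joe's Diner", "http://example.com/b,Main St Cafe"]

urls = []
for row in csv.reader(data):
    urls.append(row[0])  # row is a list of fields; the URL is the first field

print(urls)
```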