
URLRetrieve Error Handling

I have the following code that grabs images using urlretrieve. It works... to a point.

import urllib
import requests
from lxml import html

def Opt3():
    global conn
    curs = conn.cursor()
    results = curs.execute("SELECT stock_code FROM COMPANY")

    for row in results:
        page = requests.get('url?prodid=' + row[0])
        tree = html.fromstring(page.text)

        pic = tree.xpath('//*[@id="bigImg0"]')

        print 'URL' + pic[0].attrib['src']
        try:
            urllib.urlretrieve('URL' + pic[0].attrib['src'], 'images\\' + row[0] + '.jpg')
        except:
            pass

I am reading a CSV to input the image names. It works except when it hits a bad/corrupt URL (where there is no image, I think). Can I simply skip any corrupt URLs and have the code continue grabbing images? Thanks

asked Dec 06 '25 by user3450524

2 Answers

urllib has very poor support for error catching; urllib2 is a much better choice. The urlretrieve equivalent in urllib2 is:

resp = urllib2.urlopen(im_url)
with open(sav_name, 'wb') as f:
    f.write(resp.read())

And the errors to catch are:

urllib2.URLError, urllib2.HTTPError, httplib.HTTPException

You can also catch socket.error in case the network is down. Simply using a bare except Exception is a very bad idea: it will catch every error in the block above, even your typos.
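Putting the above together, here is a minimal sketch of the approach: a helper that downloads one image and catches only the network-level errors listed above, returning False so the caller can skip that row and carry on. The try/except import shim is an assumption added so the sketch also runs on Python 3; on Python 2 it uses urllib2/httplib exactly as described.

```python
import socket

# Compat shim (assumption): use urllib2/httplib on Python 2,
# fall back to the renamed modules on Python 3.
try:
    import urllib2
    import httplib
    urlopen = urllib2.urlopen
    URLError, HTTPError = urllib2.URLError, urllib2.HTTPError
    HTTPException = httplib.HTTPException
except ImportError:
    from urllib.request import urlopen
    from urllib.error import URLError, HTTPError
    from http.client import HTTPException

def save_image(im_url, sav_name):
    """Download im_url to sav_name.

    Returns True on success, False on a network-level failure,
    which the calling loop can safely skip. Real bugs (typos,
    bad paths, etc.) are NOT swallowed and still raise.
    """
    try:
        resp = urlopen(im_url, timeout=10)
        with open(sav_name, 'wb') as f:
            f.write(resp.read())
        return True
    except (HTTPError, URLError, HTTPException, socket.error) as e:
        print('skipping %s: %s' % (im_url, e))
        return False
```

In the question's loop you would then write `if not save_image(...): continue` instead of the bare `except: pass`.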

answered Dec 07 '25 by Linjie

Just use a try/except and continue to the next row if it fails:

try:
    page = requests.get('url?prodid=' + row[0])
except Exception as e:
    print e
    continue  # continue to next row
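As a self-contained illustration of this skip-and-continue pattern, here is a sketch with a stand-in fetch() function (an assumption, used so it runs without a network); the real code would call requests.get() in its place.

```python
def fetch(prodid):
    # Stand-in for requests.get(): fails for one "corrupt" id.
    if prodid == 'BAD':
        raise IOError('corrupt url for %s' % prodid)
    return 'page-for-%s' % prodid

fetched = []
for prodid in ['A', 'BAD', 'C']:
    try:
        page = fetch(prodid)
    except Exception as e:
        print(e)
        continue  # skip this row, move on to the next
    fetched.append(page)

# fetched is ['page-for-A', 'page-for-C']: the bad row was skipped.
```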

answered Dec 07 '25 by Padraic Cunningham

