Any way to scrape a link that redirects?

Question

Is there any way to make Python follow a link, such as a bit.ly link, and then scrape the page it leads to? On the page I am scraping, the only link I can extract is one that redirects, and the information I need is located at the redirect's destination.

furas · Accepted Answer

There are three types of redirection:

HTTP - as information in the response headers (status code 301, 302, or another 3xx, with the target in the Location header)
HTML - as a <meta> tag in the HTML (Wikipedia: Meta refresh)
JavaScript - as code like window.location = new_url

requests follows HTTP redirections automatically and keeps the intermediate responses in r.history; the final URL is in r.url

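import requests

# requests.get() follows HTTP (3xx) redirects by default
r = requests.get('http://' + 'bit.ly/english-4-it')

print(r.history)  # intermediate redirect responses, oldest first
print(r.url)      # final URL after all HTTP redirects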

result:

[<Response [301]>, <Response [301]>]
http://helion.pl/ksiazki/english-4-it-praktyczny-kurs-jezyka-angielskiego-dla-specjalistow-it-i-nie-tylko-beata-blaszczyk,anginf.htm
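To see where each hop in r.history points, you can read its status code, URL, and Location header. This is only a minimal sketch: redirect_chain is a hypothetical helper name, not part of requests, and it assumes it is given a response object like the r above.

```python
def redirect_chain(response):
    """Return (status_code, url, location) for every hop, ending with the final response."""
    hops = []
    for hop in response.history:
        # Each intermediate response carries its redirect target in the Location header.
        hops.append((hop.status_code, hop.url, hop.headers.get("Location")))
    # The final response is not a redirect, so it has no target to report.
    hops.append((response.status_code, response.url, None))
    return hops


# Typical use (needs network access and the requests package):
#   import requests
#   r = requests.get(short_url, timeout=10)   # short_url is a placeholder
#   for status, url, location in redirect_chain(r):
#       print(status, url, "->", location)
```

The last tuple holds the page you actually want to scrape, so r.text (or r.content) can be passed straight to your parser.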

BTW: SO doesn't allow bit.ly links in a post, so I built the URL by concatenation.
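requests does not parse HTML or run JavaScript, so the other two kinds of redirection have to be detected in the page source yourself. The following is a rough sketch using only the standard library; find_meta_refresh and find_js_redirect are hypothetical helper names, and the JavaScript pattern is an assumption that only catches a plain string assigned to location. Anything computed at runtime would need a real browser engine.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshParser(HTMLParser):
    """Remembers the target of the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "5; url=http://example.com/next"
        match = re.search(r"url\s*=\s*['\"]?([^'\"]+)", attrs.get("content") or "", re.I)
        if match:
            self.target = match.group(1).strip()


def find_meta_refresh(html, base_url):
    """Return the absolute meta-refresh target, or None if the page has none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    if parser.target is None:
        return None
    return urljoin(base_url, parser.target)


# Matches simple assignments such as: window.location = "http://example.com/"
JS_LOCATION = re.compile(
    r"""(?:window\.|document\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']"""
)


def find_js_redirect(html, base_url):
    """Return the absolute URL of a literal location assignment, or None."""
    match = JS_LOCATION.search(html)
    return urljoin(base_url, match.group(1)) if match else None
```

With a response r from requests, you would call find_meta_refresh(r.text, r.url) and, if it returns a URL, fetch that URL with another requests.get().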

Tags: python, parsing, beautifulsoup, web-scraping, lxml

ColeWorld
