
How to get the redirected URL in Web Scraping?

All I want is the redirected URL after requesting the original URL. The original URL is https://metric.picodi.net/us/r/19761; when I open it in a browser, it redirects me to a URL like this:

https://www.overstock.com/?AID=11557584&PID=9096967&SID=5e479aea42dd4d2c85183aa2&cjevent=2e4090483d7d3c3db27e63d14903c327c7718b978cf0dfa24&entrytrigger=noshow&exittrigger=noshow&fp=F&utm_source=cj&utm_medium=affiliates

I have tried to implement it like this, but it's giving me the same URL:

>>> import requests
>>> r = requests.get('https://metric.picodi.net/us/r/19761', allow_redirects=True)
>>> print(r.url)
https://metric.picodi.net/us/r/19761
>>> r.history
[]

I have also tried the following, with the same result:

>>> r = requests.head('https://metric.picodi.net/us/r/19761', allow_redirects=True)
>>> print(r.url)
https://metric.picodi.net/us/r/19761
>>> r.history
[]

asked Sep 15 '25 19:09 by The Sorcerer


1 Answer

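That's because the redirect is performed by JavaScript after the page loads, not by an HTTP 3xx response. requests does not execute JavaScript, so r.url stays the same and r.history is empty.

You can therefore get the final URL by loading the page in a real browser with Selenium.

Something like the following: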

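from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.add_argument('--headless')  # run Firefox without opening a window
driver = webdriver.Firefox(options=options)
link = 'https://metric.picodi.net/us/r/19761'

try:
    driver.get(link)
    # The redirect runs in JavaScript, so wait (up to 10 s) until the URL
    # differs from the original; a chain of several hops may need a longer wait.
    WebDriverWait(driver, 10).until(EC.url_changes(link))
    print(driver.current_url)
finally:
    driver.quit()  # always close the browser, even if the wait times out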

Output:

https://www.overstock.com/?AID=11557584&PID=9096967&SID=5e63c10642dd4d26f7549875&cjevent=121071440d708c3db27e63d55903c327c7718b9633548769c&entrytrigger=noshow&exittrigger=noshow&fp=F&utm_source=cj&utm_medium=affiliates

Note that you could also use requests_html, which can render the JavaScript for you.
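To check the diagnosis above (a redirect done in the page rather than by an HTTP 3xx response), you could look at the body that requests already returned. The following is only a minimal sketch: the function name find_client_side_redirect and the three regular expressions are my own assumptions about how such a page might redirect, not something taken from the actual picodi page, and a redirect built dynamically by script will not be caught.

```python
import re
from typing import Optional

# Hypothetical patterns for common client-side redirects.
# The real page may trigger its redirect in a different way.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)
JS_ASSIGN = re.compile(
    r'location(?:\.href)?\s*=\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)
JS_CALL = re.compile(
    r'location\.(?:replace|assign)\(\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_client_side_redirect(html: str) -> Optional[str]:
    """Return the first literal redirect target found in the HTML, or None."""
    for pattern in (META_REFRESH, JS_ASSIGN, JS_CALL):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None
```

With the response from the question, this would be called as find_client_side_redirect(r.text). If it returns None while the browser still redirects, the target is probably computed by script at run time, which is the case where a real browser such as the Selenium setup above is needed.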

answered Sep 17 '25 10:09 by αԋɱҽԃ αмєяιcαη