
Can't bypass Cloudflare with Python cloudscraper

I ran into a Cloudflare issue when I tried to scrape a website.

I have this code:

import cloudscraper

url = "https://author.today"
scraper = cloudscraper.create_scraper()
print(scraper.post(url).status_code)

Running it raises:

cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 challenge, This feature is not available in the opensource (free) version.

I searched for a workaround but couldn't find any solution. If you visit the website in a browser, you see:

Checking your browser before accessing author.today.

Is there any way to bypass Cloudflare in my case?

Nickolas asked Nov 24 '25 12:11



2 Answers

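Install httpx with HTTP/2 support (the quotes keep shells such as zsh from interpreting the brackets):

pip3 install 'httpx[http2]'

Define an HTTP/2 client:

import httpx

client = httpx.Client(http2=True)

Make the request:

response = client.get("https://author.today")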
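To tell whether a response like the one above actually got past the check, a rough heuristic can help. The sketch below is only an illustration and not part of httpx or any Cloudflare API: the function name `looks_like_challenge`, the marker strings and the status codes are assumptions, based on the "Checking your browser" text quoted in the question and on the `cf-mitigated: challenge` header that Cloudflare documents for challenge pages.

```python
# Hypothetical helper: a rough heuristic, not an official Cloudflare or httpx API.
CHALLENGE_MARKERS = (
    "checking your browser",  # text quoted in the question
    "just a moment",          # assumed wording of newer challenge pages
)


def looks_like_challenge(status_code, headers, body):
    """Return True if the response resembles a Cloudflare challenge page."""
    # Normalise header names and values so the lookups are case-insensitive.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}

    # Cloudflare marks challenge responses with this header.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Fallback (assumption): blocking status, Cloudflare server header and challenge text.
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    has_marker = any(marker in body.lower() for marker in CHALLENGE_MARKERS)
    return status_code in (403, 503) and served_by_cloudflare and has_marker
```

With the `response` object from the request above, it could be called as `looks_like_challenge(response.status_code, response.headers, response.text)`.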

Cheers!

Zorome answered Nov 26 '25 09:11



I can suggest the following workflow to "try" to avoid Cloudflare WAF/bot mitigation:

  • Don't rotate user agents, proxies or unusual tunnels while browsing.
  • Don't use fixed IP addresses; consumer-type connections such as xDSL, home links and 4G/LTE are better.
  • Try to appear as a mobile device rather than a desktop or tablet.
  • Try to reproduce realistic pointer movements, i.e. record your own mouse moves and replay them 1:1 while scraping (yes, you need JS enabled and a headless browser that can pass as a "common" one).
  • Don't cycle through different Cloudflare-protected sites, otherwise the scraping IP will be greylisted within a minute (i.e. build your own blacklist of targets and never touch them, or you will reliably end up on the CF blacklist).
  • Try to reproduce real-life navigation in every respect, including errors, pauses and more.
  • After every scrape, check the IP you used against popular blacklists, otherwise errors will soon start to appear (CrowdSec is a good starting point).
  • The typical scraper poses as Googlebot, and a single regex WAF rule on Cloudflare blocks 99.99% of those attempts. Avoid pretending to be Google and try to be LESS evil instead (e.g. ask the webmasters for an API or a data export, if one exists).

Source: I have used Cloudflare (Enterprise) with hundreds of domains and thousands of records since the beginning of the company.

That way you will get closer to your goal (and you will help them increase overall security).
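Two of the points in the list above, keeping your own list of targets to leave alone and pausing the way a real visitor would, can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `SKIP_HOSTS`, `should_skip` and `human_pause`, the example host, and all timing values are made up for the example and are not tuned against Cloudflare.

```python
import random
from urllib.parse import urlsplit

# Hypothetical skip list: hosts you have decided never to scrape.
SKIP_HOSTS = {"author.today"}


def should_skip(url, skip_hosts=SKIP_HOSTS):
    """Return True if the URL's host, or a parent domain of it, is on the skip list."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == blocked or host.endswith("." + blocked) for blocked in skip_hosts)


def human_pause(rng, base=4.0, jitter=3.0, long_pause_chance=0.1):
    """Pick a waiting time in seconds that varies like a person reading a page."""
    pause = base + rng.uniform(0.0, jitter)
    # Occasionally wait much longer, as if the reader got distracted.
    if rng.random() < long_pause_chance:
        pause += rng.uniform(20.0, 60.0)
    return pause
```

Between page loads one might call `time.sleep(human_pause(random.Random()))`, and check `should_skip(url)` before requesting anything at all.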

fab23 answered Nov 26 '25 09:11



