Can't bypass cloudflare with python cloudscraper

Question

I ran into a Cloudflare issue when I tried to scrape a website.

Here is my code:

import cloudscraper

url = "https://author.today"
scraper = cloudscraper.create_scraper()
print(scraper.post(url).status_code)

Running it raises this error:

cloudscraper.exceptions.CloudflareChallengeError: Detected a Cloudflare version 2 challenge, This feature is not available in the opensource (free) version.

I searched for a workaround but couldn't find one. If you visit the website in a browser, you see:

Checking your browser before accessing author.today.

Is there any way to bypass Cloudflare in my case?

Zorome · Accepted Answer

Install httpx

pip3 install httpx[http2]

Define an HTTP/2 client

import httpx
client = httpx.Client(http2=True)

Make the request

response = client.get("https://author.today")
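The three steps above can be combined into one small sketch. The helper names `looks_like_challenge` and `fetch` are hypothetical, and the check itself is an assumption: it treats a 403 or 503 response containing the "Checking your browser" banner quoted in the question as a sign that the challenge page was still served. Whether HTTP/2 alone gets past the check depends on how the site is configured, so this is only a way to see what came back.

```python
def looks_like_challenge(status_code: int, body: str) -> bool:
    """Heuristic: True if the response resembles the interstitial from the question."""
    # Assumption: the challenge page is served with 403 or 503 and shows this banner.
    return status_code in (403, 503) and "Checking your browser" in body


def fetch(url: str) -> str:
    """Fetch url with an HTTP/2-enabled httpx client and return the body text."""
    import httpx  # third-party; installed with: pip3 install "httpx[http2]"

    with httpx.Client(http2=True, follow_redirects=True) as client:
        response = client.get(url)
        if looks_like_challenge(response.status_code, response.text):
            # http_version reports the protocol that was actually negotiated.
            raise RuntimeError(
                f"challenge page still served over {response.http_version}"
            )
        response.raise_for_status()
        return response.text
```

Calling `fetch("https://author.today")` returns the page body on success and raises if the interstitial page is detected.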

Cheers!

fab23 · Answer

I can suggest the following workflow to "try" to avoid Cloudflare WAF/bot mitigation:

Don't rotate user agents, proxies, or unusual tunnels while browsing.
Don't use fixed IP addresses; consumer connections such as xDSL, home links, and 4G/LTE work better.
Try to appear as a mobile device rather than a desktop or tablet.
Reproduce pointer movements faithfully: record your own mouse movements and replay them 1:1 while scraping (yes, you need JavaScript enabled and a headless browser that can pass as a "common" one).
Don't cycle through different Cloudflare-protected sites, otherwise the offending IP will be greylisted within a minute (in other words, build your own blacklist of targets and never touch those sites, or you will certainly end up on the Cloudflare blacklist).
Try to reproduce real-life navigation in every respect, including errors, waiting times, and more.
After every scrape, check the IP you used against popular blacklists, otherwise serious errors will soon appear (CrowdSec is a good starting point).
The typical scraper pretends to be Googlebot, and a single regex WAF rule on Cloudflare will block 99.99% of those attempts. Avoid posing as Google and try to be LESS evil instead (e.g. ask the webmasters for an API or a data export, if one exists).

Source: I have used Cloudflare (Enterprise) with hundreds of domains and thousands of records since the company began.

That way you will get closer to your goal (and you will help them improve overall security).
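Two of the points above, keeping a personal blacklist of targets and pacing requests like a real visitor, can be sketched with the standard library alone. Everything named here is hypothetical: the `BLOCKED_HOSTS` entries, the 2 to 8 second range, and the helper names are illustrative assumptions, not values recommended by Cloudflare or by this answer.

```python
import random
from urllib.parse import urlsplit

# Hypothetical personal blacklist: hosts you have decided never to scrape.
BLOCKED_HOSTS = frozenset({"example.com"})


def is_allowed(url: str, blocked_hosts=BLOCKED_HOSTS) -> bool:
    """Return True if the URL's host is not on the personal blacklist."""
    host = urlsplit(url).hostname or ""
    return host not in blocked_hosts


def human_delay(rng: random.Random, low: float = 2.0, high: float = 8.0) -> float:
    """Pick an irregular pause in seconds (the range is an assumption)."""
    return rng.uniform(low, high)


def plan_visits(urls, rng: random.Random):
    """Return (url, pause_seconds) pairs, skipping blacklisted hosts."""
    return [(url, human_delay(rng)) for url in urls if is_allowed(url)]
```

A caller would sleep for each planned pause (for example with `time.sleep`) before requesting the matching URL, so the gaps between requests are irregular rather than fixed.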

Tags:

python

web-scraping

cloudflare

Nickolas
