I've encountered a very strange problem while using rvest. Here is one example: https://politics.raisethemoney.com/cchristiansen. This page opens normally in any web browser, and can be opened with base::url.
A connection with
description "https://politics.raisethemoney.com/cchristiansen"
class "url-libcurl"
mode "r"
text "text"
opened "closed"
can read "yes"
can write "no"
When xml2::read_html is used, it gives a 404 error.
Error in open.connection(x, "rb") : HTTP error 404.
Tested on both RStudio Cloud and a local machine (Windows 10). I'm baffled. Any ideas why this may be happening?
The server is looking for a specific header in the request, i.e.:
'Accept' : ''
This header needs to be provided for the server to answer the request with a 200. httr, for example, sends it by default, but I assume the methods you are trying do not.
Here are some quick tests I ran with Python requests (somewhat similar to rvest):
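The test output itself is not reproduced here, so as a rough stand-in, this is a minimal sketch of that kind of check. It uses Python's standard-library urllib rather than requests, and builds two requests that differ only in the Accept header. The `*/*` value and the helper names `build_request` and `fetch_status` are assumptions for illustration, not the exact header value the server was confirmed to need.

```python
import urllib.error
import urllib.request

URL = "https://politics.raisethemoney.com/cchristiansen"


def build_request(url, accept=None):
    """Build a GET request, adding an Accept header only when one is given."""
    headers = {}
    if accept is not None:
        headers["Accept"] = accept
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_status(req, timeout=10):
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is still the server's answer
        return err.code


# Two requests that differ only in the Accept header.
without_accept = build_request(URL)
with_accept = build_request(URL, accept="*/*")  # "*/*" is an assumed value

# To compare the server's responses (needs network access):
# print(fetch_status(without_accept), fetch_status(with_accept))
```

If the header really is what the server checks, the two status codes should differ. Other defaults, such as the User-Agent that urllib sends, could also affect the result, so treat this as a starting point rather than a definitive test.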
