
Google Searching in R

Tags:

r

This question is less about how to run a Google search from R (discussed many times before) than about why it does not always work.

I found this code in another question posted here, and I recall it working perfectly: it returned all the links in the search results.

But now it does not work. For some reason the node is no longer there when I pull the data into R. When I inspect the HTML in Chrome, the h3 node shows up in the inspector, but it is missing from the HTML that gets downloaded.

    library(rvest)
    ht <- read_html('https://www.google.co.in/search?q=guitar+repair+workshop')
    links <- ht %>% html_nodes(xpath='//h3/a') %>% html_attr('href')
    gsub('/url\\?q=','',sapply(strsplit(links[as.vector(grep('url',links))],split='&'),'[',1))

I get the following output:

    character(0)

The links Google displays depend on your location and preferences, so maybe that is what is causing the issue?

jessica asked Dec 08 '25 09:12


1 Answer

It appears that the format changed very recently, maybe today, and that the result links are no longer wrapped in //h3 nodes. Selecting every //a node instead produces the intended links, plus one extraneous result at the end:

    library(rvest)
    ht <- read_html('https://www.google.co.in/search?q=guitar+repair+workshop')
    links <- ht %>% html_nodes(xpath='//a') %>% html_attr('href')
    gsub('/url\\?q=','',sapply(strsplit(links[as.vector(grep('url',links))],split='&'),'[',1))
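
Because `grep('url', links)` matches any href that merely contains "url", stray links can slip through. A tighter variant, offered only as a sketch, anchors the match on the `/url?q=` prefix and keeps only absolute http(s) targets. The helper name `extract_result_links` and the `sample_html` string are hypothetical and used purely for illustration; whether this drops the extraneous result depends on the markup Google actually returns, which can change at any time.

```r
library(rvest)

# Hypothetical helper (not part of rvest): pull result URLs out of an
# already-parsed search page.
extract_result_links <- function(page) {
  hrefs <- page %>% html_nodes(xpath = "//a") %>% html_attr("href")
  # Keep only redirect-style links that START with /url?q=
  hrefs <- hrefs[!is.na(hrefs) & grepl("^/url\\?q=", hrefs)]
  # Strip the prefix, then drop everything from the first "&" onwards
  targets <- sub("&.*$", "", sub("^/url\\?q=", "", hrefs))
  # Decode percent-escapes such as %3F
  targets <- vapply(targets, utils::URLdecode, character(1), USE.NAMES = FALSE)
  # Keep absolute http(s) targets only and remove duplicates
  unique(targets[grepl("^https?://", targets)])
}

# Offline check on made-up markup (an assumption, not real Google output)
sample_html <- '<html><body>
  <a href="/url?q=https://example.com/repair&amp;sa=U&amp;ved=abc">Result</a>
  <a href="/search?q=guitar&amp;start=10">Next</a>
  <a href="/url?q=https://example.org/shop%3Fid%3D1&amp;sa=U">Result 2</a>
</body></html>'
extract_result_links(read_html(sample_html))
```

With a live page, the same helper would be called as `extract_result_links(ht)`, where `ht` is the document read above.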

Ross Gore answered Dec 09 '25 22:12


