Crawling Google Search with PHP

I am trying to get my head around how to fetch Google search results with PHP or JavaScript. I know it has been possible before, but now I can't find a way.

I am trying to duplicate (somewhat) the functionality of
http://www.getupdated.se/sokmotoroptimering/seo-verktyg/kolla-ranking/

But really, the core issue I want to solve is just getting the search results via PHP or JavaScript; the rest I can figure out.

Fetching the results using file_get_contents() or cURL doesn't seem to work.

Example:

$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://www.google.se/#hl=sv&q=dogs');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$result = curl_exec($ch);
curl_close($ch);
echo '<pre>';
var_dump($result);
echo '</pre>';

Results:

string(219) "302 Moved The document has moved here."

So, with some Googling I found http://code.google.com/apis/customsearch/v1/overview.html, but that seems to work only for generating a custom search over one or more websites. It seems to require a "Custom Search Engine" cx parameter to be passed.
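
For reference, here is a minimal sketch of what a request to that API appears to look like: a plain GET to the Custom Search JSON endpoint with key, cx and q parameters. The values YOUR_API_KEY and YOUR_CX_ID are placeholders, and the helper name buildCustomSearchUrl() is made up for illustration.

```php
<?php
// Placeholders (assumptions): substitute a real API key and engine ID.
$apiKey   = 'YOUR_API_KEY';
$engineId = 'YOUR_CX_ID';

/**
 * Builds a request URL for the Custom Search JSON API.
 * Hypothetical helper, not part of any library.
 */
function buildCustomSearchUrl(string $apiKey, string $engineId, string $query): string
{
    // http_build_query() URL-encodes each value.
    $params = http_build_query([
        'key' => $apiKey,
        'cx'  => $engineId,
        'q'   => $query,
    ]);

    return 'https://www.googleapis.com/customsearch/v1?' . $params;
}

$url = buildCustomSearchUrl($apiKey, $engineId, 'dogs');
echo $url, PHP_EOL;
```

The response should be JSON, so the body fetched from that URL could be passed to json_decode() and the results read from its items array.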

So anyway, any idea?

jamietelin asked Dec 19 '25 01:12


2 Answers

I have done this before. Fetch the HTML by making an HTTP request to https://www.google.co.in/search?hl=en&output=search&q=india, then parse the specific tags you need from the results page with the PHP Simple HTML DOM library.

Demo: the code below collects the titles of all the results:

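<?php

// Requires the PHP Simple HTML DOM library (simple_html_dom.php).
include("simple_html_dom.php");

$html = file_get_html('https://www.google.co.in/search?hl=en&output=search&q=india');
if ($html === false) {
    die('Could not fetch the results page.');
}

// Collect the title of every result.
// These selectors depend on Google's markup and may need updating.
$title = array();
foreach ($html->find('li[class=g]') as $element) {
    foreach ($element->find('h3[class=r]') as $h3) {
        $title[] = '<h1>' . $h3->plaintext . '</h1>';
    }
}
print_r($title);

?>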
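
If you would rather not depend on an extra library, the same fetch-then-parse idea can be sketched with PHP's built-in DOMDocument and DOMXPath. This is only an illustration: the sample markup below is invented to stand in for a fetched results page, and extractTitles() is a made-up helper name.

```php
<?php
// Invented sample markup (an assumption, not real Google output).
$sampleHtml = <<<HTML
<html><body><ol>
<li class="g"><h3 class="r"><a href="http://example.com/a">First result</a></h3></li>
<li class="g"><h3 class="r"><a href="http://example.com/b">Second result</a></h3></li>
</ol></body></html>
HTML;

/**
 * Returns the text of every h3.r found inside an li.g element.
 * Hypothetical helper; the class names mirror the selectors used above.
 */
function extractTitles(string $html): array
{
    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely well formed.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath  = new DOMXPath($doc);
    $titles = [];
    foreach ($xpath->query('//li[@class="g"]//h3[@class="r"]') as $h3) {
        $titles[] = trim($h3->textContent);
    }

    return $titles;
}

print_r(extractTitles($sampleHtml));
```

Note that the XPath test @class="g" matches only when the attribute is exactly "g", so elements carrying several classes would need a contains() expression instead.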
Hardik Thaker answered Dec 21 '25 16:12


There is a PHP package on GitHub named google-url that does the job.

The API is comfortable to use. See the example:

// create a new crawler
$googleUrl = new \GoogleURL\GoogleUrl();
$googleUrl->setLang('en'); // language to search in (it could have been "fr" instead)
$googleUrl->setNumberResults(10); // how many results you want to check
// launch the search for a specific keyword
$results = $googleUrl->search("google crawler");
// finally, you can loop over the results (an example is also available on the GitHub page)

However, you will need to add a delay between queries; otherwise Google will treat you as a bot and present a CAPTCHA that blocks the script.
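
A minimal sketch of such a delay, assuming a randomised pause between consecutive queries. The helper searchWithDelay() and the stub search function are made up for illustration, and the pause lengths are arbitrary; nothing here guarantees that a CAPTCHA is avoided.

```php
<?php
/**
 * Runs $search once per keyword and sleeps between consecutive calls.
 * Hypothetical helper: $search stands in for any function that performs
 * one query and returns its results.
 */
function searchWithDelay(array $keywords, callable $search, int $minSeconds = 5, int $maxSeconds = 15): array
{
    $keywords = array_values($keywords);
    $last     = count($keywords) - 1;
    $results  = [];

    foreach ($keywords as $index => $keyword) {
        $results[$keyword] = $search($keyword);

        // No need to wait after the final query.
        if ($index < $last) {
            // Random pause so the requests are not evenly spaced.
            sleep(random_int($minSeconds, $maxSeconds));
        }
    }

    return $results;
}

// Stub search function used only for this demonstration.
$stubSearch = function (string $keyword): array {
    return ["result for $keyword"];
};

print_r(searchWithDelay(['dogs', 'cats'], $stubSearch, 1, 1));
```

With the package above, the callable could simply wrap the call to $googleUrl->search($keyword).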

Soufiane Ghzal answered Dec 21 '25 15:12