
Trying to login into Google in order to download Google Trends data

Tags: php, curl

I am trying to:

  1. Login to Google
  2. Download CSV data from Google Trends

Step (1) works, but step (2) does not. Google returns an authorization token, and I send it with the subsequent request to Trends, but Google still responds with the error "You must be signed in to export data from Google Trends":

// http://code.google.com/apis/accounts/docs/AuthForInstalledApps.html
$data = array(
  'accountType' => 'GOOGLE',
  'Email'       => '[email protected]',
  'Passwd'      => 'my.password',
  'service'     => 'trendspro',
  'source'      => 'company-application-1.0'
);

$ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, "https://www.google.com/accounts/ClientLogin");
  curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
  curl_setopt($ch, CURLOPT_HTTPAUTH, false);
  curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  $response = curl_exec($ch);

  preg_match("/Auth=([a-z0-9_\-]+)/i", $response, $matches);

  // We now have an authorization-token
  $headers = array(
    "Authorization: GoogleLogin auth=" . $matches[1],
    "GData-Version: 3.0"
  );

  curl_setopt($ch, CURLOPT_URL, "http://www.google.com/trends/viz?q=MSFT&date=2011-2&geo=all&graph=all_csv&sort=0&sa=N");
  curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
  curl_setopt($ch, CURLOPT_HEADER, false);
  curl_setopt($ch, CURLOPT_POST, false);
  $csv = curl_exec($ch);
curl_close($ch);

// Returns : "You must be signed in to export data from Google Trends"
// Expected: CSV data stream
print_r($csv);

For some reason, the auth token I am sending to Google Trends is either rejected or ignored. I can't tell which, since no additional error information is given.

Does anyone see what I am doing wrong? If you can get it to work, meaning that Google returns the CSV data, the bounty is yours and we both have a late Christmas present :-)


So I figured out the problem has nothing to do with cURL. What I did was:

  • Go to https://www.google.com/accounts/ClientLogin?accountType=GOOGLE&[email protected]&Passwd=my.password&service=trendspro&source=ding-dang-1. The response is:
SID=DQAAAMUAAADMqt...aYPaYniC_iW
LSID=DQAAAMcAAACI5...YDTBDt_xZC9
Auth=DQAAAMgAAABm8...trXgqNv-g0H
  • Copy the returned Auth token: DQAAAMgAAABm8...trXgqNv-g0H
  • Send a GET request with the Postman Chrome extension to http://www.google.com/trends/viz?q=MSFT&date=2011-2&geo=all&graph=all_csv&sort=0&sa=N using the headers:
GData-Version: 3.0
Authorization: GoogleLogin auth=DQAAAMgAAABm8...trXgqNv-g0H
  • The response is:

headers:

Date: Tue, 27 Dec 2011 00:17:20 GMT
Content-Encoding: gzip
Content-Disposition: filename=trends.csv
Content-Length: 97
X-XSS-Protection: 1; mode=block
Server: Google Trends
X-Frame-Options: SAMEORIGIN
Content-Type: text/csv; charset=UTF-8
Cache-Control: private

data:

You must be signed in to export data from Google Trends

In other words, I'm sending the headers as defined by Google at http://code.google.com/apis/accounts/docs/AuthForInstalledApps.html but I'm not getting a proper response. And there is next to no information online about this. Does anyone know what the problem is here?
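The ClientLogin response shown above is a list of KEY=value lines (SID, LSID, Auth). As a minimal sketch, assuming the body always has that shape, all three values can be pulled out at once rather than matching a single token. `parseClientLoginResponse()` is a hypothetical helper name, and the sample values are placeholders, not real tokens.

```php
<?php
// Sketch: parse a ClientLogin-style response body ("KEY=value" per line).
// parseClientLoginResponse() is a hypothetical helper, not part of any Google library.
function parseClientLoginResponse($body)
{
    $tokens = array();
    foreach (preg_split('/\r\n|\r|\n/', trim($body)) as $line) {
        // Split on the first "=" only, in case a value contains "=" itself.
        $parts = explode('=', $line, 2);
        if (count($parts) === 2 && $parts[0] !== '') {
            $tokens[$parts[0]] = $parts[1];
        }
    }
    return $tokens;
}

// Placeholder values standing in for the real (elided) tokens.
$sample = "SID=sid-placeholder\nLSID=lsid-placeholder\nAuth=auth-placeholder\n";
$tokens = parseClientLoginResponse($sample);
echo $tokens['SID'], "\n"; // prints "sid-placeholder"
```

Having every value in one array makes it easy to try SID, LSID, or Auth in the follow-up request without re-running the login.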

Pr0no asked Dec 24 '11 19:12


2 Answers

Having checked your code, the problem is that Google Trends needs the SID value, not Auth, and it has to be sent as a cookie. Here is the code I wrote to download the CSVs:

<?php

header('content-type: text/plain');

// Set account login info
$data['post'] = array(
  'accountType' => 'HOSTED_OR_GOOGLE',  // indicates a Google account
  'Email'       => '',  // full email address
  'Passwd'      => '',
  'service'     => 'trendspro', // Name of the Google service
  'source'      => 'codecri.me-example-1.0' // Application's name, e.g. companyName-applicationName-versionID
);

$response = xhttp::fetch('https://www.google.com/accounts/ClientLogin', $data);

// Test if unsuccessful
if(!$response['successful']) {
    echo 'response: '; print_r($response);
    die();
}

// Extract SID (anchored to the start of a line so it cannot match inside "LSID=")
preg_match('/^SID=(\S+)/m', $response['body'], $matches);
$sid = $matches[1];

// Erase POST variables used on the previous xhttp call
$data = array();

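// Set the SID in cookies
$data['cookies'] = array(
    'SID' => $sid
);

// Assumed final step (cut off in the post): request the CSV with the SID cookie,
// using the Trends URL from the question
$response = xhttp::fetch('http://www.google.com/trends/viz?q=MSFT&date=2011-2&geo=all&graph=all_csv&sort=0&sa=N', $data);
echo $response['body'];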

This uses my xhttp class, a cURL wrapper.
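For readers without the xhttp class, the same idea can be sketched with plain cURL: send the SID as a cookie instead of an Authorization header. This is an illustration under assumptions, not the answer's own implementation. `buildCookieHeader()` and `fetchTrendsCsv()` are hypothetical names, and the URL is the one from the question.

```php
<?php
// Sketch: send the SID as a cookie with plain cURL (no wrapper class).

// Build a Cookie header value such as "a=1; b=2" from name => value pairs.
function buildCookieHeader(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Hypothetical helper: request the CSV export with the SID cookie attached.
// Returns the response body as a string, or false if the transfer fails.
function fetchTrendsCsv($sid)
{
    $ch = curl_init('http://www.google.com/trends/viz?q=MSFT&date=2011-2&geo=all&graph=all_csv&sort=0&sa=N');
    curl_setopt($ch, CURLOPT_COOKIE, buildCookieHeader(array('SID' => $sid)));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, ''); // accept compressed responses and decode them
    $csv = curl_exec($ch);
    curl_close($ch);
    return $csv;
}
```

With `$sid` extracted as in the code above, `fetchTrendsCsv($sid)` would return the response body, which can then be checked for the "You must be signed in" message before being treated as CSV.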

Arvin answered Oct 27 '22 20:10


Use the right tool for the job: have you considered PhantomJS?

It could be even more readable.

Kamil Tomšík answered Oct 27 '22 19:10


