Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

streaming PostgreSQL tables into Google BigQuery

I would like to automatically stream data from an external PostgreSQL database into a Google Cloud Platform BigQuery database in my GCP account. So far, I have seen that one can query external databases (MySQL or PostgreSQL) with the EXTERNAL_QUERY() function, e.g.: https://cloud.google.com/bigquery/docs/cloud-sql-federated-queries

enter image description here

But for that to work, the database has to be in GCP Cloud SQL. I tried to see what options are there for streaming from the external PostgreSQL into a Cloud SQL PostgreSQL database, but I could only find information about replicating it in a one time copy, not streaming: https://cloud.google.com/sql/docs/mysql/replication/replication-from-external

The reason why I want this streaming into BigQuery is that I am using Google Data Studio to create reports from the external PostgreSQL, which works great, but GDS can only accept SQL query parameters if it comes from a Google BigQuery database. E.g. if we have a table with 1M entries, and we want a Google Data Studio parameter to be added by the user, this will turn into a:

SELECT * from table WHERE id=@parameter;

which means that the query will be faster, and won't hit the 100K records limit in Google Data Studio.

What's the best way of creating a connection between an external PostgreSQL (read-only access) and Google BigQuery so that when querying via BigQuery, one gets the same live results as querying the external PostgreSQL?

like image 562
719016 Avatar asked Sep 05 '25 03:09

719016


1 Answers

Perhaps you missed the options stated on the google cloud user guide?

https://cloud.google.com/sql/docs/mysql/replication/replication-from-external#setup-replication

Notice in this section, it says:

"When you set up your replication settings, you can also decide whether the Cloud SQL replica should stay in-sync with the source database server after the initial import is complete. A replica that should stay in-sync is online. A replica that is only updated once, is offline."

I suspect online mode is what you are looking for.

like image 132
Jacky Cheng Avatar answered Sep 07 '25 19:09

Jacky Cheng