I use a scheduled query in Big Query which appends data from the previous day to a Big Query table. The data from the previous day is not always available when my query runs, so, to make sure that I have all the data, I need to calculate the last date available in my Big Query table.
My first attempt was to write the following query :
SELECT *
FROM sourceTable
WHERE date >= (SELECT Max(date) from destinationTable)
When I run this query, only date >= max(date) is correctly exported. However, the query processes the entire sourceTable, and not only J - max(date). Therefore, the cost is higher than expected.
I also tried to declare a variable using "DECLARE" & "SET" (https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting). This solution works fine and only J - max(date) is processed. However, BQ interprets a query with "DECLARE" as a script, so the results can't be exported automatically to a BQ table using scheduled queries.
DECLARE maxDate date;
SET maxDate = (SELECT Max(date) from destinationTable);
SELECT *
FROM sourceTable
WHERE date >= maxDate
Is there another way of doing what I would like? Or a way to declare a variable using "DECLARE" & "SET" in a scheduled query with a destination table?
Thanks!
Scripting query, when being scheduled, doesn't support setting a destination table for now. You need to use DDL/DML to make change to existing table.
DECLARE maxDate date;
SET maxDate = (SELECT Max(date) from destinationTable);
CREATE OR REPLACE destinationTable AS
SELECT *
FROM sourceTable
WHERE date >= maxDate
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With