I'm looking for a generic method to compare two tables in BigQuery, even if they have columns that are STRUCT type.
It should work for any pair of tables, and ideally wouldn't involve writing a query that depends on the actual columns of the tables. All I really need to know is whether the tables are equal or not, but it would be a bonus if it could show me the differences between the rows that aren't the same.
So something like this (in pseudocode):
sizeOf( TABLE A EXCEPT TABLE B ) == 0
or
Hash(TABLE A) == HASH(TABLE B)
Would be fine.
I tried using this:
( SELECT * FROM table1
EXCEPT DISTINCT
SELECT * FROM table2)
UNION ALL
( SELECT * FROM table2
EXCEPT DISTINCT
SELECT * FROM table1)
But I got this error:
Column 1 in EXCEPT ALL has type that does not support set operation comparisons: STRUCT at [3:5]
Does anyone know of a way to get around this?
I should have mentioned this before, but I need this to work regardless of the order of the rows in the tables.
I think you are looking for something like the query below to start with:
#standardSQL
SELECT TO_JSON_STRING(a) FROM `project.dataset.tableA` a
EXCEPT DISTINCT
SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b
Or, a more complete example that shows the differences in both directions. Note that the output can be very long if the tables differ a lot.
#standardSQL
SELECT 'a' table, * FROM (
SELECT TO_JSON_STRING(a) record FROM `project.dataset.tableA` a
EXCEPT DISTINCT
SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b
)
UNION ALL
SELECT 'b', * FROM (
SELECT TO_JSON_STRING(b) FROM `project.dataset.tableB` b
EXCEPT DISTINCT
SELECT TO_JSON_STRING(a) FROM `project.dataset.tableA` a
)
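If you only need a yes/no answer, along the lines of the Hash(TABLE A) == HASH(TABLE B) idea in the question, here is a rough sketch. It assumes the same placeholder table names as above; the CTE names and column aliases are just illustrative. Each row is turned into a JSON string, fingerprinted with FARM_FINGERPRINT, and the fingerprints are combined with BIT_XOR, which does not depend on row order. The row count is compared as well, since XOR cancels out pairs of identical rows.

```sql
#standardSQL
-- Sketch only: table names are placeholders, aliases are illustrative.
WITH a AS (
  SELECT
    COUNT(*) AS row_count,
    -- XOR of per-row fingerprints is independent of row order
    BIT_XOR(FARM_FINGERPRINT(TO_JSON_STRING(t))) AS fingerprint
  FROM `project.dataset.tableA` t
),
b AS (
  SELECT
    COUNT(*) AS row_count,
    BIT_XOR(FARM_FINGERPRINT(TO_JSON_STRING(t))) AS fingerprint
  FROM `project.dataset.tableB` t
)
SELECT
  a.row_count = b.row_count
    -- BIT_XOR returns NULL for an empty table, so fall back to 0
    AND IFNULL(a.fingerprint, 0) = IFNULL(b.fingerprint, 0) AS tables_probably_equal
FROM a CROSS JOIN b
```

Treat a TRUE result as strong evidence rather than proof: fingerprints can collide, and tables containing duplicate rows can still produce matching counts and fingerprints while differing. TO_JSON_STRING also includes column names and follows column order, so both tables need the same schema for this to be meaningful. Likewise, the EXCEPT DISTINCT queries above ignore how many times a row is repeated.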