Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BigQuery Window ORDER BY is not allowed if DISTINCT is specified

I'm investigating porting some bigquery legacy sql containing windowed distinct counts like this

count(distinct brand_id) over (partition by user_id order by order_placed_at range between 7 * 24 * 60 * 60 * 1000000 PRECEDING AND 1 PRECEDING) as last_7_day_buyer_brands

to Standard sql.... but I get this error....

Window ORDER BY is not allowed if DISTINCT is specified

For reference I've tried APPROX_COUNT_DISTINCT function with no luck.

Is there a better way to get this to work other than write the subqueries and group by's?

Most of the other queries have ported to standard sql with only minor changes.

like image 800
Scott Page Avatar asked Oct 21 '25 04:10

Scott Page


1 Answers

Per documentation

OVER clause requirements:

PARTITION BY: Optional.
ORDER BY: Optional. Disallowed if DISTINCT is present.
window_frame_clause: Optional. Disallowed if DISTINCT is present.

note: above is 'highlighted' by me, not as in documentation

As you can see not only ORDER BY but even RANGE BETWEEN is not allowed when DISTINCT is used

I think, subquery is the way to go.

In case if you need direction for this, use below simple example

#standardSQL
SELECT
  user_id,
  order_placed_at,
  brand_id,
  (SELECT COUNT(DISTINCT brand) 
      FROM UNNEST(last_7_day_buyer_brands_with_dups) AS brand
  ) AS last_7_day_buyer_brands
FROM (
  SELECT 
    user_id,
    order_placed_at,
    brand_id,
    ARRAY_AGG(brand_id) OVER(
      PARTITION BY user_id ORDER BY order_placed_at 
      RANGE BETWEEN 7 * 24 * 60 * 60 * 1000000 PRECEDING AND 1 PRECEDING
    ) AS last_7_day_buyer_brands_with_dups
  FROM yourTable
)
like image 126
Mikhail Berlyant Avatar answered Oct 24 '25 14:10

Mikhail Berlyant