Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting row entries in BigQuery table grouped on a time interval using SQL

I've hit a bit of a dead end with this BigQuery/SQL question. After ~1 hour of Googling I still haven't figured it out, so I figured I'd ask here.

I have a BigQuery table (mycompany.engagement.product_orders) of customer order data. Every row in the table describes an order placed by a customer and it looks something like this:

Row Product Timestamp Type CustomerName
1 Apple 2021-08-19 11:41:08.874 UTC Gala Philippe Kahn
2 Orange 2021-08-19 11:41:12.874 UTC Navel Grace Hopper
3 Pear 2021-08-19 11:41:24.874 UTC Bosc Vladimir Nabokov
4 Apple 2021-08-19 11:41:47.874 UTC Melba Sylvia Plath
5 Pear 2021-08-19 11:41:55.874 UTC Anjou Alan Turing
6 Pear 2021-08-19 11:42:10.874 UTC Asian Sylvia Plath
7 Apple 2021-08-19 11:42:11.874 UTC Fuji Vladimir Nabokov
8 Orange 2021-08-19 11:42:37.874 UTC Blood Ada Lovelace
9 Orange 2021-08-19 11:42:49.874 UTC Cara Grace Hopper
10 Apple 2021-08-19 11:42:51.874 UTC Melba Alan Turing

I would like to formulate a SQL query that will count the products ordered by customers in 1 minute intervals (or any interval really) to return a table that looks (something) like this:

Row Product Timestamp Count
1 Apple 2021-08-19 11:41:00.000 UTC 2
2 Orange 2021-08-19 11:41:00.000 UTC 1
3 Pear 2021-08-19 11:41:00.000 UTC 2
4 Pear 2021-08-19 11:42:00.000 UTC 1
5 Apple 2021-08-19 11:42:00.000 UTC 2
6 Orange 2021-08-19 11:42:00.000 UTC 2

Some notes:

The examples I found that were relevant (eg: https://dba.stackexchange.com/questions/179823/grouping-count-by-interval-of-15-minutes) tend to provide counts for all row entries, not aggregated by a column value on the increment. I know this might be possible using partition by or group by statements but I'm not entirely sure or how I would structure them together. If it's not possible that'd be great to know - my SQL skills are still pretty nascent.

Attempt: Following the framework of the link posted above which is somewhat similar to mine:

SELECT 
    DATE_ADD(MINUTE, (DATEDIFF(MINUTE, '20000101', timestamp) / 1)*1, '20000101'),
    count(*)
FROM 
    mycompany.engagement.product_orders
GROUP BY
    DATE_ADD(MINUTE, (DATEDIFF(MINUTE, '20000101', timestamp) / 1)*1, '20000101')

Returns:

Unrecognized name: MINUTE at [2:14]

like image 898
randomdatascientist Avatar asked Oct 28 '25 04:10

randomdatascientist


2 Answers

Consider below approach

select Product, 
  timestamp_trunc(Timestamp, minute) Timestamp,
  count(1) `Count`
from `mycompany.engagement.product_orders`
group by 1, 2        

if applied to sample data in your question - output is

enter image description here

like image 128
Mikhail Berlyant Avatar answered Oct 29 '25 18:10

Mikhail Berlyant


You want to use date_trunc():

SELECT DATE_TRUNC(timestamp, MINUTE) as tm,
       COUNT(*)
FROM  mycompany.engagement.product_orders
GROUP BY tm;
like image 26
Gordon Linoff Avatar answered Oct 29 '25 17:10

Gordon Linoff



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!