The old BigQuery Export schema wise script is running.It is given below. But I want to replicate this code and write it according to new export schema as we Bigquery schema has been changed. Please help becasue in new BigQuery Export schema I don't find any other corresponding record against event_dim (event_dim is in according to old BigQuery Export schema).
Here is link for BigQuery Export schema: click here
SELECT user_dim.app_info.app_instance_id
, (SELECT MIN(timestamp_micros) FROM UNNEST(event_dim)) min_time
, (SELECT MAX(timestamp_micros) FROM UNNEST(event_dim)) max_time,
event.name,
params.value.int_value engagement_time
FROM `xxx.app_events_*`,
UNNEST(event_dim) as event,
UNNEST(event.params) as params,
UNNEST(user_dim.user_properties) as user_params
where (event.name = "user_engagement" and params.key = "engagement_time_msec")
and
(user_params.key = "access" and user_params.value.value.string_value = "true") and
PARSE_DATE('%Y%m%d', event.date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
PARSE_DATE('%Y%m%d', event.date) <= "{{upto_date (yyyy-mm-dd)}}"
Tried the query below but what I want app_instance, min_time, max_time, event_name, engagement_time at one SELECT statement. And as I am using 'group by', I am not able to get all those (app_instance, min_time, max_time, event_name, engagement_time) at a time. Please help.
SELECT user_pseudo_id
, MIN(event_timestamp) AS min_time
,MAX(event_timestamp) AS max_time
FROM `xxx.app_events_*` as T,
T.event_params,
T.user_properties,
T.event_timestamp
where (event_name = "user_engagement" and event_params.key = "engagement_time_msec")
and
(user_properties.key = "access" and user_properties.value.string_value = "true") and
PARSE_DATE('%Y%m%d', event_date) >= date_sub("{{upto_date (yyyy-mm-dd)}}", interval {{last n days}} day) and
PARSE_DATE('%Y%m%d', event_date) <= "{{upto_date (yyyy-mm-dd)}}"
group by 1
As I believe my previous answer provides some general ideas for the Community, I will keep it and write a new one in order to be more specific for your use case.
First of all, I would like to clarify that in order to adapt a query (just like you are asking us to do), one needs to have a clear understanding of the statement, objective of the query, expected results and data to play with. As this is not the case, it is difficult to work with it, even more considering that there are some functionalities that are not clear from the query, for example: in order to obtain the "min_time" and "max_time" for each event, you are taking the min and max value across multiple events, which does not make clear sense to me (it may, depending on your use case, reason why I suggested that it would be better if you could provide more details or work more on the query yourself). Moreover, the new schema "flattens" events, in such a way that each event is written in a different line (you can easily check this by running a SELECT COUNT(*) FROM 'table_with_old_schema'
and compare it to SELECT COUNT(*) FROM 'table_with_new_schema'
; you will see that the second one has many more rows), so your query does not make sense anymore, because events are not grouped anymore, and then you cannot pick a minimum and maximum between nested fields.
This being clarified, and having removed some fields that cannot be directly adapted to the new schema (you may be able to adapt this from your side, but this would require some additional effort and understanding of what did those fields mean to you in your previous query), here there are two queries that provide exactly the same results, when run against the same table, with different schema:
Query against a table with the old schema:
SELECT
user_dim.app_info.app_instance_id,
event.name,
params.value.int_value engagement_time
FROM
`DATASET.app_events_YYYYMMDD`,
UNNEST(event_dim) AS event,
UNNEST(event.params) AS params,
UNNEST(user_dim.user_properties) AS user_params
WHERE
(event.name = "user_engagement"
AND params.key = "engagement_time_msec")
AND (user_params.key = "plays_quickplay"
AND user_params.value.value.string_value = "true")
ORDER BY 1, 2, 3
Query against the same table, with the new schema:
SELECT
user_pseudo_id,
event_name,
params.value.int_value engagement_time
FROM
`DATASET.events_YYYYMMDD`,
UNNEST(event_params) AS params,
UNNEST(user_properties) AS user_params
WHERE
(event_name = "user_engagement"
AND params.key = "engagement_time_msec")
AND (user_params.key = "plays_quickplay"
AND user_params.value.string_value = "true")
ORDER BY 1, 2, 3
Again, for this I am using the following table from the public dataset: firebase-public-project.com_firebase_demo_ANDROID.app_events_YYYYMMDD
, so I had to change some filters and remove some others in order for it to retrieve sensible results against that table. Therefore, feel free to modify or add the ones you need in order for it to be useful for your use case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With