Unnest GA4 user properties in BigQuery
User properties in Google Analytics 4 (GA4) is a kind of replacement of user scope custom dimensions in Google Analytics 3. Now with GA4 and BigQuery free import the sky is the limit when it comes to flexibility of data modelling. However user properties are nested in the BigQuery dataset. Let’s dig deeper into the GA4 data modelling and unnest them.
Here I just describe a relatively simple model that allows to create a dynamic report using Data Studio or any other BI tools. Let’s imagine we have an ecommerce website that sell clothes (cool ones, of course!). For each customer we send a user dimension with a favorite color. It is called user_favorite_color. Of course, we want to know who buy more: fans of black, white, orange, green or yellow.
The first step is to unnest the user property user_favorite_color. For each user_pseudo_id we get the value of the user_favorite_color property.
select
parse_date('%Y%m%d',event_date) as date,
user_pseudo_id,
user_properties.value.string_value as user_favorite_color
from
`your-project.analytics_xxxx.events_*`,
unnest(user_properties) as user_properties
where
user_properties.key = 'user_favorite_color'
and user_properties.value.string_value is not null
group by 1,2,3
The second step is to get events we are interested in for each user_pseudo_id. I don’t count events here, but set user_pseudo_id for each event. So I can count distinct users for a dynamic range automatically using a dashboard tool (Data Studio, Metabase, Tableau etc.)
select
parse_date('%Y%m%d',event_date) as date,
user_pseudo_id,
ecommerce.purchase_revenue as revenue,
case when event_name = 'begin_checkout' then user_pseudo_id end as begin_checkout,
case when event_name = 'purchase' then user_pseudo_id end as purchase
from
`your-project. analytics_xxxx.events_*`,
The third step is to put it all together into a query. Note: this query can be shorter and more efficient. We are sacrificing efficiency for clarity here.
create or replace table `ga4_models.user_colors` as (
with user_colors as (
select
parse_date('%Y%m%d',event_date) as date,
user_pseudo_id,
user_properties.value.string_value as user_favorite_color
from
`your-project. analytics_xxxx.events_*`,
unnest(user_properties) as user_properties
where
user_properties.key = 'user_favorite_color' and user_properties.value.string_value is not null
group by 1,2,3
),ecommerce_events as (
select
parse_date('%Y%m%d',event_date) as date,
user_pseudo_id,
ecommerce.purchase_revenue as revenue,
case when event_name = 'begin_checkout' then user_pseudo_id end as begin_checkout,
case when event_name = 'purchase' then user_pseudo_id end as purchase
from
`your-project. analytics_xxxx.events_*`
)select
a.date,
a.user_pseudo_id,
b.user_favorite_color,
a.begin_checkout,
a.purchase,
max(revenue) as revenue
from ecommerce_events a
left join user_colors b on a.date = b.date and a.user_pseudo_id = b.user_pseudo_id
group by 1,2,3,4,5
)
The fourth step is to connect the BigQuery table to a BI tool.
Here is a Data Studio example of a possible output (data is not real):
Customers who prefer the yellow color buy more. We should definitely sell more yellow T-Shirts!