BigQueryにエクスポートされたGA4のパラメータの値は、その中身によって格納されるフィールドが変わるので、クエリを書くときにはパラメータがどのフィールドに格納されているかを把握していないと正しい結果を得られない場合がある。
このため、事前に以下のクエリを叩くことで、各パラメータがどのフィールドに格納されたかを調査できる。
パラメータが格納される先のバリエーション
- イベントパラメータ
- event_params.value.string_value
- event_params.value.int_value
- event_params.value.double_value
- event_params.value.float_value
- ユーザープロパティ
- user_properties.value.string_value
- user_properties.value.int_value
- user_properties.value.double_value
- user_properties.value.float_value
調査用のクエリ
※ PROJECT-NAME.analytics_123456789
と対象日付の部分を変更の上使用
WITH base AS (
SELECT
event_params
, user_properties
FROM `PROJECT-NAME.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20231231'
)
, param_type_summary AS (
SELECT
'event' AS scope
, ep.key AS param_key
, COALESCE(
IF(ep.value.string_value IS NOT NULL, 'string', NULL)
, IF(ep.value.int_value IS NOT NULL, 'int', NULL)
, IF(ep.value.float_value IS NOT NULL, 'float', NULL)
, IF(ep.value.double_value IS NOT NULL, 'double', NULL)
) AS param_type
, COUNT(*) AS total_events
FROM base, UNNEST(event_params) as ep
GROUP BY 1, 2, 3
UNION ALL
SELECT
'user' AS scope
, up.key AS param_key
, COALESCE(
IF(up.value.string_value IS NOT NULL, 'string', NULL)
, IF(up.value.int_value IS NOT NULL, 'int', NULL)
, IF(up.value.float_value IS NOT NULL, 'float', NULL)
, IF(up.value.double_value IS NOT NULL, 'double', NULL)
) AS param_type
, COUNT(*) AS total_events
FROM base, UNNEST(user_properties) as up
GROUP BY 1, 2, 3
)
SELECT
scope
, param_key
, COUNT(*) AS type_variations
, SUM(total_events) AS total_events
, SUM(CASE WHEN param_type='string' THEN total_events ELSE 0 END) AS type_string_events
, SUM(CASE WHEN param_type='int' THEN total_events ELSE 0 END) AS type_int_events
, SUM(CASE WHEN param_type='float' THEN total_events ELSE 0 END) AS type_float_events
, SUM(CASE WHEN param_type='double' THEN total_events ELSE 0 END) AS type_double_events
FROM param_type_summary
GROUP BY 1, 2
ORDER BY 1, 3 DESC, 4 DESC, 2;
上記スクリーンショットの結果では以下のことが分かる。
- パラメータ
page_group
の値はすべてevent_params.value.string_value
に格納されていること - パラメータ
search_term
の値はevent_params.value.string_value
,event_params.value.int_value
の何れかに格納されていること
値が入るフィールドが固定されないパラメータの対策
値がどのフィールドに入るか予測できないパラメータはCOALESCEを使ってどのフィールドに入っても処理できるようにすると安心。
SELECT DISTINCT
(SELECT COALESCE(value.string_value, SAFE_CAST(value.int_value AS string),SAFE_CAST(value.float_value AS string),SAFE_CAST(value.double_value AS string)) FROM UNNEST(event_params) WHERE key = 'search_term') AS ep_search_term
FROM
`PROJECT-NAME.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' and '20231231'
ORDER BY 1
WITH search_term_events AS (
SELECT
event_name
, ep.value.string_value AS string_value
, SAFE_CAST(ep.value.int_value AS STRING) AS int_value
, SAFE_CAST(ep.value.float_value AS STRING) AS float_value
, SAFE_CAST(ep.value.double_value AS STRING) AS double_value
, ep.key
FROM `PROJECT-NAME.analytics_123456789.events_*`,
UNNEST(event_params) as ep
WHERE
_TABLE_SUFFIX BETWEEN '20230101' AND '20231231'
AND ep.key = 'search_term'
)
SELECT
event_name
, COALESCE(string_value, int_value, float_value, double_value) AS ep_search_term
, COUNT(*)
FROM search_term_events
WHERE COALESCE(string_value, int_value, float_value, double_value) IS NOT NULL
GROUP BY 1, 2
ORDER BY 1, 3 DESC