LoginSignup
0
0

BQ Exportされたパラメータの値がどのフィールドに格納されているか調べる

Last updated at Posted at 2023-12-05

BigQueryにエクスポートされたGA4のパラメータの値は、その中身によって格納されるフィールドが変わるので、クエリを書くときにはパラメータがどのフィールドに格納されているかを把握していないと正しい結果を得られない場合がある。
このため、事前に以下のクエリを叩くことで、各パラメータがどのフィールドに格納されたかを調査できる。

パラメータが格納される先のバリエーション

  • イベントパラメータ
    • event_params.value.string_value
    • event_params.value.int_value
    • event_params.value.double_value
    • event_params.value.float_value
  • ユーザープロパティ
    • user_properties.value.string_value
    • user_properties.value.int_value
    • user_properties.value.double_value
    • user_properties.value.float_value

調査用のクエリ

PROJECT-NAME.analytics_123456789 と対象日付の部分を変更の上使用

WITH base AS (
  SELECT 
    event_params
    , user_properties
  FROM `PROJECT-NAME.analytics_123456789.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20231231'
)
, param_type_summary AS (
  SELECT 
    'event' AS scope
    , ep.key AS param_key
    , COALESCE(
      IF(ep.value.string_value IS NOT NULL, 'string', NULL)
      , IF(ep.value.int_value IS NOT NULL, 'int', NULL)
      , IF(ep.value.float_value IS NOT NULL, 'float', NULL)
      , IF(ep.value.double_value IS NOT NULL, 'double', NULL)
    ) AS param_type
    , COUNT(*) AS total_events
  FROM base, UNNEST(event_params) as ep
  GROUP BY 1, 2, 3
  
  UNION ALL
  
  SELECT 
    'user' AS scope
    , up.key AS param_key
    , COALESCE(
      IF(up.value.string_value IS NOT NULL, 'string', NULL)
      , IF(up.value.int_value IS NOT NULL, 'int', NULL)
      , IF(up.value.float_value IS NOT NULL, 'float', NULL)
      , IF(up.value.double_value IS NOT NULL, 'double', NULL)
    ) AS param_type
    , COUNT(*) AS total_events
  FROM base, UNNEST(user_properties) as up
  GROUP BY 1, 2, 3
)

SELECT
  scope
  , param_key
  , COUNT(*) AS type_variations
  , SUM(total_events) AS total_events
  , SUM(CASE WHEN param_type='string' THEN total_events ELSE 0 END) AS type_string_events
  , SUM(CASE WHEN param_type='int' THEN total_events ELSE 0 END) AS type_int_events
  , SUM(CASE WHEN param_type='float' THEN total_events ELSE 0 END) AS type_float_events
  , SUM(CASE WHEN param_type='double' THEN total_events ELSE 0 END) AS type_double_events
FROM param_type_summary
GROUP BY 1, 2
ORDER BY 1, 3 DESC, 4 DESC, 2;

実行結果の例
image.png

上記スクリーンショットの結果では以下のことが分かる。

  • パラメータpage_groupの値はすべてevent_params.value.string_valueに格納されていること
  • パラメータsearch_termの値はevent_params.value.string_value, event_params.value.int_valueの何れかに格納されていること

値が入るフィールドが固定されないパラメータの対策

値がどのフィールドに入るか予測できないパラメータはCOALESCEを使ってどのフィールドに入っても処理できるようにすると安心。

SELECT DISTINCT
  (SELECT COALESCE(value.string_value, SAFE_CAST(value.int_value AS string),SAFE_CAST(value.float_value AS string),SAFE_CAST(value.double_value AS string)) FROM UNNEST(event_params) WHERE key = 'search_term') AS ep_search_term
FROM              
    `PROJECT-NAME.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' and '20231231'
ORDER BY 1
WITH search_term_events AS (
  SELECT
    event_name
    , ep.value.string_value AS string_value
    , SAFE_CAST(ep.value.int_value AS STRING) AS int_value
    , SAFE_CAST(ep.value.float_value AS STRING) AS float_value
    , SAFE_CAST(ep.value.double_value AS STRING) AS double_value
    , ep.key
  FROM `PROJECT-NAME.analytics_123456789.events_*`,
  UNNEST(event_params) as ep
  WHERE 
    _TABLE_SUFFIX BETWEEN '20230101' AND '20231231'
    AND ep.key = 'search_term'
)

SELECT
  event_name
  , COALESCE(string_value, int_value, float_value, double_value) AS ep_search_term
  , COUNT(*)
FROM search_term_events
WHERE COALESCE(string_value, int_value, float_value, double_value) IS NOT NULL
GROUP BY 1, 2
ORDER BY 1, 3 DESC
0
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0