LoginSignup
1
0

More than 5 years have passed since last update.

BigQuery 大量の時間データから利用時間を求めてみた

Last updated at Posted at 2017-01-18

とある興味からシステムの利用時間を調べたくなったので考えた

テストデータ(大量の記録された時間データがあると想定)

SELECT
 *
FROM
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -650 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -600 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -200 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -150 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -50  AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp())      AS time_at)
;

image

LAG関数を使って1つ前の時間を取得する

SELECT
 time_at,
 LAG(time_at, 1) OVER (ORDER BY time_at asc) as before_time_at
FROM
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -650 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -600 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -200 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -150 AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -50  AS time_at),
 (SELECT TIMESTAMP_TO_SEC(current_timestamp())      AS time_at)
;

image

1つ前の時間との差分を計算

SELECT
 time_at,
 before_time_at,
 (time_at - before_time_at) AS diff_sec
FROM
 (
 SELECT
  time_at,
  LAG(time_at, 1) OVER (ORDER BY time_at asc) as before_time_at
 FROM
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -650 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -600 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -200 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -150 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -50  AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp())      AS time_at)
 )
;

image

5分以上の未操作は除外してみる

SELECT
 time_at,
 before_time_at,
 IF((time_at - before_time_at) > (5 * 60), null, (time_at - before_time_at)) AS diff_sec
FROM
 (
 SELECT
  time_at,
  LAG(time_at, 1) OVER (ORDER BY time_at asc) as before_time_at
 FROM
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -650 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -600 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -200 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -150 AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -50  AS time_at),
  (SELECT TIMESTAMP_TO_SEC(current_timestamp())      AS time_at)
 )
;

image

後は集計

SELECT
 SUM(diff_sec)
FROM
 (
 SELECT
  time_at,
  before_time_at,
  IF((time_at - before_time_at) > (5 * 60), null, (time_at - before_time_at)) AS diff_sec
 FROM
  (
  SELECT
   time_at,
   LAG(time_at, 1) OVER (ORDER BY time_at asc) as before_time_at
  FROM
   (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -650 AS time_at),
   (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -600 AS time_at),
   (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -200 AS time_at),
   (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -150 AS time_at),
   (SELECT TIMESTAMP_TO_SEC(current_timestamp()) -50  AS time_at),
   (SELECT TIMESTAMP_TO_SEC(current_timestamp())      AS time_at)
  )
 )
;

image


おまけ

BigQueryで日付の計算する時は、unixtimeで計算した方が良いぞ。
理由は忘れたが、変な動作したはず。

1
0
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
1
0