0
1

More than 1 year has passed since last update.

PipeRiderはdbtの相棒になる

Last updated at Posted at 2022-07-11

本文記載時点はPipeRider 0.4.1を参考にしています。

はじめに

PipeRiderはdbtプロジェクトのトランスフォーメーションされたデータモデルをプロファイリングし、dbtのアサーションとPipeRiderのアサーション実行し、結果によってリポートを生成する

リクワイアメント

範例SQLite Databaseの用意する

curl -o sp500.db https://piperider-data.s3.ap-northeast-1.amazonaws.com/getting-started/sp500_20220401.db

SQLite adapter for dbt

pip install dbt-sqlite

dbt-sqlite用のcrypo library fileを用意する
crypo library fileをダウンロードする

Mac Intelの場合はcrypto.dylib
Mac AppleSiliconの場合はcrypto.arm64.dylib

dbtプロジェクトを初期化する

dbt init
Running with dbt=1.1.1
Enter a name for your project (letters, digits, underscore): sp500_index
Which database would you like to use?
[1] sqlite

(Don't see the one you want? https://docs.getdbt.com/docs/available-adapters)

Enter a number: 1

~/.dbt/profiles.ymlを編集する

extentionsのパス
crypto.arm64.dylib:point_right_tone2: crypto.arm64を入力してください

sp500_index:
  outputs:

    dev:
      type: sqlite
      threads: 1
      database: sp500
      schema: 'main'
      schemas_and_paths:
        main: '/example/path/to/qiita/sp500.db'
      schema_directory: '/example/path/to/qiita/'
      extensions:
        - '/example/path/to/qiita/crypto.arm64'

  target: dev

dbtプロジェクトの設定を確認する

cd  sp500_index
dbt debug --profile sp500_index
Running with dbt=1.1.1
dbt version: 1.1.1
python version: 3.9.12
python path: /example/path/to/qiita/pyenv/bin/python3.9
os info: macOS-12.3-arm64-arm-64bit
Using profiles.yml file at /example/path/to/.dbt/profiles.yml
Using dbt_project.yml file at /example/path/to/qiita/sp500_index/dbt_project.yml

Configuration:
  profiles.yml file [OK found and valid]
  dbt_project.yml file [OK found and valid]

Required dependencies:
 - git [OK found]

Connection:
  database: sp500
  schema: main
  schemas_and_paths: {'main': '/example/path/to/qiita/sp500.db'}
  schema_directory: /example/path/to/qiita/
  Connection test: [OK connection ok]

All checks passed!

dbtでトランスフォーメーション

データモデルにデータをトランスフォーメーションする
第一モデルmodels/example/my_first_dbt_model.sqlを編集する

with source_data as (

    SELECT * FROM PRICE
    LEFT JOIN SYMBOL ON SYMBOL.SYMBOL = PRICE.SYMBOL
    WHERE PRICE.DATE = '2022-01-04' 
)

SELECT *
FROM source_data

第一モデルにより、第二モデルmodels/example/my_second_dbt_model.sqlを編集する

SELECT SYMBOL, DATE, OPEN, HIGH, LOW, CLOSE, VOLUME, ADJCLOSE, MA5, MA20,MA60, NAME, COUNTRY, EXCHANGE_CODE, SECTOR, INDUSTRY
FROM {{ ref('my_first_dbt_model') }}
WHERE COUNTRY = 'United States'

データ審査のルールmodels/example/schema.ymlを編集する

version: 2

models:
  - name: my_first_dbt_model
    description: "Record Date on 2022-01-04"
    columns:
      - name: SYMBOL 
        description: "Stock symbol"
        tests:
          - unique
          - not_null

  - name: my_second_dbt_model
    description: "Stocks in the U.S."
    columns:
      - name: OPEN
        description: "Opening price"
        tests:
          - not_null

データモデルを生成する

dbt build

Output 範例

14:45:58  Running with dbt=1.1.1
14:45:58  Found 2 models, 3 tests, 0 snapshots, 0 analyses, 173 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
14:45:58  
14:45:58  Concurrency: 1 threads (target='dev')
14:45:58  
14:45:58  1 of 5 START view model main.my_first_dbt_model ................................ [RUN]
14:45:58  1 of 5 OK created view model main.my_first_dbt_model ........................... [OK in 0.07s]
14:45:58  2 of 5 START test not_null_my_first_dbt_model_SYMBOL ........................... [RUN]
14:45:58  2 of 5 PASS not_null_my_first_dbt_model_SYMBOL ................................. [PASS in 0.07s]
14:45:58  3 of 5 START test unique_my_first_dbt_model_SYMBOL ............................. [RUN]
14:45:58  3 of 5 PASS unique_my_first_dbt_model_SYMBOL ................................... [PASS in 0.04s]
14:45:58  4 of 5 START view model main.my_second_dbt_model ............................... [RUN]
14:45:58  4 of 5 OK created view model main.my_second_dbt_model .......................... [OK in 0.02s]
14:45:58  5 of 5 START test not_null_my_second_dbt_model_OPEN ............................ [RUN]
14:45:58  5 of 5 PASS not_null_my_second_dbt_model_OPEN .................................. [PASS in 0.04s]
14:45:58  
14:45:58  Finished running 2 view models, 3 tests in 0.40s.
14:45:58  
14:45:58  Completed successfully
14:45:58  
14:45:58  Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5

データモデルをテストする

dbt test

Output 範例

14:46:53  Running with dbt=1.1.1
14:46:53  Found 2 models, 3 tests, 0 snapshots, 0 analyses, 173 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
14:46:53  
14:46:53  Concurrency: 1 threads (target='dev')
14:46:53  
14:46:53  1 of 3 START test not_null_my_first_dbt_model_SYMBOL ........................... [RUN]
14:46:53  1 of 3 PASS not_null_my_first_dbt_model_SYMBOL ................................. [PASS in 0.07s]
14:46:53  2 of 3 START test not_null_my_second_dbt_model_OPEN ............................ [RUN]
14:46:53  2 of 3 PASS not_null_my_second_dbt_model_OPEN .................................. [PASS in 0.07s]
14:46:53  3 of 3 START test unique_my_first_dbt_model_SYMBOL ............................. [RUN]
14:46:53  3 of 3 PASS unique_my_first_dbt_model_SYMBOL ................................... [PASS in 0.04s]
14:46:53  
14:46:53  Finished running 3 tests in 0.28s.
14:46:53  
14:46:53  Completed successfully
14:46:53  
14:46:53  Done. PASS=3 WARN=0 ERROR=0 SKIP=0 TOTAL=3

PipeRiderの参入

pip install piperider

PipeRiderは自動的にdbtのdbt_project.ymlを読み込む

piperider init
Initialize piperider to path /example/path/to/qiita/sp500_index/.piperider
Start to search dbt project ...
Use dbt project file: /example/path/to/qiita/sp500_index/dbt_project.yml
─────────────────────────────────────────────────────────────────────────────────────────── .piperider/config.yml ────────────────────────────────────────────────────────────────────────────────────────────
   1 dataSources:
   2 - name: sp500_index
   3   type: sqlite
   4   dbt:
   5     profile: sp500_index
   6     target: dev
   7     projectDir: .
   8 telemetry:
   9   id: 181f0172e54546c99ea971d0a7b913e2

Next step:
  Please execute command 'piperider diagnose' to verify configuration

PipeRiderの設定を確認する

piperider diagnose
Diagnosing...
PipeRider Version: 0.4.1
Check config files:
  /example/path/to/qiita/sp500_index/.piperider/config.yml: [OK]
✅ PASS

Check format of data sources:
  sp500_index: [OK]
✅ PASS

Check connections:
  DBT: sqlite > sp500_index > dev
  Name: sp500_index
  Type: sqlite
  Available Tables: ['ACTION', 'PRICE', 'SYMBOL']
  Connection: [OK]
✅ PASS

Check assertion files:
  /example/path/to/qiita/sp500_index/.piperider/assertions/my_second_dbt_model.yml: [OK]
  /example/path/to/qiita/sp500_index/.piperider/assertions/my_first_dbt_model.yml: [OK]
✅ PASS

🎉 You are all set!


Next step:
  Please execute command 'piperider run' to generate your second report

dbtの生成されたデータモデルをプロファイングし、、dbtのアサーションとPipeRiderのアサーション実行し、結果によってリポートを生成す

piperider run --dbt-test

Output 範例

DataSource: sp500_index
──────────────────────────────────────────────────────────────────────────────────────────────────── Validating ────────────────────────────────────────────────────────────────────────────────────────────────────
everything is OK.
─────────────────────────────────────────────────────────────────────────────────────────────────── Running dbt ────────────────────────────────────────────────────────────────────────────────────────────────────
dbt working dir: .
──────────────────────────────────────────────────────────────────────────────────────────────────── Profiling ─────────────────────────────────────────────────────────────────────────────────────────────────────
fetching metadata for table 'my_first_dbt_model'
fetching metadata for table 'my_second_dbt_model'
profiling [my_first_dbt_model.SYMBOL] type=VARCHAR(16777216)
profiling [my_first_dbt_model.DATE] type=DATE
profiling [my_first_dbt_model.OPEN] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.HIGH] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.LOW] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.CLOSE] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.VOLUME] type=NUMERIC(38, 0)
profiling [my_first_dbt_model.ADJCLOSE] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.MA5] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.MA20] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.MA60] type=NUMERIC(10, 2)
profiling [my_first_dbt_model.SYMBOL:1] type=VARCHAR(16777216)
profiling [my_first_dbt_model.NAME] type=VARCHAR(16777216)
profiling [my_first_dbt_model.START_DATE] type=DATE
profiling [my_first_dbt_model.END_DATE] type=DATE
profiling [my_first_dbt_model.DESCRIPTION] type=VARCHAR(16777216)
profiling [my_first_dbt_model.EXCHANGE_CODE] type=VARCHAR(16777216)
profiling [my_first_dbt_model.MARKET] type=VARCHAR(16777216)
profiling [my_first_dbt_model.COUNTRY] type=VARCHAR(16777216)
profiling [my_first_dbt_model.SECTOR] type=VARCHAR(16777216)
profiling [my_first_dbt_model.INDUSTRY] type=VARCHAR(16777216)
profiling [my_first_dbt_model.RECOMMENDATION_KEY] type=VARCHAR(16777216)
profiling [my_second_dbt_model.SYMBOL] type=VARCHAR(16777216)
profiling [my_second_dbt_model.DATE] type=DATE
profiling [my_second_dbt_model.OPEN] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.HIGH] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.LOW] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.CLOSE] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.VOLUME] type=NUMERIC(38, 0)
profiling [my_second_dbt_model.ADJCLOSE] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.MA5] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.MA20] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.MA60] type=NUMERIC(10, 2)
profiling [my_second_dbt_model.NAME] type=VARCHAR(16777216)
profiling [my_second_dbt_model.COUNTRY] type=VARCHAR(16777216)
profiling [my_second_dbt_model.EXCHANGE_CODE] type=VARCHAR(16777216)
profiling [my_second_dbt_model.SECTOR] type=VARCHAR(16777216)
profiling [my_second_dbt_model.INDUSTRY] type=VARCHAR(16777216)

PipeRiderから自動生成のアサーションを作成するか?

noを入力してください、もし試したいからyesを入力してください

No assertion found
Do you want to auto generate recommended assertions for this datasource [Yes/no]? no

Output 範例

────────────────────────────────────────────────────────────────────────────────────────── Generating Assertion Templates ──────────────────────────────────────────────────────────────────────────────────────────
Template Assertion: /example/path/to/qiita/sp500_index/.piperider/assertions/my_second_dbt_model.yml
Template Assertion: /example/path/to/qiita/sp500_index/.piperider/assertions/my_first_dbt_model.yml
──────────────────────────────────────────────────────────────────────────────────────────────── Assertion Results ─────────────────────────────────────────────────────────────────────────────────────────────────
─────────────────────────────────────────────────────────────────────────────────────────────────────── dbt ────────────────────────────────────────────────────────────────────────────────────────────────────────
[  OK  ] my_first_dbt_model.SYMBOL  test.sp500_index.not_null_my_first_dbt_model_SYMBOL.06ca293b81  Message: None
[  OK  ] my_first_dbt_model.SYMBOL  test.sp500_index.unique_my_first_dbt_model_SYMBOL.0463684cd6    Message: None
[  OK  ] my_second_dbt_model.OPEN   test.sp500_index.not_null_my_second_dbt_model_OPEN.fa199efe22   Message: None
───────────────────────────────────────────────────────────────────────────────────────────────────── Summary ──────────────────────────────────────────────────────────────────────────────────────────────────────
Table 'my_second_dbt_model'
  16 columns profiled
  1 dbt test executed

Table 'my_first_dbt_model'
  22 columns profiled
  2 dbt test executed

Generating reports from: /example/path/to/qiita/sp500_index/.piperider/outputs/latest/run.json
Report generated in /example/path/to/qiita/sp500_index/.piperider/outputs/latest/index.html

Next step:
  Please execute command 'piperider run' to generate your second report

Output 範例 (ブラウザで/example/path/to/qiita/sp500_index/.piperider/outputs/latest/index.htmlリポートを見る)

Recording #3.gif

以上です

他の参考

0
1
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
1