dbtプロジェクトの作成から、データウェアハウスにテーブルを作成するまでの手順をまとめました。
dbtプロジェクトを作成するにはinitコマンドを使います。
実行後はデータウェアハウスの接続情報を聞かれるので準備しておくと良いでしょう。
Dbt# dbt init
データウェアハウスの接続情報はdbt/profiles.ymlに記述されます。
.dbt/profiles.yml
Dbt/snowflake# cat ~/.dbt/profiles.yml
snowflake:
outputs:
dev:
account: '123'
database: database
password: password
role: dev
schema: schema
threads: 1
type: snowflake
user: user
warehouse: warehouse
target: dev
snowflake dbtプロジェクトのディレクトリ構造です。
Dbt/snowflake# tree ./
./
|-- README.md
|-- analyses
|-- dbt_project.yml
|-- macros
|-- models
| `-- example
| |-- my_first_dbt_model.sql
| |-- my_second_dbt_model.sql
| `-- schema.yml
|-- seeds
|-- snapshots
`-- tests
7 directories, 5 files
ここで重要なファイルはdbt_project.ymlです。
dbtプロジェクトの設定が書かれています。
dbt_project.yml
# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'snowflake_test'
version: '1.0.0'
config-version: 2
# This setting configures which "profile" dbt uses for this project.
profile: 'snowflake'
# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]
clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"
# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models
# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
# この記述のためexample/はデフォルトviewで作成される
models:
snowflake:
# Config indicated by + and applies to all files under models/example/
example:
+materialized: view
データウェアハウスにテーブルを作成するにはschema.ymlとSQLファイルを用意します。
デフォルトのschama.ymlをSQLファイルを見てみましょう。
my_first_dbt_model.sql
/*
Welcome to your first dbt model!
Did you know that you can also configure models directly within SQL files?
This will override configurations stated in dbt_project.yml
Try changing "table" to "view" below
*/
-- dbt runが実行されたときに物理テーブルとしてデータベースが作成される
{{ config(materialized='table') }}
with source_data as (
select 1 as id
union all
select null as id
)
-- テーブル名はSQLファイル名
select *
from source_data
/*
Uncomment the line below to remove records with null `id` values
*/
-- where id is not null
my_second_dbt_model.sql
-- Use the `ref` function to select from other models
select *
from {{ ref('my_first_dbt_model') }}
where id = 1
schema.yml
version: 2
models:
- name: my_first_dbt_model
description: "A starter dbt model"
columns:
- name: id
description: "The primary key for this table"
tests:
- unique
- not_null
- name: my_second_dbt_model
description: "A starter dbt model"
columns:
- name: id
description: "The primary key for this table"
tests:
- unique
- not_null
schema.ymlの情報からテーブルを作成し、SQLファイルを元にデータを入れるのが分かりますね。
テーブルを作成するにはdbt runを実行します。
Dbt/snowflake# dbt run
10:37:07 Running with dbt=1.7.9
10:37:08 Registered adapter: snowflake=1.7.2
10:37:08 Unable to do partial parsing because a project config has changed
10:37:10 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.snowflake.example
10:37:10 Found 2 models, 4 tests, 0 sources, 0 exposures, 0 metrics, 430 macros, 0 groups, 0 semantic models
10:37:10
10:37:13 Concurrency: 1 threads (target='dev')
10:37:13
10:37:13 1 of 2 START sql table model SNOWFLAKE_TEST.my_first_dbt_model ................. [RUN]
10:37:15 1 of 2 OK created sql table model SNOWFLAKE_TEST.my_first_dbt_model ............ [SUCCESS 1 in 2.26s]
10:37:15 2 of 2 START sql view model SNOWFLAKE_TEST.my_second_dbt_model ................. [RUN]
10:37:16 2 of 2 OK created sql view model SNOWFLAKE_TEST.my_second_dbt_model ............ [SUCCESS 1 in 1.39s]
10:37:16
10:37:16 Finished running 1 table model, 1 view model in 0 hours 0 minutes and 6.38 seconds (6.38s).
10:37:16
10:37:16 Completed successfully
10:37:16
10:37:16 Done. PASS=2 WARN=0 ERROR=0 SKIP=0 TOTAL=2
Snowflakeを確認するとテーブルが作成されていますね。
以上でdbtを使ってテーブルを作成できました。