More than 5 years have passed since last update.

Trying Out the Treasure Data Toolbelt (Mac)

Setting up the td command

Check the Ruby version

Terminal
$ ruby --version
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-darwin16]

Install the td command (Ruby gem)

Terminal
$ gem install td

Confirm that the command is on the PATH

Terminal
$ which td
/Users/hoge/.rbenv/shims/td

Check the version

Terminal
$ td --version
0.15.8
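
The checks above can be bundled into one small script. This is only a minimal sketch for a POSIX shell; `check_cmd` is a hypothetical helper name, not something the Toolbelt provides.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper (not part of the td gem).
# It prints where a command lives, or reports that it is missing.
check_cmd() {
  if location=$(command -v "$1" 2>/dev/null); then
    echo "$1: found at $location"
  else
    echo "$1: not found"
    return 1
  fi
}

check_cmd ruby || echo "Install Ruby first (for example via rbenv)."
check_cmd gem  || echo "RubyGems is needed for 'gem install td'."
check_cmd td   || echo "Run 'gem install td' to install the Toolbelt."
```

If all three lines report "found at", the environment matches the steps in this section.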

Upgrade

Terminal
$ gem update td

Account setup (Google SSO users)

Terminal
$ td apikey:set <your_apikey>

Verify the account settings

Terminal
$ less ~/.td/td.conf 
[account]
  apikey = **********************************************
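
Opening the file with `less` shows the key in full. When all you need is to confirm which key is configured, a masked view may be enough. The sketch below assumes the two-line `[account]` / `apikey = ...` layout shown above; `mask_apikey` is a hypothetical helper, not a td subcommand.

```shell
#!/bin/sh
# mask_apikey is a hypothetical helper: it reads a td.conf-style file and
# prints the apikey with all but the last 4 characters replaced by '*'.
mask_apikey() {
  awk -F'=' '
    /^[ \t]*apikey[ \t]*=/ {
      key = $2
      gsub(/^[ \t]+|[ \t]+$/, "", key)      # trim surrounding whitespace
      n = length(key)
      masked = ""
      for (i = 1; i <= n - 4; i++) masked = masked "*"
      print masked substr(key, (n > 4 ? n - 3 : 1))
      exit                                   # only the first apikey line
    }
  ' "$1"
}

conf="$HOME/.td/td.conf"
if [ -f "$conf" ]; then
  mask_apikey "$conf"
fi
```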

Command help

The td command

Terminal
$ td
usage: td [options] COMMAND [args]

options:
  -c, --config PATH                path to the configuration file (default: ~/.td/td.conf)
  -k, --apikey KEY                 use this API key instead of reading the config file
  -e, --endpoint API_SERVER        specify the URL for API server to use (default: https://api.treasuredata.com).
                                     The URL must contain a scheme (http:// or https:// prefix) to be valid.
                                     Valid IPv4 addresses are accepted as well in place of the host name.
      --insecure                   Insecure access: disable SSL (enabled by default)
  -v, --verbose                    verbose mode
  -h, --help                       show help
  -r, --retry-post-requests        retry on failed post requests.
                                   Warning: can cause resource duplication, such as duplicated job submissions.
      --version                    show version

Basic commands:

  db             # create/delete/list databases
  table          # create/delete/list/import/export/tail tables
  query          # issue a query
  job            # show/kill/list jobs
  import         # manage bulk import sessions (Java based fast processing)
  bulk_import    # manage bulk import sessions (Old Ruby-based implementation)
  result         # create/delete/list result URLs
  sched          # create/delete/list schedules that run a query periodically
  schema         # create/delete/modify schemas of tables
  connector      # manage connectors
  workflow       # manage workflows

Additional commands:

  status         # show scheds, jobs, tables and results
  apikey         # show/set API key
  server         # show status of the Treasure Data server
  sample         # create a sample log file
  help           # show help messages

The td query command

Terminal
$ td query --help
usage:
  $ td query [sql]

example:
  $ td query -d example_db -w -r rset1 "select count(*) from table1"
  $ td query -d example_db -w -r rset1 -q query.txt

description:
  Issue a query

options:
  -d, --database DB_NAME           use the database (required)
  -w, --wait[=SECONDS]             wait for finishing the job (for seconds)
  -G, --vertical                   use vertical table to show results
  -o, --output PATH                write result to the file
  -f, --format FORMAT              format of the result to write to the file (tsv, csv, json, msgpack, and msgpack.gz)
  -r, --result RESULT_URL          write result to the URL (see also result:create subcommand)
                                    It is suggested for this option to be used with the -x / --exclude option to suppress printing
                                    of the query result to stdout or -o / --output to dump the query result into a file.
  -u, --user NAME                  set user name for the result URL
  -p, --password                   ask password for the result URL
  -P, --priority PRIORITY          set priority
  -R, --retry COUNT                automatic retrying count
  -q, --query PATH                 use file instead of inline query
  -T, --type TYPE                  set query type (hive, presto)
      --sampling DENOMINATOR       OBSOLETE - enable random sampling to reduce records 1/DENOMINATOR
  -l, --limit ROWS                 limit the number of result rows shown when not outputting to file
  -c, --column-header              output of the columns' header when the schema is available for the table (only applies to json, tsv and csv formats)
  -x, --exclude                    do not automatically retrieve the job result
  -O, --pool-name NAME             specify resource pool by name
      --domain-key DOMAIN_KEY      optional user-provided unique ID. You can include this ID with your `create` request to ensure idempotence

Run a query from the command line

Run a SELECT against the huga table in the hoge database.

Terminal
$ td query -d hoge -T presto "SELECT * FROM hoge.huga LIMIT 10"

Job 23437**** is queued.
Use 'td job:show 23437****' to show the status.
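
Since the job ID appears in the "is queued" line, a script can pick it out and pass it to `td job:show`. The following is a rough sketch; `extract_job_id` is a hypothetical helper, and it assumes the message is written to standard output in exactly the form shown above.

```shell
#!/bin/sh
# extract_job_id is a hypothetical helper: it reads td output on stdin and
# prints the numeric ID from a line of the form "Job <id> is queued."
extract_job_id() {
  sed -n 's/^Job \([0-9][0-9]*\) is queued\.$/\1/p'
}

# Demonstration with a made-up job ID (not real td output):
printf 'Job 12345 is queued.\n' | extract_job_id

# With td installed and configured, the helper could be used like this:
#   job_id=$(td query -d hoge -T presto "SELECT * FROM hoge.huga LIMIT 10" | extract_job_id)
#   td job:show "$job_id"
```

Alternatively, the `-w` option listed in the help output makes `td query` wait for the job to finish, which avoids the second command altogether.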

Run a query with a specific API key

Run a SELECT against the huga table in the hoge database, reading the SQL from a file.

Terminal
$ td -k ********************** query -w -T hive -d hoge -q hoge.huga.sql

hoge.huga.sql
SELECT time FROM hoge.huga LIMIT 10;

Note: replace ********************** with the API key you want to use.
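
Typing the key directly on the command line leaves it in the shell history, so one option is to read it from an environment variable. This is a minimal sketch, not an official pattern: `run_huga_query` and `MY_TD_APIKEY` are names made up for this example, and the engine is selected with `-T`, the form listed in the `td query --help` output.

```shell
#!/bin/sh
# run_huga_query is a hypothetical wrapper. MY_TD_APIKEY is an assumed
# variable name chosen for this sketch; td only sees the value given to -k.
run_huga_query() {
  if [ -z "${MY_TD_APIKEY:-}" ]; then
    echo "MY_TD_APIKEY is not set" >&2
    return 1
  fi
  # -w waits for the job, -T selects the engine, -q reads the SQL from a file
  td -k "$MY_TD_APIKEY" query -w -T hive -d hoge -q hoge.huga.sql
}

# Create the query file that -q points at.
printf 'SELECT time FROM hoge.huga LIMIT 10;\n' > hoge.huga.sql
```

After exporting `MY_TD_APIKEY` in the current shell, calling `run_huga_query` runs the same command as above without the key appearing in the command text.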
