
[Treasure Data][Python] Running a query saved on Treasure Data with TD Client

Posted at 2020-03-14

Background

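In Treasure Workflow on the GUI, you can write something like

+run_query:
  td_run>: sample_query

to run a query saved on Treasure Data by its query name.
The Python client, however, does not seem to offer exactly this feature (though I may have missed it).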

Conclusion

Use run_schedule.

import time
import tdclient

if __name__ == '__main__':
    query_name = 'sample_query'
    with tdclient.Client(apikey='hogehoge') as td:
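        # current Unix time in seconds
        unix_time = int(time.time())
        # time passed to run_schedule for this run: 10 seconds from now
        run_time = unix_time + 10

        # trigger a single run (num=1) of the saved query,
        # which in effect means "run now"
        res = td.run_schedule(name=query_name, time=run_time, num=1)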

Bonus: waiting for the run to finish / getting the results

import time
import tdclient

def main():
    query_name = 'sample_query'
    with tdclient.Client(apikey='hogehoge') as td:
        unix_time = int(time.time())
        run_time = unix_time + 10

        # set schedule query
        # run now!
        res = td.run_schedule(name=query_name, time=run_time, num=1)
        schedule_job = res[0]

        # get job_id from the ScheduledJob object
        # (public property instead of the private _job_id attribute)
        job_id = schedule_job.job_id

        # get Job object by job_id
        job = td.job(job_id=job_id)

        # wait until job finished
        job.wait()

        # get results one by one
        for row in job.result():
            print(repr(row))
    return 0


if __name__ == '__main__':
    main()
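
The same steps (trigger one run, wait, collect the rows) can be wrapped in a small helper. The following is only a minimal sketch: `run_saved_query` and its `delay_seconds` parameter are names made up for illustration, and the `client` argument is assumed to behave like `tdclient.Client` as used above, that is, to provide `run_schedule(name, time, num)` and `job(job_id)`, with the returned job exposing `job_id`, `wait()` and `result()`.

```python
import time


def run_saved_query(client, query_name, delay_seconds=10):
    """Trigger one run of a saved query and return its result rows as a list.

    `client` is assumed to be a tdclient.Client-like object (hypothetical
    helper, not part of the library itself).
    """
    # time passed to run_schedule: a few seconds from now, as in the article
    run_time = int(time.time()) + delay_seconds

    # ask for exactly one run of the saved query
    scheduled_jobs = client.run_schedule(name=query_name, time=run_time, num=1)
    if not scheduled_jobs:
        raise RuntimeError(f"no job was started for {query_name!r}")

    # look the job up by its id, block until it finishes, then collect all rows
    job = client.job(job_id=scheduled_jobs[0].job_id)
    job.wait()
    return list(job.result())
```

With a real client this would be called as `run_saved_query(td, 'sample_query')` inside the `with tdclient.Client(...) as td:` block. Collecting the rows into a list keeps the example simple; for large results, iterating over `job.result()` directly, as in the code above, is likely the better choice.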

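Addendum

Why did I want to run a query that lives on Treasure Data in the first place?
Because I wanted to reuse a Connector that had already been configured in the Treasure Data GUI.
Of course, I could have configured the export from Python instead, but I wanted to take the lazy route.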

References

Official documentation - Python Client
GitHub - Treasure Data API library for Python
