Loading data from a Polars DataFrame into BigQuery

1. Introduction

  • You can load data from a Polars DataFrame straight into BigQuery!
  • I was shocked that I had not noticed this until now.

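2. Problem

  • Until now, loading data into BigQuery after processing it with Polars meant writing it out as JSONL or similar, uploading that file to GCS, and then running a BigQuery load that points at the JSONL file, which is a bit tedious.
  • There is a way to do this from pandas, so I checked whether there is one for Polars as well, and there is.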
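
For contrast, the older three-step route described above might look roughly like the sketch below. It is not code from the original workflow: the function name load_via_gcs and all of its parameters are made up for illustration, and the clients are passed in (duck-typed) rather than created here. It assumes storage_client is a google.cloud.storage.Client, bq_client is a google.cloud.bigquery.Client, and job_config is a bigquery.LoadJobConfig set up for newline-delimited JSON.

```python
import os
import tempfile


def load_via_gcs(df, storage_client, bq_client, bucket_name, blob_name,
                 destination, job_config):
    """Older route: JSONL file -> GCS object -> BigQuery load job.

    All names here are hypothetical; the clients are supplied by the caller.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, "export.jsonl")
        # Step 1: dump the Polars DataFrame as newline-delimited JSON.
        df.write_ndjson(local_path)
        # Step 2: upload the file to Cloud Storage.
        blob = storage_client.bucket(bucket_name).blob(blob_name)
        blob.upload_from_filename(local_path)
    # Step 3: point a BigQuery load job at the uploaded object.
    uri = f"gs://{bucket_name}/{blob_name}"
    job = bq_client.load_table_from_uri(uri, destination, job_config=job_config)
    job.result()  # Waits for the job to complete
    return uri
```

A job_config for this route would typically be something like bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON, autodetect=True). The temporary file, the bucket, and the extra client are exactly the moving parts that make this route feel heavy.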

3. Code

  • This is all it takes. df is a Polars DataFrame.
  • When running from Colaboratory, authenticate first, then set destination to the three-part form PROJECT_ID.DATASET.TABLE_NAME.
#@title Run once at the start (authentication)
from google.colab import auth
auth.authenticate_user()

#@title polars to BQ
from google.cloud import bigquery
import io

client = bigquery.Client()

# Write DataFrame to stream as parquet file; does not hit disk
with io.BytesIO() as stream:
    df.write_parquet(stream)
    stream.seek(0)
    parquet_options = bigquery.ParquetOptions()
    parquet_options.enable_list_inference = True
    job = client.load_table_from_file(
        stream,
        destination='PROJECT_ID.DATASET.TABLE_NAME',
        project='PROJECT_ID',
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.PARQUET,
            parquet_options=parquet_options,
            write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        ),
    )
job.result()  # Waits for the job to complete
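
Once job.result() returns, it can be handy to check how many rows actually landed. The helper below is only a sketch: summarize_load is a hypothetical name, and it assumes job is the finished load job from the snippet above and client is the same bigquery.Client.

```python
def summarize_load(client, job, destination):
    """Report row counts after a finished load job (hypothetical helper).

    job.output_rows is the number of rows this load job wrote;
    table.num_rows is the total row count of the destination table.
    """
    table = client.get_table(destination)
    return {"rows_loaded": job.output_rows, "rows_in_table": table.num_rows}
```

Because the snippet uses WRITE_APPEND, rows_in_table can be larger than rows_loaded whenever the table already held data.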

4. Notes

  • Japanese column names are not supported, so rename the columns (for example with rename) beforehand.
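
One way to handle the renaming is sketched below. The helper build_rename_map and the col_N placeholder scheme are assumptions made for illustration, not something from the original code. It treats a name as safe only if it consists of ASCII letters, digits, and underscores and does not start with a digit, which is a conservative rule chosen here.

```python
import re

# Conservative rule (an assumption): ASCII letters, digits, underscores,
# not starting with a digit.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_rename_map(columns):
    """Map every non-ASCII-style column name to a placeholder like col_1."""
    taken = set(columns)
    mapping = {}
    for i, name in enumerate(columns):
        if _SAFE_NAME.fullmatch(name):
            continue  # already safe, leave as-is
        candidate = f"col_{i}"
        while candidate in taken:  # avoid clashing with an existing column
            candidate += "_"
        taken.add(candidate)
        mapping[name] = candidate
    return mapping


# Usage with a Polars DataFrame named df (as in the snippet above):
# df = df.rename(build_rename_map(df.columns))
```

In practice a hand-written mapping such as {"名前": "name"} passed to df.rename is usually more readable. The generated placeholders are mainly useful for quick experiments.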