Loading data from a Polars DataFrame into BigQuery

Posted at 2025-06-05

1. Introduction

You can load data from a Polars DataFrame straight into BigQuery!
I was a bit shocked that I had not noticed this until now.

2. Problem

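Until now, after processing data in Polars, I loaded it into BigQuery by first writing it out as JSONL (or a similar format), uploading that file to GCS, and then running a BigQuery load job against the JSONL file. That is a bit tedious.
pandas has a way to do this directly, so I checked whether Polars does too, and it does!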

2. Code

This is all it takes. df is a Polars DataFrame.
When running from Colaboratory, authenticate first, then set destination to the three-part form PROJECT_ID.DATASET.TABLE_NAME.

#@title Run once first (authentication)
from google.colab import auth 
auth.authenticate_user()

#@title polars to BQ
from google.cloud import bigquery
import io

client = bigquery.Client()

# Write DataFrame to stream as parquet file; does not hit disk
with io.BytesIO() as stream:
    df.write_parquet(stream)
    stream.seek(0)
    parquet_options = bigquery.ParquetOptions()
    parquet_options.enable_list_inference = True
    job = client.load_table_from_file(
        stream,
        destination='PROJECT_ID.DATASET.TABLE_NAME',
        project='PROJECT_ID',
        job_config=bigquery.LoadJobConfig(
            source_format=bigquery.SourceFormat.PARQUET,
            parquet_options=parquet_options,
            write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
        ),
    )
job.result()  # Waits for the job to complete
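
The three-part destination mentioned above can be assembled from its pieces instead of being typed by hand. The following is a minimal sketch, not part of the BigQuery client library: full_table_id is a hypothetical helper, and "my-project", "my_dataset", and "my_table" are placeholder values.

```python
def full_table_id(project_id: str, dataset: str, table: str) -> str:
    """Join the three parts into PROJECT_ID.DATASET.TABLE_NAME.

    Hypothetical helper for illustration; it only rejects empty parts.
    """
    parts = {"project_id": project_id, "dataset": dataset, "table": table}
    for label, value in parts.items():
        if not value:
            raise ValueError(f"{label} must not be empty")
    return ".".join(parts.values())


# Placeholder values; replace them with your own project, dataset, and table.
destination = full_table_id("my-project", "my_dataset", "my_table")
print(destination)
```

The resulting string can be passed as the destination argument of client.load_table_from_file in the code above.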

3. Caveats

Japanese column names are not handled by this load, so rename the columns with rename beforehand.
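
One way to prepare that rename is to build a mapping for only the columns that need it. This is a rough sketch under stated assumptions: ascii_column_names is a hypothetical helper, the col_N fallback names are arbitrary, and the pattern encodes the classic BigQuery rule of letters, digits, and underscores, starting with a letter or underscore. In practice a hand-written mapping with meaningful English names is usually the better choice.

```python
import re

# Assumed rule: ASCII letters, digits, underscores; must not start with a digit.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def ascii_column_names(columns):
    """Return {old_name: new_name} for columns whose names are not ASCII-safe.

    Fallback names look like col_<position>; underscores are appended
    until the name no longer collides with an existing column.
    """
    taken = {name for name in columns if SAFE_NAME.fullmatch(name)}
    mapping = {}
    for position, name in enumerate(columns):
        if SAFE_NAME.fullmatch(name):
            continue  # already acceptable, leave it alone
        candidate = f"col_{position}"
        while candidate in taken:
            candidate += "_"
        taken.add(candidate)
        mapping[name] = candidate
    return mapping


print(ascii_column_names(["id", "名前", "価格"]))
```

With a Polars DataFrame, the mapping can be applied before the load with df = df.rename(ascii_column_names(df.columns)).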