How to Handle Kaggle's Large Datasets on the Free Version of Google Colab

Goal

  • Work with the 60 GB of data from a Kaggle competition on the free version of Google Colab

Reference

  • How to access Kaggle datasets directly from Google Colab
https://slash-z.com/google-colab-mount-kaggle-competition-dataset/#toc2

Problems encountered along the way

① KaggleDatasets().get_gcs_path() timed out
→ Running it a second time succeeded.
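
Since the timeout in ① went away on a plain re-run, a small retry wrapper may save the manual second attempt. The following is only a minimal sketch: `retry_call`, `attempts`, and `delay_sec` are hypothetical names introduced here, not part of any Kaggle library, and the exception raised on timeout is not known, so the sketch catches any exception.

```python
import time


def retry_call(func, attempts=3, delay_sec=5.0):
    """Call func() until it succeeds or the attempts run out (hypothetical helper)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:  # broad on purpose: the timeout's exception type is unknown
            last_error = error
            print(f"attempt {attempt}/{attempts} failed: {error}")
            if attempt < attempts:
                time.sleep(delay_sec)  # wait before the next try
    raise last_error


# Assumed usage, wherever kaggle_datasets is importable:
#   from kaggle_datasets import KaggleDatasets
#   gcs_path = retry_call(KaggleDatasets().get_gcs_path)
```

A fixed delay is used for simplicity; a growing delay would be a reasonable alternative if the timeouts persist.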

② Installing the gcsfuse command failed with a 502 error
→ See the workaround below.

# The following fails with a 502 error
!apt-get -y -q install gcsfuse

# Workaround: install the .deb package from the GitHub release instead
!apt-get update -y
!apt-get install -y fuse
!curl -L -O "https://github.com/GoogleCloudPlatform/gcsfuse/releases/download/v1.2.0/gcsfuse_1.2.0_amd64.deb"
!dpkg -i gcsfuse_1.2.0_amd64.deb
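
After the workaround it can be worth confirming that the binary actually landed on the PATH before trying to mount anything. A minimal sketch, assuming the tool accepts a `--version` flag (gcsfuse does); `tool_version` is a hypothetical helper name.

```python
import shutil
import subprocess


def tool_version(command):
    """Return the output of `<command> --version`, or None if it is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None  # not installed, or not visible on PATH
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


# Assumed usage in a Colab cell after the dpkg step:
#   print(tool_version("gcsfuse"))
```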

Final data import code

# Data import
# Authenticate this Colab session with a Google account
from google.colab import auth
auth.authenticate_user()
# Register the gcsfuse apt repository and its signing key
!echo "deb http://packages.cloud.google.com/apt gcsfuse-`lsb_release -c -s` main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
!apt-get -y -q update
# Install gcsfuse from the GitHub release .deb (workaround for the 502 error)
!apt-get install -y fuse
!curl -L -O "https://github.com/GoogleCloudPlatform/gcsfuse/releases/download/v1.2.0/gcsfuse_1.2.0_amd64.deb"
!dpkg -i gcsfuse_1.2.0_amd64.deb
# Mount the bucket on tmpDir (-1 disables the rate limits); replace the kds-... placeholder with your bucket name
!mkdir -p tmpDir
!gcsfuse --implicit-dirs --limit-bytes-per-sec -1 --limit-ops-per-sec -1 'kds-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' tmpDir
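
Two small helpers may be useful around the mount command. The mount takes a bare bucket name (`kds-...`), so the first helper strips the scheme, on the assumption that `get_gcs_path()` returns a `gs://kds-...` style path. The second walks the mount point to confirm that the files are visible and to see their sizes. Both function names are hypothetical and this is a sketch, not the author's method.

```python
import os
from urllib.parse import urlparse


def bucket_name(gcs_path):
    """Extract the bucket from a gs:// path, e.g. 'gs://kds-abc/train' -> 'kds-abc'."""
    parsed = urlparse(gcs_path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// path: {gcs_path!r}")
    return parsed.netloc


def list_files(root):
    """Yield (path relative to root, size in bytes) for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            yield os.path.relpath(full_path, root), os.path.getsize(full_path)


# Assumed usage in Colab after the gcsfuse mount has succeeded:
#   for rel_path, size in list_files("tmpDir"):
#       print(f"{size / 1e9:8.2f} GB  {rel_path}")
```

Listing only reads metadata, so it should not pull the file contents themselves onto the Colab disk.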