
A Complete Amateur Tries Out Google Colaboratory

Posted at 2018-03-14, last updated at 2018-03-14

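0. Introduction

I am not an engineer, just a complete amateur.

I tried out Google Colaboratory, which Google provides for machine learning education and research.

The following article explains it in detail.

"With Google Colaboratory, you can easily get a Python execution environment in the browser"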
http://tadaken3.hatenablog.jp/entry/first-step-colabratory

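I don't understand it well yet, so for now I mostly copied the official code snippets and collected the ones that worked, for my own reference. The snippets follow.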

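1. Generating a file and downloading it to your own machine

The generated file is created on the virtual machine's local disk.
Because it is a virtual machine, everything on it, files included, disappears after 90 minutes without activity.

　Source for the 90-minute figure:
　"[Use a free GPU in seconds] TensorFlow/Keras/PyTorch/Chainer environment setup on Colaboratory"
　https://qiita.com/tomo_makes/items/f70fe48c428d3a61e131

So files on the virtual machine are probably meant to be used as temporary files.
The download goes to your own machine, so you can keep the file that way.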

c001.py

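from google.colab import files

# Write a small text file on the virtual machine's disk.
with open('example.txt', 'w') as f:
  f.write('some content')

# Ask the browser to download the file to your own machine.
files.download('example.txt')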
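
To see that the file really is an ordinary file on the machine running the notebook, a minimal sketch like the following may help. It uses only the standard library, so it does not need google.colab; the helper name write_and_inspect and the temporary directory are assumptions made for illustration, not part of the original snippet.

```python
import tempfile
from pathlib import Path


def write_and_inspect(directory, name, content):
    """Write a text file and report where it ended up and how big it is."""
    path = Path(directory) / name
    path.write_text(content, encoding='utf-8')
    return {'path': str(path.resolve()), 'size_bytes': path.stat().st_size}


# A temporary directory stands in for the VM's working directory here.
with tempfile.TemporaryDirectory() as workdir:
    info = write_and_inspect(workdir, 'example.txt', 'some content')
    print(info)
```

On Colab the same check could be pointed at the current directory instead of a temporary one; either way the file lives only as long as the machine does.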

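2. Reading and writing CSV

The generated CSV is created on the virtual machine's local disk.
Because it is a virtual machine, the CSV disappears along with everything else after 90 minutes without activity.
So this, too, is probably meant for temporary files.
The only thing to watch out for is the newline characters.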

c002.py

# Write a small CSV file on the virtual machine's disk.
with open('example.csv', 'w', encoding='utf-8') as f:
  f.write('\
都道府県,平成22年,平成27年,平成28年\n\
東京都,13159 ,13515 ,13624\n\
神奈川県,9048 ,9126 ,9145\n\
大阪府,8865 ,8839 ,8833\n\
愛知県,7411 ,7483 ,7507\n\
千葉県,7195 ,7267 ,7289\
')

# Read the file back and print its contents.
with open('example.csv', 'r', encoding='utf-8') as f:
  content = f.read()

print(content)
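
On the newline point: one way to avoid writing the line endings by hand is to let the standard csv module handle them. The sketch below is only an illustration under a few assumptions: it writes to a temporary directory rather than the notebook's working directory, uses a shortened copy of the sample rows, and picks '\n' as the line terminator.

```python
import csv
import tempfile
from pathlib import Path

# Shortened copy of the sample data from the snippet above.
rows = [
    ['都道府県', '平成22年', '平成27年', '平成28年'],
    ['東京都', '13159', '13515', '13624'],
    ['神奈川県', '9048', '9126', '9145'],
]

with tempfile.TemporaryDirectory() as workdir:
    csv_path = Path(workdir) / 'example.csv'

    # newline='' stops Python from translating line endings, so the
    # csv writer's lineterminator is exactly what lands in the file.
    with open(csv_path, 'w', encoding='utf-8', newline='') as f:
        csv.writer(f, lineterminator='\n').writerows(rows)

    with open(csv_path, 'r', encoding='utf-8', newline='') as f:
        read_back = list(csv.reader(f))

    # Keep the raw bytes so the line endings can be checked.
    raw = csv_path.read_bytes()

print(read_back)
```

Opening the file with newline='' on both the write and the read side is what the csv module documentation recommends, and it keeps the result the same regardless of the platform's default line ending.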

3. Displaying the CSV with pandas

This displays the CSV stored on the virtual machine.

c003.py

import pandas as pd
pd.read_csv("example.csv", encoding="UTF-8")

4. Upgrading the Google Sheets library (shell command)

Shell commands can also be run in Google Colaboratory by prefixing them with !

!pip install --upgrade -q gspread

5. Reading a Google Sheets spreadsheet and displaying it with pandas

The spreadsheet itself lives in Google Drive.

c004.py

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

worksheet = gc.open('パラメータデータベース').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

# Convert to a DataFrame and render.
import pandas as pd
pd.DataFrame.from_records(rows)
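
One thing to note about the snippet above: get_all_values returns a plain list of rows, so the first row (usually the column names) ends up as ordinary data. A minimal sketch of splitting it off follows; the helper name split_header and the sample rows are hypothetical, made up for illustration only.

```python
def split_header(rows):
    """Split a list of rows into (header, records), treating row 0 as the header."""
    if not rows:
        return [], []
    header, *records = rows
    return header, records


# Hypothetical data shaped like the list of rows a worksheet read returns.
rows = [
    ['name', 'value'],
    ['alpha', '1'],
    ['beta', '2'],
]

header, records = split_header(rows)

# One dict per record, keyed by column name.
as_dicts = [dict(zip(header, record)) for record in records]
print(as_dicts)
```

With pandas, the same split could be handed over as pd.DataFrame.from_records(records, columns=header), so that the column names appear as the header of the table instead of as its first row.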

6. Creating a new spreadsheet and writing data to it

The spreadsheet is created in Google Drive.
Calling gc.create twice with the same name creates two separate files.

c005.py

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

sh = gc.create('Colaboratoryテスト1')

# Open our new sheet and add some data.
worksheet = gc.open('Colaboratoryテスト1').sheet1

cell_list = worksheet.range('A1:C2')

import random
for cell in cell_list:
  cell.value = random.randint(1, 10)
  print(cell,cell.value)

worksheet.update_cells(cell_list)
# Go to https://sheets.google.com to see your new spreadsheet.
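
Since creating the same name twice yields two separate files, it can be handy to open the spreadsheet first and create it only when it is missing. The sketch below shows that pattern with an in-memory stand-in so it can run anywhere; FakeClient, NotFound, and open_or_create are all hypothetical names invented for this example. With the real library you would presumably pass gc and gspread.exceptions.SpreadsheetNotFound, which, as far as I know, is what gc.open raises when no spreadsheet has the given title.

```python
class NotFound(Exception):
    """Hypothetical stand-in for gspread.exceptions.SpreadsheetNotFound."""


class FakeClient:
    """Hypothetical in-memory stand-in for the gspread client (demo only)."""

    def __init__(self):
        self.created = []

    def open(self, title):
        if title not in self.created:
            raise NotFound(title)
        return title

    def create(self, title):
        self.created.append(title)
        return title


def open_or_create(client, title, not_found_error):
    """Open the spreadsheet named `title`; create it only if it does not exist."""
    try:
        return client.open(title)
    except not_found_error:
        return client.create(title)


client = FakeClient()
open_or_create(client, 'Colaboratoryテスト1', NotFound)
open_or_create(client, 'Colaboratoryテスト1', NotFound)  # second call reuses the first
print(client.created)
```

Passing the exception class in as an argument keeps the helper independent of the library, which is also what makes it testable without a Google account.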

7. Updating data in an existing spreadsheet

This is the same as section 6 with only the gc.create call commented out.

c006.py

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

# sh = gc.create('Colaboratoryテスト1')

# Open the existing sheet and overwrite its data.
worksheet = gc.open('Colaboratoryテスト1').sheet1

cell_list = worksheet.range('A1:C2')

import random
for cell in cell_list:
  cell.value = random.randint(1, 10)
  print(cell,cell.value)

worksheet.update_cells(cell_list)
# Go to https://sheets.google.com to see the updated spreadsheet.
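
The range 'A1:C2' used above covers six cells. As far as I can tell, gspread hands them back row by row (A1, B1, C1, then A2, B2, C2), which is why a flat loop over cell_list fills the whole block. The following sketch only illustrates that ordering in plain Python; expand_range and its helpers are hypothetical names and are not part of gspread.

```python
import re


def col_to_index(col):
    """Convert a column label such as 'A' or 'AA' to a 1-based index."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord('A') + 1)
    return n


def index_to_col(n):
    """Convert a 1-based column index back to a label such as 'A' or 'AA'."""
    letters = ''
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord('A') + rem) + letters
    return letters


def expand_range(a1_range):
    """List the cell labels in an A1-style range in row-major order."""
    m = re.fullmatch(r'([A-Z]+)(\d+):([A-Z]+)(\d+)', a1_range)
    if m is None:
        raise ValueError('unsupported range: {!r}'.format(a1_range))
    c1, r1, c2, r2 = m.groups()
    first_col, last_col = col_to_index(c1), col_to_index(c2)
    return [
        '{}{}'.format(index_to_col(c), r)
        for r in range(int(r1), int(r2) + 1)
        for c in range(first_col, last_col + 1)
    ]


print(expand_range('A1:C2'))
```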

8. Reading an existing spreadsheet and displaying it with pandas

No particular comments.

c007.py

from google.colab import auth
auth.authenticate_user()

import gspread
from oauth2client.client import GoogleCredentials

gc = gspread.authorize(GoogleCredentials.get_application_default())

worksheet = gc.open('Colaboratoryテスト1').sheet1

# get_all_values gives a list of rows.
rows = worksheet.get_all_values()
print(rows)

# Convert to a DataFrame and render.
import pandas as pd
pd.DataFrame.from_records(rows)

9. Checking the OS version (shell command)

At the time I ran this, it was Ubuntu 17.10.

!cat /etc/issue

The following commands also worked as usual.

!ls -l
!pwd 
!cat example.csv

10. Listing the text files directly under the Google Drive root

This is the official snippet as-is.

c008.py

# Install the PyDrive wrapper & import libraries.
# This only needs to be done once per notebook.
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# List .txt files in the root.
#
# Search query reference:
# https://developers.google.com/drive/v2/web/search-parameters
listed = drive.ListFile({'q': "title contains '.txt' and 'root' in parents"}).GetList()
for file in listed:
  print('title {}, id {}'.format(file['title'], file['id']))
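
The search string passed to ListFile is just text, so it can be assembled by a small helper when the file name fragment or the folder changes. This is a minimal sketch: build_title_query is a hypothetical name, and the only escaping it does is the backslash-before-single-quote rule that the Drive search documentation describes, so treat it as a starting point rather than a complete query builder.

```python
def build_title_query(fragment, parent_id='root'):
    """Build a Drive v2 style search string for titles containing `fragment`."""

    def escape(value):
        # Single quotes inside a quoted value are escaped with a backslash.
        return value.replace("'", "\\'")

    return "title contains '{}' and '{}' in parents".format(
        escape(fragment), escape(parent_id)
    )


print(build_title_query('.txt'))
```

The result could then be passed as the 'q' value, for example drive.ListFile({'q': build_title_query('.txt')}).GetList(), using the drive client from the snippet above.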

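11. Creating a new text file in Google Drive

Because the file is created in Google Drive, it does not disappear when the virtual machine is reset.
Each file is assigned its own file ID.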

c009.py

# Create & upload a text file.
# (Reuses the drive client created in section 10.)
uploaded = drive.CreateFile({'title': 'testcolab002.txt'})
uploaded.SetContentString('\
テストテストテスト\n\
\n\
Google Colaboratoryのテストです\n\
\n\
以上\n\
\n\
')
uploaded.Upload()
print('Uploaded file with ID {}'.format(uploaded.get('id')))

12. Displaying the contents of a text file in Google Drive

Printing downloaded by itself shows only the ID information.
downloaded.GetContentString() returns the contents of the file.

c010.py

# Download a file based on its file ID.
# (Reuses the drive client created in section 10.)
#
# A file ID looks like: laggVyWshwcyP6kEI-y_W3P8D26sz
file_id = '(file ID)'
downloaded = drive.CreateFile({'id': file_id})
print(downloaded)
print('------------')
print(downloaded.GetContentString())
print('------------')
print('Downloaded content "{}"'.format(downloaded.GetContentString()))

That's all. It may turn out to be quite handy.
