ちょっと面倒だったのでメモを残しておく。雑なスクリプトですが。
実装
ダウンロードするための API が用意されていないのが面倒いポイント。
authorize まわりの処理は、ほぼ Ruby QuickStart ママなので説明省略。
download_google_spreadsheets.rb
require 'csv'
require 'active_support'
require 'googleauth'
require 'googleauth/stores/file_token_store'
require 'google/apis/sheets_v4'
require 'google/apis/drive_v3'
require 'open-uri'
SPREADSHEET_ID = 'XXXXXXXXXXXXXXXXX'
TOKEN_PATH = "#{__dir__}/token.yml"
CREDENTIALS_PATH = "#{__dir__}/credentials.json"
OOB_URI = 'urn:ietf:wg:oauth:2.0:oob'
SCOPES = [
Google::Apis::SheetsV4::AUTH_SPREADSHEETS_READONLY,
Google::Apis::DriveV3::AUTH_DRIVE # NOTE: スプレッドシートをダウンロードするのに必要
]
def authorize
client_id = Google::Auth::ClientId.from_file(CREDENTIALS_PATH)
token_store = Google::Auth::Stores::FileTokenStore.new(file: TOKEN_PATH)
authorizer = Google::Auth::UserAuthorizer.new(client_id, SCOPES, token_store)
user_id = 'default'
credentials = authorizer.get_credentials(user_id)
if credentials.nil?
url = authorizer.get_authorization_url(base_url: OOB_URI)
puts 'Open the following URL in the browser and enter the ' \
"resulting code after authorization:\n" + url
code = gets
credentials = authorizer.get_and_store_credentials_from_code(
user_id: user_id, code: code, base_url: OOB_URI
)
end
credentials
end
# ココが今回の話の焦点
def fetch_csv_files_as_hashes
sheet_service = Google::Apis::SheetsV4::SheetsService.new
sheet_service.client_options.application_name = 'sensuikan1973 sample app'
authorization = authorize
sheet_service.authorization = authorization
spreadsheet = JSON.parse(sheet_service.get_spreadsheet(SPREADSHEET_ID).to_json)
# 自前で export 用の URL を組み立てる
export_url_list = spreadsheet['sheets'].map do |sheet|
"https://docs.google.com/spreadsheets/d/#{SPREADSHEET_ID}/export?format=csv&gid=#{sheet['properties']['sheetId']}"
end.compact!
export_url_list.map do |export_url|
puts "download csv from #{export_url}"
# 間隔なしで連続で叩くと 429 で怒られちゃう
sleep(5)
uri = OpenURI.open_uri(
export_url,
'rb:utf-8',
'Authorization' => "Bearer #{authorization.access_token}" # NOTE: ここ大事!
)
CSV.parse(uri.read, headers: true).map(&:to_hash) # hash にして返す
end
end
参考
- V4 SpreadSheet get : 実際に API を実行してレスポンスを確かめられて便利
- stack overflow | Using the google drive API to download a spreadsheet in csv format : Python で似たようなことをやっていて、とても参考になった