ポイント
- ファイルダウンロードはせず IO から string を読み取る
- IO や String をそのまま取り扱って UTF-8 の状態で Ruby の CSV class に読ませる
- あとはこちらのものだ
- 途中のstringやioの取り扱いでものすごく無駄なことをやっている可能性もある
- Ruby 2.5の罠に注意 ( #Ruby 2.5 では 1行のBodyだけのCSV ファイルで headers が取得できないっぽい? ( Ruby 2.6 では可能 ) - Qiita https://qiita.com/YumaInaura/items/56d42540c43523682abf )
確認用ファイル
"ヘッダ1","ヘッダ2","ヘッダ3"
"1234","わー","56789"
"5678","わー","56780"
S3
バケットのトップレベルに sjis.csv があるものとする
Code
require 'aws-sdk-s3'
require 'csv'
credentials = Aws::Credentials.new(ENV['AWS_ACCESS_KEY_ID'], ENV['AWS_SECRET_ACCESS_KEY'])
s3_resource = Aws::S3::Resource::new(region: ENV['AWS_REGION'], credentials: credentials) # e.g reagion: 'ap-northeast-1'
s3_bucket = s3_resource.bucket(ENV['AWS_BUCKET']) # e.g bucket name : yourname : in this case "yumainaura"
# If bucket does not exists then create it
# s3_bucket.create
#
# Already exists error is it
# Aws::S3::Errors::BucketAlreadyOwnedByYou: Your previous request to create the named bucket succeeded and you already own it.
# For preparing upload local to storage
s3_bucket.object('sjis.csv').upload_file('sjis.csv')
# Do not specify top level slash
storage_object = s3_bucket.object('sjis.csv')
io = storage_object.get.body
file_body = io.read
csv_utf8_strings = file_body.encode('UTF-8', 'Shift_JIS')
csv_string_io = CSV.new(csv_utf8_strings, headers: true)
csv_string_io.each do |line|
puts line[0]
puts line[1]
puts line[2]
end
Reulst
storage_object
# => #<Aws::S3::Object:0x00007fc9a82fb0b0
# @bucket_name="yumainaura",
# @client=#<Aws::S3::Client>,
# @data=nil,
# @key="sjis.csv",
# @waiter_block_warned=false>
storage_object_got = storage_object.get
# <struct Aws::S3::Types::GetObjectOutput
# body=#<StringIO:0x00007fc9ac0a98d0>,
# delete_marker=nil,
# accept_ranges="bytes",
# expiration=nil,
# restore=nil,
# last_modified=2020-01-05 03:09:31 +0000,
# content_length=73,
# etag="\"cdd7902d140737b497bdccae379ea93c\"",
# missing_meta=nil,
# version_id=nil,
# cache_control=nil,
# content_disposition=nil,
# content_encoding=nil,
# content_language=nil,
# content_range=nil,
# content_type="",
# expires=nil,
# expires_string=nil,
# website_redirect_location=nil,
# server_side_encryption=nil,
# metadata={},
# sse_customer_algorithm=nil,
# sse_customer_key_md5=nil,
# ssekms_key_id=nil,
# storage_class=nil,
# request_charged=nil,
# replication_status=nil,
# parts_count=nil,
# tag_count=nil,
# object_lock_mode=nil,
# object_lock_retain_until_date=nil,
# object_lock_legal_hold_status=nil>
io = storage_object_got.body
# => #<StringIO:0x00007fc9ac0a98d0>
file_body = io.read
# => "\"\x83w\x83b\x83_1\",\"\x83w\x83b\x83_2\",\"\x83w\x83b\x83_3\"\n\"1234\",\"\x82\xED\x81[\",\"56789\"\n\"5678\",\"\x82\xED\x81[\",\"56780\""
csv_utf8_strings = file_body.encode('UTF-8', 'Shift_JIS')
# => "\"ヘッダ1\",\"ヘッダ2\",\"ヘッダ3\"\n\"1234\",\"わー\",\"56789\"\n\"5678\",\"わー\",\"56780\""
csv_string_io = CSV.new(csv_utf8_strings, headers: true)
# => <#CSV io_type:StringIO encoding:UTF-8 lineno:0 col_sep:"," row_sep:"\n" quote_char:"\"" headers:true>
csv_string_io.each do |line|
puts line[0]
puts line[1]
puts line[2]
end
# 1234
# わー
# 56789
# 5678
# わー
# 56780
Original by Github issue
チャットメンバー募集
何か質問、悩み事、相談などあればLINEオープンチャットもご利用ください。