Writing the list of keys in an S3 bucket to a file

More than 5 years have passed since last update.

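I have started using S3 at work.

We store the S3 keys in a DB, which is normally fine, but it gets troublesome if the two drift out of sync somewhere.

So I used boto to fetch the list of keys in an S3 bucket.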

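As far as I can tell, no object bodies are downloaded: boto checks the bucket and then pages through its listing (S3 returns at most 1,000 keys per listing request), so I think it is reasonably efficient.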
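As a side note on how paged listing works, here is a minimal sketch using the paginator interface of boto3 (the successor to boto), not the boto API used in this article. The function name `iter_key_names` is hypothetical, and `client` is assumed to behave like `boto3.client('s3')`; the sketch only touches the listing response, never the objects themselves.

```python
def iter_key_names(client, bucket_name):
    """Yield every key name in the bucket, one listing page at a time.

    `client` is assumed to behave like boto3.client('s3');
    this helper is illustrative and not part of the original script.
    """
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket_name):
        # A page for an empty bucket has no 'Contents' entry.
        for obj in page.get('Contents', []):
            yield obj['Key']


# Possible usage (requires boto3 and valid credentials):
#   import boto3
#   for name in iter_key_names(boto3.client('s3'), 'my-bucket'):
#       print(name)
```

Because the function is a generator, keys are consumed as each page arrives instead of being held in memory all at once. The script below relies on boto's `bucket.list()` for the same purpose.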

#! /usr/bin/env python
# -*- coding: utf-8 -*-
"""
Write the names of all keys in the target bucket to a TSV file.
"""
import sys
import os
import csv
from ConfigParser import SafeConfigParser
from getpass import getpass

from boto import connect_s3


AWS_CLI_CONFIG_PATH = os.path.expanduser('~/.aws/config')


def get_aws_config(config_path=AWS_CLI_CONFIG_PATH):
    """
    Return the following keys from the aws cli config:
    - aws_access_key_id
    - aws_secret_access_key
    """
    keys = ['aws_access_key_id', 'aws_secret_access_key']
    cfg = SafeConfigParser()
    with open(config_path, 'r') as fp:
        cfg.readfp(fp)
    return tuple(cfg.get('default', x) for x in keys)


def get_bucket(aws_access_key_id, aws_secret_access_key, bucket_name):
    """
    Return the boto S3 bucket.
    """
    if not aws_access_key_id and not aws_secret_access_key:
        aws_access_key_id, aws_secret_access_key = get_aws_config()
    return connect_s3(aws_access_key_id, aws_secret_access_key).get_bucket(bucket_name)


def write_tsv(aws_access_key_id, aws_secret_access_key, bucket_name, file_name):
    """
    Write the key.name of every key in the S3 bucket to file_name as TSV.
    """
    # Resolve the absolute file path
    file_path = os.path.abspath(file_name)

    def _writerows(rows):
        with open(file_path, 'a') as fp:
            writer = csv.writer(fp, dialect='excel-tab')
            writer.writerows(rows)

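    # Write the header
    _writerows([('key_name', )])

    # Write the body, flushing to disk every 1000 keys
    rows = []
    for key in get_bucket(aws_access_key_id, aws_secret_access_key, bucket_name).list():
        # Each row must be a sequence; csv.writer would split a bare
        # string into one column per character.
        rows.append((key.name, ))
        if len(rows) >= 1000:
            _writerows(rows)
            rows = []
    # Write whatever is left after the last full batch
    if rows:
        _writerows(rows)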


if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('Please specify output filename.')

    else:
        print('Please input the aws_access_key_id/aws_secret_access_key and a target bucket name.')
        print('If you don\'t input the aws_access_key_id/aws_secret_access_key, then we use awscli config.')
        aws_access_key_id = getpass('aws_access_key_id: ')
        aws_secret_access_key = getpass('aws_secret_access_key: ')
        bucket_name = raw_input('target bucket name: ')

        if not aws_access_key_id and not aws_secret_access_key and not os.path.isfile(AWS_CLI_CONFIG_PATH):
            print('Please specify the aws_access_key_id/aws_secret_access_key or create awscli config.')
            sys.exit(1)

        write_tsv(
            aws_access_key_id,
            aws_secret_access_key,
            bucket_name,
            sys.argv[1])
        print('Output: {}'.format(sys.argv[1]))
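To connect the output back to the original motivation (spotting drift between the DB and S3), the TSV can be compared with the keys held in the DB. The following is a minimal sketch written for Python 3, unlike the Python 2 script above. It assumes the TSV has a single `key_name` header row followed by one key name per row; `read_key_names` and `find_drift` are hypothetical helper names, and how the DB keys are fetched is left out because the article does not describe the DB.

```python
import csv


def read_key_names(tsv_path):
    """Return the set of key names stored in the TSV file."""
    with open(tsv_path, newline='') as fp:
        reader = csv.reader(fp, dialect='excel-tab')
        next(reader, None)  # skip the 'key_name' header row
        return {row[0] for row in reader if row}


def find_drift(db_keys, s3_keys):
    """Return (keys only in the DB, keys only in S3) as two sets."""
    db_keys, s3_keys = set(db_keys), set(s3_keys)
    return db_keys - s3_keys, s3_keys - db_keys
```

The first set holds keys the DB knows about but S3 no longer has; the second holds objects in S3 that the DB does not reference.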

Shameless plug: the company I work for seems to be hiring.
If this made you feel like writing some Python, please consider applying.

tomoh1r