Calling Google Cloud Vision from SageMaker

Posted at 2020-08-06, last updated at 2020-09-03

In brief

These are notes from a small experiment: while working in AWS SageMaker, I decided to try sending an image to Google's Cloud Vision and running OCR on it.

Environment

SageMaker with the conda_python3 kernel
google-cloud-vision: 1.0.0 (apparently preinstalled?)

Contents

Authentication

This post assumes that the authentication setup for the Cloud Vision API is already done.

OCR

The code is taken straight from the tutorial. Prepare any image you like to use as a test image.

The tutorial gives the following function as an example of reading text from an image.

def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

Let's run it in a SageMaker notebook. It fails with the following error, which says that no credentials have been set.

DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started

So we pass in the JSON credentials file that we downloaded.

- client = vision.ImageAnnotatorClient()
+ from google.oauth2 import service_account
+ credentials = service_account.Credentials.from_service_account_file('credentials.json')
+ client = vision.ImageAnnotatorClient(credentials=credentials)
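The error message above also names another route: setting GOOGLE_APPLICATION_CREDENTIALS. As a minimal sketch, assuming the same 'credentials.json' file sits in the notebook's working directory, the variable can be set from Python before the client is created. The helper name use_service_account_key is hypothetical and only for illustration.

```python
import os


def use_service_account_key(key_path):
    """Point GOOGLE_APPLICATION_CREDENTIALS at a service-account JSON key.

    Hypothetical helper: it only sets the environment variable and does
    not check that the file exists or is a valid key.
    """
    # Use an absolute path so the setting survives a later os.chdir().
    absolute_path = os.path.abspath(key_path)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = absolute_path
    return absolute_path


# Assumption: same file name as in the diff above.
configured_path = use_service_account_key("credentials.json")
```

With the variable set, the original `vision.ImageAnnotatorClient()` call without arguments should be able to find the credentials on its own. Passing `credentials=` explicitly, as in the diff, keeps the dependency visible in the code, which may be preferable in a shared notebook.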

The character "夏" (summer) appears to have been read successfully. It has been hot lately.
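The tutorial function only prints what it finds. In a notebook it can be handier to get the values back as data. The following is a rough sketch, not part of the tutorial: annotations_to_records is a hypothetical helper that relies only on the attributes the function above already reads (description, bounding_poly.vertices, x, y). The stand-in objects and their coordinates are invented for illustration and are not real API output.

```python
from types import SimpleNamespace


def annotations_to_records(annotations):
    """Convert text annotations into plain dicts (text plus corner points)."""
    records = []
    for annotation in annotations:
        corners = [(v.x, v.y) for v in annotation.bounding_poly.vertices]
        records.append({"description": annotation.description,
                        "bounds": corners})
    return records


def make_stand_in(description, corners):
    """Build a dummy object with the same attribute layout (illustration only)."""
    vertices = [SimpleNamespace(x=x, y=y) for x, y in corners]
    return SimpleNamespace(description=description,
                           bounding_poly=SimpleNamespace(vertices=vertices))


# Dummy coordinates, not taken from a real response.
sample = [make_stand_in("夏", [(0, 0), (10, 0), (10, 10), (0, 10)])]
records = annotations_to_records(sample)
```

Inside detect_text, the same helper could be called as `annotations_to_records(response.text_annotations)` and its result returned instead of printed.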

From S3

That alone is not much, so let's send an image stored in S3 to the Cloud Vision API. Permission to access S3 must be granted separately.

def detect_text_from_s3(bucket, key):
    """Detects text in an image stored in S3."""
    from google.cloud import vision
    import boto3
    from google.oauth2 import service_account

    credentials = service_account.Credentials.from_service_account_file('credentials.json')
    client = vision.ImageAnnotatorClient(credentials=credentials)    

    s3 = boto3.client('s3')
    content = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

    image = vision.types.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

- with io.open(path, 'rb') as image_file:
-     content = image_file.read()
+ import boto3
+ s3 = boto3.client('s3')
+ content = s3.get_object(Bucket=bucket, Key=key)['Body'].read()

The text was read successfully.
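detect_text_from_s3 takes the bucket and key as separate arguments, while image locations are often written as a single s3://bucket/key string. As a small sketch under that assumption, a helper can split such a string first. The name split_s3_uri and the example URI are hypothetical.

```python
def split_s3_uri(uri):
    """Split 's3://bucket/key' into a (bucket, key) tuple."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError("not an S3 URI: {}".format(uri))
    # Everything up to the first '/' is the bucket; the rest is the key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError("S3 URI needs both a bucket and a key: {}".format(uri))
    return bucket, key


# Hypothetical location, for illustration only.
bucket, key = split_s3_uri("s3://my-bucket/images/summer.png")
```

The result can then be passed on as `detect_text_from_s3(bucket, key)`.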

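Summary

In this post, I sent requests from SageMaker to Google's Cloud Vision API and ran OCR. One could argue that you should just use an AWS service instead, but I think it is good to be able to pick whichever service suits the task.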
