Trying Out Vision AI (Python)

Posted on 2025-01-26 (last updated 2025-01-26)

Google Cloud Vision AI

Links

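Overview

Google Cloud Vision AI is a Google Cloud service that extracts and analyzes information from images and video. Features such as image recognition, face detection, and OCR let you automate a wide range of image-processing tasks.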

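Available features

Label detection: automatically labels the objects and scenes in an image
Face detection: detects faces in an image and estimates the likelihood of emotions such as joy, sorrow, anger, and surprise (it does not estimate age or gender)
Optical character recognition (OCR): extracts text from images and documents
Logo detection: detects logos in an image
Landmark detection: identifies well-known places in an image
Inappropriate content detection: flags inappropriate content in an image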
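
As a rough sketch of how the features above map onto the Python client, the lookup table below pairs each one with the `ImageAnnotatorClient` helper method that requests it. `FEATURE_METHODS` and `annotate` are hypothetical names introduced here for illustration. The client is passed in as an argument, so the sketch itself needs neither credentials nor a network connection.

```python
# Hypothetical lookup table: feature name -> ImageAnnotatorClient helper method.
FEATURE_METHODS = {
    "labels": "label_detection",
    "faces": "face_detection",
    "text": "text_detection",
    "logos": "logo_detection",
    "landmarks": "landmark_detection",
    "safe_search": "safe_search_detection",
}


def annotate(client, image, feature):
    """Call the helper method for `feature` on `client` and return its response."""
    try:
        method_name = FEATURE_METHODS[feature]
    except KeyError:
        raise ValueError(f"unknown feature: {feature!r}") from None
    # Look the method up by name so one function can serve every feature.
    return getattr(client, method_name)(image=image)
```

With a real client this would be called as `annotate(vision.ImageAnnotatorClient(), vision.Image(content=image_bytes), "labels")`, where `image_bytes` stands for the raw bytes of an image file.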

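Setup

Create a Google Cloud Platform project:
Create a new project in the Google Cloud Platform Console.
Enable the Vision API:
Enable the Vision API in the project you created.
Create a service account:
Create a service account for accessing the Vision API and download its JSON key.
Set the environment variable:
Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the downloaded JSON key.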
```
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/key.json"
```
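
Before calling the API, it can help to confirm that the variable is set and really points at a service account key. The following is a minimal sketch using only the standard library. `check_credentials` is a hypothetical helper and not part of the Vision client library, and it only checks the `"type"` field that service account key files contain.

```python
import json
import os


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the key file path if the variable points at a service account key.

    Raises RuntimeError with a readable message otherwise.
    """
    path = os.getenv(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"{env_var} points to a missing file: {path}")

    with open(path, encoding="utf-8") as key_file:
        try:
            key = json.load(key_file)
        except json.JSONDecodeError as exc:
            raise RuntimeError(f"{path} is not valid JSON") from exc

    # Service account keys carry a "type" field set to "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        raise RuntimeError(f"{path} does not look like a service account key")
    return path
```

Calling `check_credentials()` at the start of a script turns a missing or mistyped path into an immediate, readable error instead of an authentication failure later on.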

Run

```
pipenv run python vision.py
```

Input image: images.jpeg

Python code

vision.py

```python
import os

from google.cloud import vision


def detect_text(path):
    """Detect text in a local image file and print each annotation with its bounds."""
    client = vision.ImageAnnotatorClient()

    # Read the image as raw bytes.
    with open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.text_detection(image=image)
    # Per-image failures are reported in response.error rather than raised.
    if response.error.message:
        raise RuntimeError(response.error.message)

    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = ['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices]

        print('bounds: {}'.format(','.join(vertices)))


if __name__ == '__main__':
    # Print the path to the service account JSON key.
    print(os.getenv("GOOGLE_APPLICATION_CREDENTIALS"))
    detect_text('images.jpeg')
```

Environment

Pipfile

```toml
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
google-cloud-vision = "*"

[dev-packages]

[requires]
python_version = "3.10"
```

Results

```
Loading .env environment variables...
/Users/taketo/Downloads/naturallanguageapipractice-1372273acbf4.json
Texts:

"歌舞伎町一番街"
bounds: (105,32),(194,32),(194,45),(105,45)

"歌舞伎町"
bounds: (105,32),(156,32),(156,45),(105,45)

"一番"
bounds: (157,32),(182,32),(182,44),(157,44)

"街"
bounds: (182,32),(194,32),(194,44),(182,44)
```

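The text was recognized correctly, and the bounding box coordinates of each annotation are printed. The first annotation covers the entire detected string, and the ones after it are the individual segments.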
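
To make those coordinates easier to work with, here is a small sketch that turns the four vertices of a bounding box into a width and a height. `box_size` is a hypothetical helper, not part of the Vision client library. It measures the axis-aligned extent of the vertices, which matches the box itself only when the text is not rotated.

```python
def box_size(vertices):
    """Return (width, height) of the axis-aligned extent of (x, y) vertices."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return max(xs) - min(xs), max(ys) - min(ys)


# Vertices copied from the output above.
full_text = [(105, 32), (194, 32), (194, 45), (105, 45)]  # "歌舞伎町一番街"
last_word = [(182, 32), (194, 32), (194, 44), (182, 44)]  # "街"

print(box_size(full_text))  # (89, 13)
print(box_size(last_word))  # (12, 12)
```

With a live response, the same list could be built from an annotation with `[(v.x, v.y) for v in text.bounding_poly.vertices]`.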

Summary

I used Vision AI to run text detection on an image.
