[Don't miss out!] An AI that classifies images into arbitrary classes without training

Posted at 2023-07-07

Classify anything, without training

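With this model, you can classify images without any training.
Image classification used to require deciding on the classes, collecting images for each class, building a dataset, and training a model. None of that is needed here.
Even for ambiguous images that show no distinctive features of any class, the model can still tell you which class is the closest match.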

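We use a model that represents both images and text as embeddings.
Once both are embeddings, computing the similarity between the image embedding and each class's text embedding (here, their inner product) shows which class the image most resembles. That is how the image can be classified into any set of classes you choose.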
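
To make the idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration and are not real model outputs, and `cosine_similarity` and `nearest_class` are hypothetical helper names. The sketch only shows the principle: score each class embedding against the image embedding and pick the highest score.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_class(image_vec, class_vecs):
    """Return the class whose embedding is most similar to the image embedding."""
    scores = {name: cosine_similarity(image_vec, vec) for name, vec in class_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical embeddings, invented for this example only.
image_embedding = [0.9, 0.1, 0.2]
class_embeddings = {
    "balloon": [1.0, 0.0, 0.1],
    "sky": [0.2, 0.9, 0.1],
    "meadow": [0.1, 0.2, 0.9],
}

best, scores = nearest_class(image_embedding, class_embeddings)
print(best)
```

Because the classes are just text turned into vectors, changing the set of classes only means changing the dictionary keys. Nothing has to be retrained.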

Steps

Installation

pip install salesforce-lavis

Initializing the model

import torch
from PIL import Image

from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model, vis_processors, _ = load_model_and_preprocess("blip_feature_extractor", model_type="base", is_eval=True, device=device)

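Loading the image

raw_image = Image.open("sample_image.jpg").convert("RGB")
# display() is available in Jupyter/IPython notebooks; skip this line in a plain script
display(raw_image)
# Preprocess the image and add a batch dimension
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)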

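Setting arbitrary classes

You can set any classes you like. This time I made every class something that appears in the image, to see which one the image gets classified as.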

cls_names = ["woman", "dress", "balloon", "sky", "meadow"]

# (optional) add prompt when we want to use the model for zero-shot classification
from lavis.processors.blip_processors import BlipCaptionProcessor
text_processor = BlipCaptionProcessor(prompt="A picture of ")

cls_prompt = [text_processor(cls_nm) for cls_nm in cls_names]
cls_prompt
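
As a rough illustration of what the prompt step is for, the sketch below prefixes each class name with the same prompt string in plain Python. `add_prompt` is a hypothetical helper, not part of LAVIS, and the exact text normalization done by `BlipCaptionProcessor` may differ. The sketch only shows the prefixing idea.

```python
def add_prompt(class_name, prompt="A picture of "):
    # Hypothetical helper: trim and lowercase the class name, then prefix the prompt.
    return prompt + class_name.strip().lower()


cls_names = ["woman", "dress", "balloon", "sky", "meadow"]
cls_prompt = [add_prompt(name) for name in cls_names]
print(cls_prompt)
```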

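Running the classification

Convert the image and the class texts into embeddings (numeric vectors), then compute the similarity from their inner product.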

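# Note: this passes the raw class names; to use the prompted text instead, pass cls_prompt as text_input
sample = {"image": image, "text_input": cls_names}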

image_features = model.extract_features(sample, mode="image").image_embeds_proj[:, 0]
text_features = model.extract_features(sample, mode="text").text_embeds_proj[:, 0]

sims = (image_features @ text_features.t())[0] / model.temp
probs = torch.nn.Softmax(dim=0)(sims).tolist()

for cls_nm, prob in zip(cls_names, probs):
    print(f"{cls_nm}: \t {prob:.3%}")

woman: 1.623%
dress: 2.220%
balloon: 86.563%
sky: 8.036%
meadow: 1.557%

The image was classified as "balloon" with the highest probability.
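
The scoring step (divide the similarities by a temperature, apply softmax, take the largest value) can be sketched without the model. The similarity values and the temperature of 0.07 below are assumptions chosen for illustration. They are not the values the model produced, and 0.07 is not necessarily the value of `model.temp`.

```python
import math


def softmax_with_temperature(sims, temp):
    """Scale similarities by 1/temp, then normalize them into probabilities."""
    scaled = [s / temp for s in sims]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


cls_names = ["woman", "dress", "balloon", "sky", "meadow"]
sims = [0.10, 0.12, 0.38, 0.21, 0.10]  # hypothetical similarity scores
temp = 0.07                            # assumed temperature, for illustration

probs = softmax_with_temperature(sims, temp)
best = cls_names[probs.index(max(probs))]

for name, prob in zip(cls_names, probs):
    print(f"{name}: {prob:.3%}")
print("predicted:", best)
```

A smaller temperature sharpens the distribution, so small gaps in similarity turn into large gaps in probability. This is why one class can dominate even when the raw similarities are close.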

🐣

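I'm a freelance engineer.
I write various articles about AI, so please take a look at my profile if you're interested.

If you have needs like the following, feel free to get in touch:
you want to develop an AI service, streamline your business by incorporating AI, develop a smartphone app that uses AI,
build an AR application, or build a smartphone app but don't know where to turn...

I can take on any of these at a reasonable price with no intermediary costs.

For work inquiries, contact:
rockyshikoku@gmail.com

I build applications using machine learning and AR technology.
I share information on machine learning and AR.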
