More than 1 year has passed since last update.

A model that converts images and text into embeddings so that the two can be compared directly.

Usage

Installation

pip install open_clip_torch

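For example, the code below embeds an image and three text prompts, takes the matrix product of the normalized embeddings (which gives their cosine similarities), and applies softmax to find the text that most likely matches the image.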

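import torch
from PIL import Image
import open_clip

# Load the model, its image preprocessing transform, and the matching tokenizer
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32-quickgelu', pretrained='laion400m_e32')
model.eval()  # inference mode
tokenizer = open_clip.get_tokenizer('ViT-B-32-quickgelu')

# A batch of one image and three candidate captions
image = preprocess(Image.open("cat.jpeg")).unsqueeze(0)
text = tokenizer(["a diagram", "a dog", "a cat"])

# torch.autocast("cuda") replaces the deprecated torch.cuda.amp.autocast()
with torch.no_grad(), torch.autocast("cuda"):
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # L2-normalize so the dot product below is a cosine similarity
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    # Scale the similarities by 100, then turn them into probabilities
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

torch.set_printoptions(sci_mode=False)
print("Label probs:", text_probs)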

Label probs: tensor([[ 0.0000, 0.0000, 1.0000]])

Of the three prompts ["a diagram", "a dog", "a cat"], the third one, "a cat", has the highest probability.
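To make the arithmetic behind the scoring step visible, here is a minimal sketch in plain Python with no model involved. The three-dimensional vectors are hypothetical stand-ins for real embeddings, and the helper names (`l2_normalize`, `softmax`, `rank_labels`) are illustrative, not part of open_clip. Only the order of operations mirrors the code above: normalize, take dot products, scale by 100, apply softmax.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def rank_labels(image_vec, text_vecs, labels, scale=100.0):
    """Return the best-matching label and the probability of every label."""
    image_unit = l2_normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        text_unit = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(image_unit, text_unit))
        logits.append(scale * cosine)  # same 100.0 factor as in the article's code
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs

# Hypothetical toy embeddings; real ones come from encode_image / encode_text
labels = ["a diagram", "a dog", "a cat"]
image_vec = [0.9, 0.1, 0.4]
text_vecs = [[-0.5, 0.8, 0.1], [0.6, 0.6, -0.2], [0.8, 0.2, 0.5]]

best_label, probs = rank_labels(image_vec, text_vecs, labels)
print(best_label, [round(p, 4) for p in probs])
```

Cosine similarities lie between -1 and 1, so a softmax over the raw values would be close to uniform. Multiplying by 100 widens the gaps between them, which is why the top label ends up with almost all of the probability mass.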

🐣


I am a freelance engineer.
For work inquiries, please contact me at
rockyshikoku@gmail.com

I build machine learning and AR apps (Web/iOS).
I share information about machine learning and AR.

Twitter
Medium
GitHub
