DepthAnything: depth estimation for just about any image

Posted at 2024-04-11. Last updated at 2024-04-11.

Estimate depth from a single image

When estimated depth is this easy to obtain, it opens up many uses.
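
As one illustration of such a use, the sketch below separates the nearest pixels from the rest by thresholding a depth map. It assumes the depth map is already available as a NumPy array and that larger values mean closer surfaces, which is my understanding of the model's relative output. The function name `foreground_mask` and the percentile value are hypothetical choices for this example, not part of the Depth-Anything repository.

```python
import numpy as np


def foreground_mask(depth, percentile=70.0):
    """Return a boolean mask of pixels at or above the given percentile.

    Assumption: larger values in `depth` mean "closer to the camera".
    """
    threshold = np.percentile(depth, percentile)
    return depth >= threshold


# Tiny synthetic depth map standing in for a real prediction.
depth = np.array([
    [0.1, 0.2, 0.9],
    [0.1, 0.8, 0.9],
    [0.1, 0.2, 0.3],
])
mask = foreground_mask(depth, percentile=70.0)
print(mask)
```

The mask can then be used to blur or replace the background, for example.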

Usage

Installation

git clone https://github.com/LiheYoung/Depth-Anything.git
cd Depth-Anything
pip install -r requirements.txt

Running it

from depth_anything.dpt import DepthAnything
from depth_anything.util.transform import Resize, NormalizeImage, PrepareForNet

import cv2
import torch
from torchvision.transforms import Compose
import torch.nn.functional as F
import numpy as np

encoder = 'vits' # can also be 'vitb' or 'vitl'
depth_anything = DepthAnything.from_pretrained('LiheYoung/depth_anything_{:}14'.format(encoder)).eval()

transform = Compose([
    Resize(
        width=518,
        height=518,
        resize_target=False,
        keep_aspect_ratio=True,
        ensure_multiple_of=14,
        resize_method='lower_bound',
        image_interpolation_method=cv2.INTER_CUBIC,
    ),
    NormalizeImage(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    PrepareForNet(),
])

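# Load the image as RGB floats in [0, 1] and remember its original size.
raw_image = cv2.imread('bridge.jpg')
image = cv2.cvtColor(raw_image, cv2.COLOR_BGR2RGB) / 255.0
h, w = image.shape[:2]

image = transform({'image': image})['image']
input_tensor = torch.from_numpy(image).unsqueeze(0)

# Inference only, so skip gradient tracking. depth shape: 1xHxW
with torch.no_grad():
    depth = depth_anything(input_tensor)

# Resize the prediction back to the original resolution, then scale to 0-255.
depth = F.interpolate(depth[None], (h, w), mode='bilinear', align_corners=False)[0, 0]
depth = (depth - depth.min()) / (depth.max() - depth.min()) * 255.0
depth = depth.cpu().numpy().astype(np.uint8)

cv2.imwrite('depth.jpg', depth)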
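
The `Resize` settings above (518, `keep_aspect_ratio=True`, `ensure_multiple_of=14`, `resize_method='lower_bound'`) determine the size the network actually sees. The following is a rough sketch of that calculation as I understand it: scale so the shorter side reaches 518, keep the aspect ratio, and snap each side to a multiple of 14. The helper `lower_bound_size` is hypothetical and is not the library's own code, so treat the numbers as an approximation.

```python
import math


def lower_bound_size(width, height, target=518, multiple=14):
    """Approximate the (width, height) produced by a 'lower_bound' resize.

    Assumption: both sides are scaled by the same factor so that neither
    falls below `target`, then rounded to the nearest multiple.
    """
    scale = max(target / width, target / height)

    def snap(value):
        snapped = round(value / multiple) * multiple
        if snapped < target:
            # Never go below the requested lower bound.
            snapped = math.ceil(value / multiple) * multiple
        return int(snapped)

    return snap(width * scale), snap(height * scale)


print(lower_bound_size(1920, 1080))
print(lower_bound_size(640, 640))
```

Because the resized input usually differs from the original resolution, the prediction needs to be interpolated back if it should line up with the source image.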

It works on video too.

python run_video.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis
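
If you would rather process frames yourself than use the script, the idea is simply to run the estimator on each frame. Below is a minimal sketch under stated assumptions: `estimate_depth` is a hypothetical callable that wraps the model and returns a float HxW array, and frames are HxWx3 `uint8` arrays. This is not what `run_video.py` does internally; it only illustrates the per-frame loop.

```python
import numpy as np


def to_uint8(depth):
    """Scale a float depth map to 0-255, guarding against a flat map."""
    lo, hi = float(depth.min()), float(depth.max())
    if hi == lo:
        return np.zeros(depth.shape, dtype=np.uint8)
    return ((depth - lo) / (hi - lo) * 255.0).astype(np.uint8)


def process_frames(frames, estimate_depth):
    """Yield each frame with its grayscale depth map placed to the right."""
    for frame in frames:
        depth_u8 = to_uint8(estimate_depth(frame))
        # Repeat the single channel so it can sit next to a 3-channel frame.
        depth_3ch = np.repeat(depth_u8[:, :, None], 3, axis=2)
        yield np.hstack([frame, depth_3ch])


# Synthetic frames and a fake estimator stand in for a real video and model.
frames = [np.zeros((4, 6, 3), dtype=np.uint8) for _ in range(2)]


def fake_estimator(frame):
    return np.arange(24, dtype=np.float32).reshape(4, 6)


outputs = list(process_frames(frames, fake_estimator))
print(outputs[0].shape)
```

Note that normalizing each frame independently, as done here, can make brightness flicker between frames.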

🐣

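I'm a freelance engineer.
I write a variety of articles about AI, so please take a look at my profile if you're interested.

If you have needs like the following, feel free to get in touch:
you want to develop an AI service, streamline your business by building AI into it, develop a smartphone app that uses AI,
build an application that uses AR, or you want a smartphone app but don't know who to ask...

I can take on any of these at a reasonable price, with no middleman costs.

For work inquiries, contact me here:
rockyshikoku@gmail.com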

