Surprise: engineering 3D stereoscopic vision. The brain ends up perceiving it as three-dimensional.

Last updated at 2024-08-25. Posted at 2024-08-21.

Title: A Tokyo Programmer and the Secret of 3D Stereoscopic Vision

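One morning in Tokyo, in a room of a quiet apartment building, a programmer sat staring at his monitor, coffee in hand. His name was Kenta. He always enjoyed taking on new technology, and today he was working on yet another problem. Ahead of him lay an ambitious goal: to work out how stereoscopic vision works and actually reproduce it in a program.

"How do you reproduce 3D stereoscopic vision in a program..." Kenta asked himself as he began typing.

Kenta started from the basics of stereoscopic vision. A scene looks three-dimensional because the left eye and the right eye see slightly different images. Because the image that reaches each eye differs subtly, the brain perceives depth. This is one of the curious abilities of the human visual system.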
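How large that left/right difference is can be estimated with a simple pinhole model. The sketch below is not from the original article: it assumes two parallel pinhole cameras separated by a baseline of 0.065 m (a rough adult eye spacing, chosen only for illustration) and a focal length of 1, and the names `project` and `disparity` are hypothetical. Under those assumptions the horizontal disparity works out to focal length × baseline ÷ depth, so nearer points shift more between the two images.

```python
def project(point, eye_x, focal_length=1.0):
    """Project a 3D point (x, y, z) onto the image plane of a pinhole
    camera located at (eye_x, 0, 0) and looking along +z."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * (x - eye_x) / z, focal_length * y / z)


def disparity(point, baseline=0.065, focal_length=1.0):
    """Horizontal disparity: x in the left image minus x in the right image.
    The default baseline is an assumed eye spacing in meters."""
    left_x, _ = project(point, -baseline / 2, focal_length)
    right_x, _ = project(point, +baseline / 2, focal_length)
    return left_x - right_x


if __name__ == "__main__":
    # Nearer points produce a larger disparity than distant ones.
    for depth in (0.5, 1.0, 2.0, 10.0):
        print(f"depth {depth:5.1f} m -> disparity {disparity((0.0, 0.0, depth)):.4f}")
```

The brain uses this depth-dependent shift as its depth cue, which is why rendering the same scene from two horizontally offset cameras is enough to create the effect.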

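"All right, first let's reproduce those two different viewpoints in a program," Kenta thought.

He began by creating a virtual 3D space and drawing a cube in it. Next, he wrote a program that shifted the camera slightly to the left and to the right and observed the cube from each viewpoint. It was just like looking at an object with his own eyes: the camera saw the cube from two slightly different angles.

"Now I can generate an image for the left eye and one for the right eye." Kenta took a breath.

He wrote code to display the two images side by side. Two cubes appeared on the screen, one on the left and one on the right. But simply looking at them gave no sense of depth. Kenta racked his brain.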

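"How can I make these two images look three-dimensional..." he wondered. Then it hit him.

"That's it, let's make use of parallax!"

He built a partition out of cardboard so that each eye would see only its own image. Then he set his smartphone in the device and displayed the generated images, arranged so that the right image reached the right eye and the left image reached the left eye.

With a mixture of hope and anxiety, Kenta peered at the smartphone. At that moment, a solid cube rose up before his eyes. The 3D image looked so real that it seemed to exist right there, and Kenta felt both surprise and joy.

"I did it! This is it! This is real stereoscopic vision!" he cried out in spite of himself.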

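That day, Kenta kept gazing at the 3D image again and again. The stereoscopic program he had built with his own hands had shown him a new world. In a small room in Tokyo, Kenta had achieved a big goal. And this success gave him the drive to take on new challenges.

"Next, maybe I'll try building a more complex 3D scene..." Kenta let his thoughts drift to his next adventure.

That night, under the Tokyo sky, a programmer had acquired a new skill. It would give him a new point of view and open a path to the future.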

It looks quite three-dimensional. Surprising. The brain perceives it as a solid object.

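Making the partition:

Using cardboard or thick paper, make a partition (divider) that stands vertically between the right eye and the left eye. With it, the right eye sees only the right-eye image and the left eye sees only the left-eye image.

Positioning the partition:

Place the partition along the line from your forehead to the tip of your nose so that it splits the left and right fields of view. Each eye then sees only its corresponding image, which makes stereoscopic viewing possible.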

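Viewing the images:

For example, look at two images displayed side by side (one for the left eye, one for the right eye), each with the corresponding eye. If the partition cleanly separates the left and right fields of view, the image should appear three-dimensional. Results vary from person to person, though, and it takes some getting used to. (A VR headset from a 100-yen shop works best.)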
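One practical detail when viewing with only a partition and no lenses: the eyes do not comfortably turn outward, so the centers of the two images should sit no farther apart than the distance between the viewer's eyes. The sketch below is not from the original article. It assumes an eye spacing of 63 mm and a screen whose pixel density you supply, and the function name `max_image_width_px` is hypothetical. It estimates how wide each half of an edge-to-edge side-by-side pair can be.

```python
MM_PER_INCH = 25.4


def max_image_width_px(pixels_per_inch, eye_distance_mm=63.0):
    """Widest each half of an edge-to-edge side-by-side pair can be, in
    pixels, so that the two image centers are no farther apart than the
    eyes. The default eye distance is an assumed value, not a measurement."""
    if pixels_per_inch <= 0 or eye_distance_mm <= 0:
        raise ValueError("inputs must be positive")
    return int(eye_distance_mm / MM_PER_INCH * pixels_per_inch)


if __name__ == "__main__":
    # Example with an assumed 400 ppi phone screen.
    print(max_image_width_px(400))
```

If the rendered images are wider than this on your screen, scaling the pair down before viewing may make the two halves easier to fuse.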

Code that generates slightly different images for the left eye and the right eye.

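from itertools import combinations, product

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (registers the 3D projection on older Matplotlib)

# Draw the scene on the given 3D axes, seen from a horizontally shifted camera
def plot_3d_scene(ax, camera_shift):
    ax.set_xlim([-10, 10])
    ax.set_ylim([-10, 10])
    ax.set_zlim([-10, 10])

    # Draw the 12 edges of a cube centered at the origin:
    # connect every pair of corners that differ along exactly one axis
    r = [-2, 2]
    corners = np.array(list(product(r, r, r)))
    for p, q in combinations(corners, 2):
        if np.count_nonzero(p != q) == 1:
            ax.plot3D(*zip(p, q), color='b')

    # Rotate the camera horizontally by camera_shift degrees
    ax.view_init(elev=20., azim=60. + camera_shift)

# Generate the left-eye image
fig = plt.figure()
ax_left = fig.add_subplot(121, projection='3d')
plot_3d_scene(ax_left, camera_shift=-5)
ax_left.set_title("Left Eye")

# Generate the right-eye image
ax_right = fig.add_subplot(122, projection='3d')
plot_3d_scene(ax_right, camera_shift=5)
ax_right.set_title("Right Eye")

plt.show()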

3D Stereo Renderer: rendering spheres

# Install the required libraries (the leading "!" is Jupyter/Colab notebook syntax)
!pip install numpy matplotlib pillow gradio

import numpy as np
from PIL import Image
import gradio as gr

# Vec3: a class that represents a 3D vector
class Vec3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __add__(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __sub__(self, other):
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)

    def __mul__(self, other):
        if isinstance(other, Vec3):
            return Vec3(self.x * other.x, self.y * other.y, self.z * other.z)
        else:
            return Vec3(self.x * other, self.y * other, self.z * other)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z

    def normalize(self):
        length = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self * (1.0 / length)

    def reflect(self, normal):
        return self - normal * 2 * self.dot(normal)

    def __neg__(self):
        return Vec3(-self.x, -self.y, -self.z)

    def to_color(self):
        return (int(255 * np.clip(self.x, 0, 1)), int(255 * np.clip(self.y, 0, 1)), int(255 * np.clip(self.z, 0, 1)))

# Sphere: a class that represents a sphere
class Sphere:
    def __init__(self, center, radius, color, specular):
        self.center = center
        self.radius = radius
        self.color = color
        self.specular = specular

    def intersect(self, ray_origin, ray_dir):
        oc = ray_origin - self.center
        a = ray_dir.dot(ray_dir)
        b = 2.0 * oc.dot(ray_dir)
        c = oc.dot(oc) - self.radius * self.radius
        discriminant = b * b - 4 * a * c

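        if discriminant < 0:
            return False, None

        # t1 <= t2 because a > 0; return the nearest hit in front of the ray origin
        sqrt_d = np.sqrt(discriminant)
        t1 = (-b - sqrt_d) / (2.0 * a)
        t2 = (-b + sqrt_d) / (2.0 * a)
        if t1 > 0:
            return True, t1
        if t2 > 0:
            return True, t2
        return False, None  # the sphere is entirely behind the ray origin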

def ray_trace(ray_origin, ray_dir, spheres, light, light_strength):
    color = Vec3(0, 0, 0)
    nearest_t = float('inf')
    hit_sphere = None

    for sphere in spheres:
        hit, t = sphere.intersect(ray_origin, ray_dir)
        if hit and t < nearest_t:
            nearest_t = t
            hit_sphere = sphere

    if hit_sphere:
        hit_point = ray_origin + ray_dir * nearest_t
        normal = (hit_point - hit_sphere.center).normalize()
        view_dir = -ray_dir
        light_dir = (light - hit_point).normalize()
        # reflect() expects an incoming vector, so negate light_dir (which points toward the light)
        reflect_dir = (-light_dir).reflect(normal)

        diffuse = max(normal.dot(light_dir), 0)
        specular = max(view_dir.dot(reflect_dir), 0) ** hit_sphere.specular
        color = hit_sphere.color * (diffuse * light_strength + specular)

    return color

def render(image_width, image_height, spheres, light, light_strength, camera_offset):
    aspect_ratio = image_width / image_height
    camera_origin = Vec3(camera_offset, 0, -1)
    image = Image.new("RGB", (image_width, image_height))

    for y in range(image_height):
        for x in range(image_width):
            pixel_x = (2 * (x + 0.5) / image_width - 1) * aspect_ratio
            pixel_y = 1 - 2 * (y + 0.5) / image_height
            pixel_pos = Vec3(pixel_x, pixel_y, 0)

            ray_dir = (pixel_pos - camera_origin).normalize()
            color = ray_trace(camera_origin, ray_dir, spheres, light, light_strength)
            image.putpixel((x, y), color.to_color())

    return image

def generate_random_scene(num_spheres, min_specular, max_specular):
    spheres = []
    for _ in range(num_spheres):
        center = Vec3(np.random.uniform(-1, 1), np.random.uniform(-1, 1), np.random.uniform(1, 3))
        radius = np.random.uniform(0.4, 0.6)
        color = Vec3(np.random.rand(), np.random.rand(), np.random.rand())
        specular = np.random.uniform(min_specular, max_specular)
        spheres.append(Sphere(center, radius, color, specular))
    return spheres

def create_image_pair(num_spheres, min_specular, max_specular, light_strength):
    image_width = 800
    image_height = 400
    light = Vec3(2, 2, -1)

    spheres = generate_random_scene(num_spheres, min_specular, max_specular)
    
    # Render the left and right images for stereoscopic viewing
    left_image = render(image_width, image_height, spheres, light, light_strength, camera_offset=-0.05)
    right_image = render(image_width, image_height, spheres, light, light_strength, camera_offset=0.05)

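    # Place the images side by side; the right image overwrites the last `overlap` columns of the left one
    overlap = 100  # width of the overlap in pixels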
    combined_image_width = 2 * image_width - overlap
    combined_image_height = image_height
    combined_image = Image.new("RGB", (combined_image_width, combined_image_height))

    combined_image.paste(left_image, (0, 0))
    combined_image.paste(right_image, (image_width - overlap, 0))

    return combined_image

# Function called by the Gradio interface
def generate_combined_image(num_spheres, min_specular, max_specular, light_strength):
    combined_image = create_image_pair(num_spheres, min_specular, max_specular, light_strength)
    return combined_image

# Configure the Gradio interface
interface = gr.Interface(
    fn=generate_combined_image,
    inputs=[
        gr.Slider(1, 10, value=5, step=1, label="Number of Spheres"),
        gr.Slider(10, 300, value=100, step=10, label="Min Specular"),
        gr.Slider(10, 300, value=200, step=10, label="Max Specular"),
        gr.Slider(0.1, 5.0, value=1.0, step=0.1, label="Light Strength")
    ],
    outputs="image",
    live=False,
    title="3D Stereo Renderer",
    description="Adjust the number of spheres, their specular exponents, and the light strength. Press 'Submit' to generate the combined stereo image."
)

# Launch the Gradio interface
interface.launch()
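The `intersect` method of `Sphere` solves a quadratic in the ray parameter t. Substituting the ray o + t·d into |p − c|² = r² gives a·t² + b·t + c = 0 with a = d·d, b = 2(o − c)·d and c = (o − c)·(o − c) − r². The standalone sketch below is an illustration, not part of the original script: it uses plain tuples instead of `Vec3`, the function name `ray_sphere_t` is hypothetical, and it assumes a non-zero direction vector. It checks the formula against cases that can be worked out by hand.

```python
import math


def ray_sphere_t(origin, direction, center, radius):
    """Return the smallest positive ray parameter t at which the ray
    origin + t * direction hits the sphere, or None if it misses."""
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    discriminant = b * b - 4.0 * a * c
    if discriminant < 0:
        return None  # the ray's line never touches the sphere
    root = math.sqrt(discriminant)
    # Try the nearer root first, then the farther one.
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t > 0:
            return t
    return None  # both hits are behind the ray origin


if __name__ == "__main__":
    # Camera at z = -1 looking along +z at a sphere of radius 0.5 centered at z = 2:
    # the near surface is at z = 1.5, so t should be 2.5.
    print(ray_sphere_t((0, 0, -1), (0, 0, 1), (0, 0, 2), 0.5))
```

Checking a known case like this is a quick way to catch sign mistakes in the coefficients before rendering a full image.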

3D Stereo Renderer: rendering boxes

# Install the required libraries (the leading "!" is Jupyter/Colab notebook syntax)
!pip install numpy matplotlib pillow gradio

import numpy as np
from PIL import Image
import gradio as gr

# Vec3: a class that represents a 3D vector
class Vec3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

    def __add__(self, other):
        return Vec3(self.x + other.x, self.y + other.y, self.z + other.z)

    def __sub__(self, other):
        return Vec3(self.x - other.x, self.y - other.y, self.z - other.z)

    def __mul__(self, other):
        if isinstance(other, Vec3):
            return Vec3(self.x * other.x, self.y * other.y, self.z * other.z)
        else:
            return Vec3(self.x * other, self.y * other, self.z * other)

    def dot(self, other):
        return self.x * other.x + self.y * other.y + self.z * other.z

    def normalize(self):
        length = np.sqrt(self.x**2 + self.y**2 + self.z**2)
        return self * (1.0 / length)

    def reflect(self, normal):
        return self - normal * 2 * self.dot(normal)

    def __neg__(self):
        return Vec3(-self.x, -self.y, -self.z)

    def to_color(self):
        return (int(255 * np.clip(self.x, 0, 1)), int(255 * np.clip(self.y, 0, 1)), int(255 * np.clip(self.z, 0, 1)))

# Box: a class that represents an axis-aligned rectangular box
class Box:
    def __init__(self, min_corner, max_corner, color, specular):
        self.min_corner = min_corner
        self.max_corner = max_corner
        self.color = color
        self.specular = specular

    def intersect(self, ray_origin, ray_dir):
        # Slab method: narrow the ray's [t_near, t_far] interval with each axis-aligned slab
        t_near, t_far = float('-inf'), float('inf')
        for axis in ('x', 'y', 'z'):
            origin = getattr(ray_origin, axis)
            direction = getattr(ray_dir, axis)
            slab_min = getattr(self.min_corner, axis)
            slab_max = getattr(self.max_corner, axis)

            if direction == 0:
                # Ray is parallel to this slab: it misses unless the origin lies inside it
                if origin < slab_min or origin > slab_max:
                    return False, None
                continue

            t0 = (slab_min - origin) / direction
            t1 = (slab_max - origin) / direction
            if t0 > t1:
                t0, t1 = t1, t0

            t_near = max(t_near, t0)
            t_far = min(t_far, t1)
            if t_near > t_far:
                return False, None

        if t_far <= 0:
            return False, None  # the box is entirely behind the ray origin
        return True, (t_near if t_near > 0 else t_far)

def ray_trace(ray_origin, ray_dir, boxes, light, light_strength):
    color = Vec3(0, 0, 0)
    nearest_t = float('inf')
    hit_box = None

    for box in boxes:
        hit, t = box.intersect(ray_origin, ray_dir)
        if hit and t < nearest_t:
            nearest_t = t
            hit_box = box

    if hit_box:
        hit_point = ray_origin + ray_dir * nearest_t
        normal = Vec3(0, 0, 0)

        # Determine which face was hit
        if np.isclose(hit_point.x, hit_box.min_corner.x):
            normal = Vec3(-1, 0, 0)
        elif np.isclose(hit_point.x, hit_box.max_corner.x):
            normal = Vec3(1, 0, 0)
        elif np.isclose(hit_point.y, hit_box.min_corner.y):
            normal = Vec3(0, -1, 0)
        elif np.isclose(hit_point.y, hit_box.max_corner.y):
            normal = Vec3(0, 1, 0)
        elif np.isclose(hit_point.z, hit_box.min_corner.z):
            normal = Vec3(0, 0, -1)
        elif np.isclose(hit_point.z, hit_box.max_corner.z):
            normal = Vec3(0, 0, 1)

        view_dir = -ray_dir
        light_dir = (light - hit_point).normalize()
        # reflect() expects an incoming vector, so negate light_dir (which points toward the light)
        reflect_dir = (-light_dir).reflect(normal)

        diffuse = max(normal.dot(light_dir), 0)
        specular = max(view_dir.dot(reflect_dir), 0) ** hit_box.specular
        color = hit_box.color * (diffuse * light_strength + specular)

    return color

def render(image_width, image_height, boxes, light, light_strength, camera_offset):
    aspect_ratio = image_width / image_height
    camera_origin = Vec3(camera_offset, 0, -1)
    image = Image.new("RGB", (image_width, image_height))

    for y in range(image_height):
        for x in range(image_width):
            pixel_x = (2 * (x + 0.5) / image_width - 1) * aspect_ratio
            pixel_y = 1 - 2 * (y + 0.5) / image_height
            pixel_pos = Vec3(pixel_x, pixel_y, 0)

            ray_dir = (pixel_pos - camera_origin).normalize()
            color = ray_trace(camera_origin, ray_dir, boxes, light, light_strength)
            image.putpixel((x, y), color.to_color())

    return image

def generate_random_boxes(num_boxes, min_specular, max_specular):
    boxes = []
    for _ in range(num_boxes):
        min_corner = Vec3(np.random.uniform(-1, 0), np.random.uniform(-1, 0), np.random.uniform(1, 2))
        max_corner = min_corner + Vec3(np.random.uniform(0.4, 0.6), np.random.uniform(0.4, 0.6), np.random.uniform(0.4, 0.6))
        color = Vec3(np.random.rand(), np.random.rand(), np.random.rand())
        specular = np.random.uniform(min_specular, max_specular)
        boxes.append(Box(min_corner, max_corner, color, specular))
    return boxes

def create_image_pair(num_boxes, min_specular, max_specular, light_strength):
    image_width = 800
    image_height = 400
    light = Vec3(2, 2, -1)

    boxes = generate_random_boxes(num_boxes, min_specular, max_specular)
    
    # Render the left and right images for stereoscopic viewing
    left_image = render(image_width, image_height, boxes, light, light_strength, camera_offset=-0.05)
    right_image = render(image_width, image_height, boxes, light, light_strength, camera_offset=0.05)

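    # Place the images side by side; the right image overwrites the last `overlap` columns of the left one
    overlap = 100  # width of the overlap in pixels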
    combined_image_width = 2 * image_width - overlap
    combined_image_height = image_height
    combined_image = Image.new("RGB", (combined_image_width, combined_image_height))

    combined_image.paste(left_image, (0, 0))
    combined_image.paste(right_image, (image_width - overlap, 0))

    return combined_image

# Function called by the Gradio interface
def generate_combined_image(num_boxes, min_specular, max_specular, light_strength):
    combined_image = create_image_pair(num_boxes, min_specular, max_specular, light_strength)
    return combined_image

# Configure the Gradio interface
interface = gr.Interface(
    fn=generate_combined_image,
    inputs=[
        gr.Slider(1, 10, value=5, step=1, label="Number of Boxes"),
        gr.Slider(10, 300, value=100, step=1, label="Min Specular"),
        gr.Slider(10, 300, value=100, step=1, label="Max Specular"),
        gr.Slider(0.1, 5.0, value=1.0, step=0.1, label="Light Strength")
    ],
    outputs=gr.Image(type="pil"),
    title="Stereo 3D Box Renderer"
)

# Launch the Gradio interface
interface.launch()
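Both renderers shade a hit point with a diffuse term plus a Phong-style specular term. As a rough standalone illustration (not the scripts' own code: it uses plain tuples, assumes unit vectors that point away from the surface, and the names `mirror` and `phong` are hypothetical), the specular term peaks when the view direction coincides with the mirror direction R = 2(N·L)N − L of the light about the normal:

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def mirror(light_dir, normal):
    """Mirror direction R = 2 (N.L) N - L, for unit vectors N and L,
    where L points from the surface toward the light."""
    k = 2.0 * dot(normal, light_dir)
    return tuple(k * n - l for n, l in zip(normal, light_dir))


def phong(normal, light_dir, view_dir, shininess, light_strength=1.0):
    """Scalar intensity: diffuse * light_strength + specular."""
    diffuse = max(dot(normal, light_dir), 0.0)
    specular = max(dot(view_dir, mirror(light_dir, normal)), 0.0) ** shininess
    return diffuse * light_strength + specular


if __name__ == "__main__":
    n = (0.0, 0.0, 1.0)
    l = normalize((1.0, 0.0, 1.0))
    # Viewing along the mirror direction gives the full highlight ...
    print(phong(n, l, normalize((-1.0, 0.0, 1.0)), shininess=100))
    # ... while viewing from the light's own direction gives diffuse only.
    print(phong(n, l, l, shininess=100))
```

A larger `shininess` exponent makes the highlight smaller and sharper, which is the quantity the "Min Specular" and "Max Specular" sliders control.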
