Getting the ARCore (for Unity API) camera image as a Texture2D or JPEG

Posted at 2018-10-03, last updated at 2018-10-03

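Overview

In the ARCore for Unity API, the camera image is rendered as a texture by the ARCoreBackgroundRenderer.cs script on the FirstPersonCamera of the ARCoreDevice prefab.

This article covers how to get that texture image in JPG format.

Once the image is available as JPG, you can do more with it: run your own image processing, or send the camera image to a cloud API for an AI service.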

version

Unity: 2018.2.5f1
ARCore: v1.4.1

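Introduction

The basic ways to get the ARCore camera image are the Texture property of the GoogleARCore.Frame.CameraImage class and its AcquireCameraImageBytes method.

With either one, the camera image comes back in YUV format, not RGB. Getting it as a Texture would be convenient, but on an actual Android device the Texture property does not seem to work for this purpose (details below).

Separately, the TextureReader in the official ComputerVisionExample can return the camera image in RGBA format.

This article covers the TextureReader approach and the AcquireCameraImageBytes approach.
Of the two, I strongly recommend the TextureReader approach.

(Taking a screenshot with ReadPixels is apparently another option, but I have not tried it. I suspect the performance cost is significant.)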

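Using TextureReader (recommended)

This uses TextureReader.cs from the ComputerVisionExample bundled with the ARCore SDK.
TextureReader.cs uses TextureReaderApi.cs to extract the camera image data at a specified resolution, as a raw pixel buffer that can be loaded into a Texture2D.
TextureReaderApi.cs does the work in native C++ code behind the scenes, so it is faster than writing the processing yourself in C#.

Pros
- Fast.
- No need to write your own YUV-to-RGB conversion.
- The data passes through a byte[], so you can read each pixel's RGB values and run your own image processing.
Cons
- None that I can think of.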

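How to use TextureReader.cs

1. Attach TextureReader.cs to some object in the Scene.
2. From another script, get the TextureReader component and register a handler (OnImageAvailableCallbackFunc in the examples here) on its OnImageAvailableCallback.
3. In OnImageAvailableCallbackFunc, implement what to do with the camera image data (IntPtr) when it arrives.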

Parameters adjustable from the TextureReader inspector

Image Width, Image Height: size of the image to get
Image Sample Mode: aspect ratio of the image to get
- Keep Aspect Ratio: keeps the aspect ratio of the internal texture (the internal texture size is defined as the constant 1920x1080 in TextureReader.cs)
- Cover Full Viewport: resizes to the ratio of Image Width to Image Height
Image Format: color of the image to get
- Image Format Grayscale: grayscale
- Image Format Color: color (RGBA)

For setting data from the IntPtr into a Texture2D, EdgeDetector.cs in ComputerVisionExample is a useful reference.
Conversion to JPG is done with Texture2D's EncodeToJPG method.

A rough example:

public byte[] jpg;

private void OnImageAvailableCallbackFunc(TextureReaderApi.ImageFormatType format, int width, int height, IntPtr pixelBuffer, int bufferSize){
    byte[] data = new byte[bufferSize];
    System.Runtime.InteropServices.Marshal.Copy(pixelBuffer, data, 0, bufferSize);
    Texture2D _tex = new Texture2D(width, height, TextureFormat.RGBA32, false, false);
    _tex.LoadRawTextureData(data);
    _tex.Apply();
    jpg = _tex.EncodeToJPG();
}

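However, if you load the buffer from TextureReader into a Texture2D as-is, the result is mirrored left-to-right and the JPG image comes out rotated by 270°.
You can correct this yourself while the data is still a byte[].
Note: the mirroring is presumably caused by the difference between the texture coordinate system and the image coordinate system.
Note: the 270° rotation applies when the Android app runs in Portrait. In Landscape the image may not be rotated.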
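
The correction boils down to a fixed index remapping, which is easier to see on a tiny image. The sketch below is plain C# with no Unity dependency and uses a single channel; OrientationDemo and Remap are illustrative names, not part of the SDK. It applies the same mapping as the Rotate90AndFlip method in the wrapper class shown later in this article: the source pixel at (row i, column j) of a width x height image goes to (row width-1-j, column height-1-i) of a height x width image.

```csharp
using System;

public static class OrientationDemo
{
    // Remaps a single-channel width x height image into a height x width image.
    // Source pixel (row i, column j) is written to destination
    // (row width-1-j, column height-1-i).
    public static byte[] Remap(byte[] src, int width, int height)
    {
        var dst = new byte[width * height];
        for (int i = 0; i < height; i++)
        {
            for (int j = 0; j < width; j++)
            {
                dst[(width - 1 - j) * height + (height - 1 - i)] = src[i * width + j];
            }
        }
        return dst;
    }

    public static void Main()
    {
        // 3 wide, 2 high. Rows are [1 2 3] and [4 5 6].
        byte[] src = { 1, 2, 3, 4, 5, 6 };
        byte[] dst = Remap(src, 3, 2);

        // The result is 2 wide, 3 high. Rows are [6 3], [5 2], [4 1].
        Console.WriteLine(string.Join(",", dst)); // prints 6,3,5,2,4,1
    }
}
```

Because the output has its width and height swapped, the Texture2D that receives the remapped data has to be created with the two dimensions exchanged.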

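You can set Image Width and Image Height in the inspector, but in most cases you will want the aspect ratio to match the image the camera captures. So I think it is best to set Image Sample Mode to Cover Full Viewport and set the size from script, using the size of the image captured by the ARCore camera.

I wrote a wrapper class that handles all of this.
Attach it to the object that has TextureReader.cs attached, and get the Texture through its FrameTexture property.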

TextureReaderWrapper.cs

using System;
using UnityEngine;
using GoogleARCore;
using GoogleARCore.Examples.ComputerVision;

public class TextureReaderWrapper : MonoBehaviour {
    /// <summary>
    /// Ratio of the requested Texture size to the camera image size
    /// </summary>
    public float TextureSizeRatio = 1.0f;

    /// <summary>
    /// Data of the most recent camera image
    /// </summary>
    private TextureReaderApi.ImageFormatType format;
    private int width;
    private int height;
    private IntPtr pixelBuffer;
    private int bufferSize = 0;

    /// <summary>
    /// API used to get the camera image
    /// </summary>
    private TextureReader TextureReader = null;

    /// <summary>
    /// Whether TextureReader has been configured to match the camera image size
    /// </summary>
    private bool setFrameSizeToTextureReader = false;


    public void Awake()
    {
        // Register the callback invoked when a camera image becomes available
        TextureReader = GetComponent<TextureReader>();
        TextureReader.OnImageAvailableCallback += OnImageAvailableCallbackFunc;
    }

    private void OnImageAvailableCallbackFunc(TextureReaderApi.ImageFormatType format, int width, int height, IntPtr pixelBuffer, int bufferSize)
    {
        this.format = format;
        this.width = width;
        this.height = height;
        this.pixelBuffer = pixelBuffer;
        this.bufferSize = bufferSize;
    }


    // Use this for initialization
    void Start()
    {
    }

    // Update is called once per frame
    void Update()
    {
        // Set the camera image size on TextureReader. This runs only once
        if (!setFrameSizeToTextureReader)
        {
            using (var image = Frame.CameraImage.AcquireCameraImageBytes())
            {
                if (!image.IsAvailable)
                {
                    return;
                }

                TextureReader.ImageWidth = (int)(image.Width * TextureSizeRatio);
                TextureReader.ImageHeight = (int)(image.Height * TextureSizeRatio);
                TextureReader.Apply();

                setFrameSizeToTextureReader = true;
            }
        }
    }

    public Texture2D FrameTexture
    {
        get
        {
            if (bufferSize != 0)
            {
                // Copy the image data out of the pointer provided by TextureReader
                byte[] data = new byte[bufferSize];
                System.Runtime.InteropServices.Marshal.Copy(pixelBuffer, data, 0, bufferSize);
                // The image is rotated 270 degrees and mirrored, so correct it
                byte[] correctedData = Rotate90AndFlip(data, width, height, format == TextureReaderApi.ImageFormatType.ImageFormatGrayscale);
                // Create the Texture2D; width and height are swapped because of the 90-degree rotation
                Texture2D _tex = new Texture2D(height, width, TextureFormat.RGBA32, false, false);
                _tex.LoadRawTextureData(correctedData);
                _tex.Apply();

                return _tex;
            }
            else
            {
                return null;
            }
        }
    }


    private byte[] Rotate90AndFlip(byte[] img, int width, int height, bool isGrayscale)
    {
        int srcChannels = isGrayscale ? 1 : 4;
        int dstChannels = 4; // output is always RGBA32
        byte[] newImg = new byte[width * height * dstChannels];

        for (int i = 0; i < height; i++)
        {
            for (int j = 0; j < width; j++)
            {
                // index into img
                int p = (i * width + j) * srcChannels;

                // index into newImg, with the 90-degree rotation and flip applied
                int np = ((width - j - 1) * height + (height - i - 1)) * dstChannels;

                // Expand grayscale so it can still be handled as RGB
                if (isGrayscale)
                {
                    newImg[np] = img[p]; // R
                    newImg[np + 1] = img[p]; // G
                    newImg[np + 2] = img[p]; // B
                    newImg[np + 3] = 255; // A
                }
                else
                {
                    for (int c = 0; c < dstChannels; c++)
                    {
                        newImg[np + c] = img[p + c];
                    }
                }
            }
        }

        return newImg;
    }
}
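
As a usage sketch for the wrapper, the component below saves the current frame as a JPG file. FrameJpegSaver, SaveCurrentFrame, the JpegQuality field, and the file name frame.jpg are all made up for illustration; it assumes the component sits on the same GameObject as TextureReaderWrapper. Since FrameTexture allocates a new Texture2D on every access, the sketch destroys the texture once the JPG bytes have been produced.

```csharp
using System.IO;
using UnityEngine;

// Hypothetical helper: attach to the same GameObject as TextureReaderWrapper.
public class FrameJpegSaver : MonoBehaviour
{
    // JPEG quality passed to EncodeToJPG (1 to 100).
    public int JpegQuality = 75;

    private TextureReaderWrapper wrapper;

    private void Awake()
    {
        wrapper = GetComponent<TextureReaderWrapper>();
    }

    // Call this from a UI button or similar.
    // Returns the saved file path, or null if no camera image has arrived yet.
    public string SaveCurrentFrame()
    {
        Texture2D tex = wrapper.FrameTexture;
        if (tex == null)
        {
            return null;
        }

        byte[] jpg = tex.EncodeToJPG(JpegQuality);

        // FrameTexture creates a new Texture2D each time, so release it explicitly.
        Destroy(tex);

        string path = Path.Combine(Application.persistentDataPath, "frame.jpg");
        File.WriteAllBytes(path, jpg);
        return path;
    }
}
```

Reading FrameTexture every frame would repeat the copy and the rotation on the CPU each time, so it is probably better to read it only when an image is actually needed.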

References:
google-ar/arcore-unity-sdk Issues - Camera Capture Image #221

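Using AcquireCameraImageBytes

The CameraImageBytes returned by the AcquireCameraImageBytes method is, as the API documentation says, in "YUV-420-888 format".
The article linked here was very helpful for understanding the format.

The plan is to convert the YUV CameraImageBytes object to JPG.
This article describes two ways to do it:

1. Use Android native functionality
2. Compute the RGB value of each pixel yourself

The Y, U, and V properties of CameraImageBytes give pointers to the start of each channel's data, and the Width and Height properties give the image size, so the conversion is built on these.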
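
To get a feel for how much data sits behind those pointers, the following sketch works out the plane sizes of a 4:2:0 image. It is plain C# with no Unity dependency; Yuv420Layout and its methods are illustrative names, the 640x480 size is just an example, and the arithmetic assumes even dimensions and no row padding.

```csharp
using System;

public static class Yuv420Layout
{
    // Bytes in the full-resolution Y (luma) plane, assuming no row padding.
    public static int YPlaneSize(int width, int height)
    {
        return width * height;
    }

    // Number of U samples (and, equally, of V samples).
    // Chroma is subsampled to half the width and half the height of Y.
    public static int ChromaSampleCount(int width, int height)
    {
        return (width / 2) * (height / 2);
    }

    // Total bytes for Y + U + V.
    public static int TotalSize(int width, int height)
    {
        return YPlaneSize(width, height) + 2 * ChromaSampleCount(width, height);
    }

    public static void Main()
    {
        Console.WriteLine(YPlaneSize(640, 480));        // prints 307200
        Console.WriteLine(ChromaSampleCount(640, 480)); // prints 76800
        Console.WriteLine(TotalSize(640, 480));         // prints 460800
    }
}
```

The total is 1.5 times width * height, which is where the 1.5 factor in the buffer sizes used by the conversion code comes from.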

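1. Using Android native functionality

Android native functionality can be used from Unity through AndroidJavaClass, AndroidJavaObject, and so on.
This approach uses the compressToJpeg method of Android's android.graphics.YuvImage class to convert the YUV image directly to JPG.

Pros
- Little code.
Cons
- Slow.
- The output is JPG directly, so you cannot get per-pixel RGB values for your own image processing.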

    public static byte[] YuvImageToJpg(CameraImageBytes image, Rect cropRect)
    {
        // A YUV 4:2:0 frame occupies width * height * 1.5 bytes.
        // Note: this single copy assumes the chroma data directly follows the Y plane in memory, in NV21 order.
        int bufsize = (int)(image.Width * image.Height * 1.5);
        byte[] buf = new byte[bufsize];
        System.Runtime.InteropServices.Marshal.Copy(image.Y, buf, 0, bufsize);

        // Look up the ImageFormat.NV21 constant on the Java side
        AndroidJavaClass ImageFormat = new AndroidJavaClass("android.graphics.ImageFormat");
        int imageFormat = ImageFormat.GetStatic<int>("NV21");

        // Wrap the buffer in a YuvImage (null strides: no row padding assumed) and build the crop rectangle
        AndroidJavaObject yuvImage = new AndroidJavaObject("android.graphics.YuvImage", buf, imageFormat, image.Width, image.Height, null);
        AndroidJavaObject r = new AndroidJavaObject("android.graphics.Rect", (int)cropRect.xMin, (int)cropRect.yMin, (int)cropRect.xMax, (int)cropRect.yMax);
        AndroidJavaObject outStrm = new AndroidJavaObject("java.io.ByteArrayOutputStream");

        // Compress the cropped region to JPEG at quality 100
        yuvImage.Call<bool>("compressToJpeg", r, 100, outStrm);

        byte[] jpgImage = outStrm.Call<byte[]>("toByteArray");

        return jpgImage;
    }

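2. Computing the RGB value of each pixel yourself

Pros
- Relatively fast.
- The data passes through a byte[], so you can read each pixel's RGB values and run your own image processing.
Cons
- A lot of code, so bugs creep in easily.

Note: I forgot to include the code when I first posted this. It is added below.

Here is the conversion code. Depending on how the chroma data is stored, YUV_420_888 comes in several layouts (I420, NV21, NV12), and the code has to detect which one it is dealing with.
(My device uses NV21, so the NV12 and I420 code is untested. Treat it as a reference only.)
The code also lets you crop the image at the same time.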


    /// <summary>
    /// Converts a CameraImageBytes in YUV420_888 format to a Texture2D.
    /// Converting the ARCore image as-is yields an image rotated by -90 degrees and flipped both vertically and horizontally,
    /// so this is compensated for when the pixel values are stored.
    /// </summary>
    /// <param name="image">CameraImageBytes</param>
    /// <param name="r">Rectangle to crop</param>
    /// <returns>RGBA32 Texture2D of the cropped region</returns>
    public static Texture2D ConvertYuvImageToRGBA32Texture(CameraImageBytes image, Rect r)
    {
        // Top-left corner, width, and height of the crop rectangle
        int sx = (int)r.x;
        int sy = (int)r.y;
        int width = (int)r.width;
        int height = (int)r.height;

        // Pointers to the Y, U, and V data inside CameraImageBytes
        var ptrY = image.Y;
        var ptrU = image.U;
        var ptrV = image.V;

        // Width and height of the CameraImageBytes
        int iwidth = image.Width;
        int iheight = image.Height;

        // Byte array to load into the Texture2D, RGBA (4 channels)
        byte[] rgba = new byte[width * height * 4];

        if (image.UVPixelStride == 1)
        {
            // A UVPixelStride of 1 means I420: U and V are stored in separate regions.
            // !! This I420 code is untested !!

            // Allocate the byte arrays. U and V are subsampled to half the width and height of Y.
            byte[] bufferY = new byte[iwidth * iheight];
            byte[] bufferU = new byte[iwidth / 2 * iheight / 2];
            byte[] bufferV = new byte[iwidth / 2 * iheight / 2];

            // Copy the raw data into the byte arrays
            System.Runtime.InteropServices.Marshal.Copy(ptrY, bufferY, 0, iwidth * iheight);
            System.Runtime.InteropServices.Marshal.Copy(ptrU, bufferU, 0, iwidth / 2 * iheight / 2);
            System.Runtime.InteropServices.Marshal.Copy(ptrV, bufferV, 0, iwidth / 2 * iheight / 2);

            // Convert each pixel to RGBA
            for (int i = sy; i < sy + height; i++)
            {
                for (int j = sx; j < sx + width; j++)
                {
                    int yp = i * iwidth + j; // index into Y
                    int uvp = (int)(i / 2) * (int)(iwidth / 2) + (int)(j / 2); // index into U and V
                    var y = bufferY[yp];
                    var u = bufferU[uvp];
                    var v = bufferV[uvp];

                    // index into rgba, with the 90-degree rotation and flip applied
                    int p = (width - (j - sx) - 1) * height * 4 + (height - (i - sy) - 1) * 4;

                    // color conversion
                    float yf = (float)y - 16.0f;
                    float uf = (float)u - 128f;
                    float vf = (float)v - 128f;
                    rgba[p + 0] = (byte)clip((int)((298 * yf + 409 * vf + 128) / 256f)); // R
                    rgba[p + 1] = (byte)clip((int)((298 * yf - 100 * uf - 208 * vf + 128) / 256f)); // G
                    rgba[p + 2] = (byte)clip((int)((298 * yf + 516 * uf + 128) / 256f)); // B
                    rgba[p + 3] = 255; // A
                }
            }
        }else if(image.UVPixelStride == 2)
        {
            // A UVPixelStride of 2 means NV21 or NV12: U and V are interleaved in a single region.

            // NV21 stores V first (V, U, V, U, ...); NV12 stores U first (U, V, U, V, ...).
            // Whichever pointer is lower marks the start of the interleaved region.
            bool isNV21 = ptrV.ToInt64() < ptrU.ToInt64();
            var ptrUV = isNV21 ? ptrV : ptrU;

            // Move raw data into managed buffer.
            byte[] bufferY = new byte[iwidth * iheight];
            byte[] bufferUV = new byte[iwidth * iheight / 2];

            System.Runtime.InteropServices.Marshal.Copy(ptrY, bufferY, 0, iwidth * iheight);
            System.Runtime.InteropServices.Marshal.Copy(ptrUV, bufferUV, 0, iwidth * iheight / 2);


            for (int i = sy; i < sy + height; i++)
            {
                for (int j = sx; j < sx + width; j++)
                {
                    int yp = i * iwidth + j; // index into Y
                    var y = bufferY[yp];

                    int uvp = (int)(i / 2) * iwidth + (int)(j / 2) * 2; // index of this pixel's chroma pair

                    // Within each pair, V comes first for NV21 and U comes first for NV12.
                    var u = bufferUV[uvp + (isNV21 ? 1 : 0)];
                    var v = bufferUV[uvp + (isNV21 ? 0 : 1)];

                    // index into rgba, with the 90-degree rotation and flip applied
                    int p = (width - (j - sx) - 1) * height * 4 + (height - (i - sy) - 1) * 4;

                    // color conversion
                    float yf = (float)y - 16.0f;
                    float uf = (float)u - 128f;
                    float vf = (float)v - 128f;

                    rgba[p + 0] = (byte)clip((int)((298 * yf + 409 * vf + 128) / 256f)); // R
                    rgba[p + 1] = (byte)clip((int)((298 * yf - 100 * uf - 208 * vf + 128) / 256f)); // G
                    rgba[p + 2] = (byte)clip((int)((298 * yf + 516 * uf + 128) / 256f)); // B
                    rgba[p + 3] = 255; // A
                }
            }
        }

        // Store the result in a Texture2D
        // Width and height are swapped because of the 90-degree rotation
        Texture2D tex = new Texture2D(height, width, TextureFormat.RGBA32, false, false);
        tex.LoadRawTextureData(rgba);
        tex.Apply();

        return tex;
    }
    
    private static int clip(int v)
    {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }
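
The color conversion step can be checked in isolation against a few well-known values. The sketch below is plain C# with no Unity dependency; YuvColorDemo and YuvToRgb are illustrative names. It uses the same coefficients as the code above, which correspond to the common limited-range BT.601 conversion (Y in 16 to 235, U and V centered on 128). That the camera delivers exactly this range is an assumption here.

```csharp
using System;

public static class YuvColorDemo
{
    private static int Clip(int v)
    {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }

    // Converts one YUV sample to { R, G, B } with the same formula as the conversion code above.
    public static byte[] YuvToRgb(byte y, byte u, byte v)
    {
        float yf = y - 16f;
        float uf = u - 128f;
        float vf = v - 128f;

        byte r = (byte)Clip((int)((298 * yf + 409 * vf + 128) / 256f));
        byte g = (byte)Clip((int)((298 * yf - 100 * uf - 208 * vf + 128) / 256f));
        byte b = (byte)Clip((int)((298 * yf + 516 * uf + 128) / 256f));
        return new[] { r, g, b };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", YuvToRgb(16, 128, 128)));  // prints 0,0,0 (black)
        Console.WriteLine(string.Join(",", YuvToRgb(235, 128, 128))); // prints 255,255,255 (white)
        Console.WriteLine(string.Join(",", YuvToRgb(81, 90, 240)));   // prints 255,0,0 (red)
    }
}
```

The clipping matters: for the red sample, the blue channel evaluates to a negative value before it is clamped to 0.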

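The Texture property (does not work well)

The Texture property returns a Texture object for the camera image, but this Texture appears to hold YUV data inside an ARGB32 format, with a dedicated shader (the ARCore/ARBackground shader) making things come out right when the texture is drawn.
In other words, the format seems to be usable only for drawing as a texture on an object, not for image processing or JPG conversion.

A comment at line 256 of ARCoreAndroidLifecycleManager.cs says the following: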

           // The Unity-cached size and format of the texture (0x0, ARGB) is not the
           // actual format of the texture. This is okay because the texture is not
           // accessed by pixels, it is accessed with UV coordinates.

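In fact, the texture can be obtained properly with Instant Preview in the Unity Editor, but on an actual Android device it becomes a Texture of size (0, 0) and cannot be used.
The official ComputerVisionExample also uses the AcquireCameraImageBytes method when getting the image for image processing. (That example gets a grayscale image rather than a color one, so it simply copies the values from the Y property. If you want a color image, use the TextureReader.cs described above.)

References: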
google-ar/arcore-unity-sdk Issues - Frame.CameraImage.Texture size #76
google-ar/arcore-unity-sdk Issues - Why i can not get texture by API "Frame.CameraImage.Texture"? #330

Summary

To get the ARCore (for Unity API) camera image as a Texture2D or JPEG, TextureReader.cs from ComputerVisionExample is the recommended way.
