[Metal] Using Textures (MTLTexture) #iOS

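In a previous article (iOSの新グラフィックAPI - Metal入門してみる) I wrote a little about Metal.

My impression after trying it is that it is considerably easier to use than OpenGL.
OpenGL involves a lot of intricate setup, and its weak point is that at first it is hard to grasp what is going on.

With Metal, however, perhaps because the CPU and GPU share memory, the design is comparatively easy to follow, and coming to it after OpenGL it felt very comfortable to work with.

So, here are my notes on an example of using textures in Metal.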

Flow for using a texture

As noted above, using a texture is also very simple.
Roughly, the flow is:

1. Create an MTLTexture from an image file
   1. Create an MTLTextureDescriptor object
   2. Create an MTLTexture object
2. Create an MTLRenderPassDescriptor object
   1. Assign the created texture object to the texture property of the color attachment
3. Create an MTLSamplerState object (※1)
   1. Create an MTLSamplerDescriptor object
4. Set the texture information on the MTLRenderCommandEncoder

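It may look like a fair number of steps, but compared with OpenGL's "what is this even doing?" feeling, the code is much easier to follow.

※1 ... If you do not need elaborate settings, it seems the sampler can also be set up on the shader side.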

Creating an `MTLTexture` from an image file

We use a CGImageRef to write data into the texture.
An MTLTexture object can be created as follows.

// Load the image to use as the texture
UIImage *image = [UIImage imageNamed:@"hoge.jpg"];
CGImageRef imageRef = image.CGImage;

// Gather the required data, such as the size
NSUInteger width  = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);

// Color Space
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();

// Image layout information and memory allocation
NSUInteger componentCount = 4;
uint8_t *rawData = (uint8_t*)calloc(width * height * componentCount, sizeof(uint8_t));
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow   = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;

// Create the bitmap context backed by rawData
CGContextRef context = CGBitmapContextCreate(rawData, width, height, bitsPerComponent, bytesPerRow, colorSpace, (CGBitmapInfo)(kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big));

// Draw the image, which writes its pixels into rawData
CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);

// Release the Core Graphics objects created above
// (imageRef belongs to the UIImage, so it is not released here)
CGColorSpaceRelease(colorSpace);
CGContextRelease(context);

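// Create the MTLTextureDescriptor object.
// Specify the pixel format, width, and height.
// Note: mipmapped:YES allocates a mip chain, but only level 0 is filled below.
MTLTextureDescriptor *textureDescriptor = [MTLTextureDescriptor texture2DDescriptorWithPixelFormat:MTLPixelFormatRGBA8Unorm width:width height:height mipmapped:YES];

// Create the texture (`device` is an existing id<MTLDevice>)
id <MTLTexture> texture = [device newTextureWithDescriptor:textureDescriptor];

// Create an MTLRegion. (A 3D region can apparently be specified too, but this is a texture image, so `2D`.)
MTLRegion region = MTLRegionMake2D(0, 0, width, height);

// Write the image data into the texture
[texture replaceRegion:region mipmapLevel:0 withBytes:rawData bytesPerRow:bytesPerRow];

// replaceRegion copies the bytes, so the temporary buffer can be freed
free(rawData);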

※ Here texture is declared as a local variable, but in practice it has to be kept in a property or similar, otherwise it will be released, so manage its lifetime appropriately.
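
The descriptor above is created with mipmapped:YES, while replaceRegion only writes mipmapLevel 0. As a rough sketch of what a full mip chain looks like for a given image size, the plain C helpers below compute the number of levels (each level halves the larger dimension until it reaches 1) and the size of one dimension at a given level. The function names are invented for this example and are not Metal API.

```c
#include <stddef.h>

/* Number of levels in a full mip chain: floor(log2(max(w, h))) + 1.
   Computed by halving the larger dimension until it reaches zero. */
size_t mip_level_count(size_t width, size_t height) {
    size_t largest = width > height ? width : height;
    size_t levels = 0;
    while (largest > 0) {
        levels++;
        largest >>= 1;
    }
    return levels;
}

/* Size of one dimension at a given mip level (never smaller than 1). */
size_t mip_dimension(size_t base, size_t level) {
    size_t d = base >> level;
    return d > 0 ? d : 1;
}
```

For a 640x480 image this gives 10 levels, and level 3 is 80x60. If the lower levels are going to be sampled, they would need to be filled as well, for example by calling replaceRegion once per level with downscaled data.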

Creating an `MTLRenderPassDescriptor`

Create an MTLRenderPassDescriptor based on the texture we created.
Unlike OpenGL, all we do is assign it to the texture property. Very easy to follow.

// Create the MTLRenderPassDescriptor object
MTLRenderPassDescriptor *renderPassDescriptor = [MTLRenderPassDescriptor renderPassDescriptor];

// Assign the `texture` created above to colorAttachments
renderPassDescriptor.colorAttachments[0].texture     = texture;

// Configure what happens when this render pass runs
renderPassDescriptor.colorAttachments[0].loadAction  = MTLLoadActionClear;
renderPassDescriptor.colorAttachments[0].clearColor  = MTLClearColorMake(1.0, 0.0, 0.0, 1.0);
renderPassDescriptor.colorAttachments[0].storeAction = MTLStoreActionStore;

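What we set here is, roughly, configuration for the MTLRenderCommandEncoder object that is created next. The texture assigned to a color attachment is the one the pass renders into.

Incidentally, loadAction specifies what to do with previously written contents when this render pass runs.
So specifying MTLLoadActionClear clears them, and only what this render pass draws is shown.

Another option is MTLLoadActionLoad; with it the contents are not cleared and you can draw over them.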

Creating an `MTLSamplerState` object

To create an MTLSamplerState object, first create an MTLSamplerDescriptor.

MTLSamplerDescriptor *samplerDescriptor = [[MTLSamplerDescriptor alloc] init];
samplerDescriptor.minFilter = MTLSamplerMinMagFilterNearest;
samplerDescriptor.magFilter = MTLSamplerMinMagFilterLinear;
samplerDescriptor.sAddressMode = MTLSamplerAddressModeRepeat;
samplerDescriptor.tAddressMode = MTLSamplerAddressModeRepeat;

// Create the `MTLSamplerState`
self.sampler = [self.device newSamplerStateWithDescriptor:samplerDescriptor];

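What this does is configure the sampler.
minFilter and magFilter set the filter used when the texture is minified or magnified, and sAddressMode and tAddressMode set how coordinates beyond the texture's extent are handled (repeating, for example).

With these settings applied, we create the MTLSamplerState object.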
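
To make the address modes a little more concrete, here is a minimal plain C sketch of what happens to a texture coordinate outside the 0.0 to 1.0 range. It assumes the usual definitions: repeat keeps only the fractional part of the coordinate, and clamp-to-edge (MTLSamplerAddressModeClampToEdge) limits it to 0.0 to 1.0. The function names are made up for illustration; the real work is done by the GPU sampler.

```c
/* Repeat: keep only the fractional part, so 1.25 -> 0.25 and -0.25 -> 0.75. */
double wrap_repeat(double u) {
    long whole = (long)u;          /* truncates toward zero */
    if ((double)whole > u) {
        whole--;                   /* turn truncation into floor for negatives */
    }
    return u - (double)whole;
}

/* Clamp to edge: coordinates are limited to the range 0.0 to 1.0. */
double wrap_clamp_to_edge(double u) {
    if (u < 0.0) return 0.0;
    if (u > 1.0) return 1.0;
    return u;
}
```

With MTLSamplerAddressModeRepeat as configured above, a quad whose texture coordinates run from 0.0 to 2.0 would therefore show the image tiled twice along that axis.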

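Receiving the texture in the shader

To make the set-up texture usable in the shader, the ObjC side tells the shader about it, and the shader side declares that it receives it.

Objective-C side

// Helper method that takes a texture and creates an `MTLRenderPassDescriptor`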
- (MTLRenderPassDescriptor *)createRenderPassDescriptorWithTexture:(id <MTLTexture>)texture
{
    MTLRenderPassDescriptor *renderPassDescriptor = [[MTLRenderPassDescriptor alloc] init];
    renderPassDescriptor.colorAttachments[0].texture     = texture;
    renderPassDescriptor.colorAttachments[0].loadAction  = MTLLoadActionClear;
    renderPassDescriptor.colorAttachments[0].clearColor  = MTLClearColorMake(0.0, 104.0/255.0, 5.0/255.0, 1.0);
    renderPassDescriptor.colorAttachments[0].storeAction = MTLStoreActionStore;

    return renderPassDescriptor;
}

Shader side

fragment half4 video_fragment(VertexOut input [[stage_in]],
                              texture2d<float> tex2D [[texture(0)]]) {
    constexpr sampler s_quad(filter::linear);
    float4 color = tex2D.sample(s_quad, input.texCoord);
    return half4(color);
}

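Pixel formats

When creating a texture, you need to specify a pixel format appropriate to the data it stores.
For an image used as an ordinary texture, a format with components in RGBA order is probably the usual choice, but for depth and stencil buffers, which do not need that much storage, a different format may be specified.

Definition

Incidentally, this is how many formats are defined.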
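
As a rough illustration of why the format matters for storage, the plain C sketch below computes the bytes in one row and in mip level 0 of a tightly packed 2D texture from the bytes per pixel implied by the format name (for example 4 for RGBA8Unorm, 1 for R8Unorm or Stencil8). The function names are hypothetical, and the memory a GPU actually allocates may differ because of alignment or padding.

```c
#include <stddef.h>

/* Bytes in one row of pixels; this corresponds to the bytesPerRow value. */
size_t bytes_per_row(size_t width, size_t bytes_per_pixel) {
    return width * bytes_per_pixel;
}

/* Bytes needed for mip level 0 of a tightly packed 2D texture. */
size_t level0_bytes(size_t width, size_t height, size_t bytes_per_pixel) {
    return bytes_per_row(width, bytes_per_pixel) * height;
}
```

For a 512x512 texture this comes to 1,048,576 bytes with a 4-byte format such as RGBA8Unorm and 262,144 bytes with a 1-byte format such as Stencil8.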

defines-pixel-format

typedef enum : NSUInteger {
   MTLPixelFormatInvalid       = 0,
   /* Ordinary 8 bit formats */
   MTLPixelFormatA8Unorm       = 1,

   MTLPixelFormatR8Unorm       = 10,
   MTLPixelFormatR8Unorm_sRGB  = 11,
   MTLPixelFormatR8Snorm       = 12,
   MTLPixelFormatR8Uint        = 13,
   MTLPixelFormatR8Sint        = 14,
   /* Ordinary 16 bit formats */
   MTLPixelFormatR16Unorm      = 20,
   MTLPixelFormatR16Snorm      = 22,
   MTLPixelFormatR16Uint       = 23,
   MTLPixelFormatR16Sint       = 24,
   MTLPixelFormatR16Float      = 25,

   MTLPixelFormatRG8Unorm      = 30,
   MTLPixelFormatRG8Unorm_sRGB = 31,
   MTLPixelFormatRG8Snorm      = 32,
   MTLPixelFormatRG8Uint       = 33,
   MTLPixelFormatRG8Sint       = 34,
   /* Packed 16 bit formats */
   MTLPixelFormatB5G6R5Unorm      = 40,
   MTLPixelFormatA1BGR5Unorm      = 41,
   MTLPixelFormatABGR4Unorm       = 42,
   /* Ordinary 32 bit formats */
   MTLPixelFormatR32Uint          = 53,
   MTLPixelFormatR32Sint          = 54,
   MTLPixelFormatR32Float         = 55,

   MTLPixelFormatRG16Unorm        = 60,
   MTLPixelFormatRG16Snorm        = 62,
   MTLPixelFormatRG16Uint         = 63,
   MTLPixelFormatRG16Sint         = 64,
   MTLPixelFormatRG16Float        = 65,

   MTLPixelFormatRGBA8Unorm       = 70,
   MTLPixelFormatRGBA8Unorm_sRGB  = 71,
   MTLPixelFormatRGBA8Snorm       = 72,
   MTLPixelFormatRGBA8Uint        = 73,
   MTLPixelFormatRGBA8Sint        = 74,

   MTLPixelFormatBGRA8Unorm       = 80,
   MTLPixelFormatBGRA8Unorm_sRGB  = 81,
   /* Packed 32 bit formats */
   MTLPixelFormatRGB10A2Unorm     = 90,
   MTLPixelFormatRGB10A2Uint      = 91,
   MTLPixelFormatRG11B10Float     = 92,
   MTLPixelFormatRGB9E5Float      = 93,
   /* Ordinary 64 bit formats */
   MTLPixelFormatRG32Uint         = 103,
   MTLPixelFormatRG32Sint         = 104,
   MTLPixelFormatRG32Float        = 105,

   MTLPixelFormatRGBA16Unorm      = 110,
   MTLPixelFormatRGBA16Snorm      = 112,
   MTLPixelFormatRGBA16Uint       = 113,
   MTLPixelFormatRGBA16Sint       = 114,
   MTLPixelFormatRGBA16Float      = 115,
   /* Ordinary 128 bit formats */
   MTLPixelFormatRGBA32Uint       = 123,
   MTLPixelFormatRGBA32Sint       = 124,
   MTLPixelFormatRGBA32Float      = 125,
   /* Compressed formats. */
   /* PVRTC */
   MTLPixelFormatPVRTC_RGB_2BPP       = 160,
   MTLPixelFormatPVRTC_RGB_2BPP_sRGB  = 161,
   MTLPixelFormatPVRTC_RGB_4BPP       = 162,
   MTLPixelFormatPVRTC_RGB_4BPP_sRGB  = 163,
   MTLPixelFormatPVRTC_RGBA_2BPP      = 164,
   MTLPixelFormatPVRTC_RGBA_2BPP_sRGB = 165,
   MTLPixelFormatPVRTC_RGBA_4BPP      = 166,
   MTLPixelFormatPVRTC_RGBA_4BPP_sRGB = 167,
   /* ETC2 */
   MTLPixelFormatEAC_R11Unorm     = 170,
   MTLPixelFormatEAC_R11Snorm     = 172,
   MTLPixelFormatEAC_RG11Unorm    = 174,
   MTLPixelFormatEAC_RG11Snorm    = 176,
   MTLPixelFormatEAC_RGBA8        = 178,
   MTLPixelFormatEAC_RGBA8_sRGB   = 179,
   MTLPixelFormatETC2_RGB8        = 180,
   MTLPixelFormatETC2_RGB8_sRGB   = 181,
   MTLPixelFormatETC2_RGB8A1      = 182,
   MTLPixelFormatETC2_RGB8A1_sRGB = 183,

   MTLPixelFormatASTC_4x4_sRGB    = 186,
   MTLPixelFormatASTC_5x4_sRGB    = 187,
   MTLPixelFormatASTC_5x5_sRGB    = 188,
   MTLPixelFormatASTC_6x5_sRGB    = 189,
   MTLPixelFormatASTC_6x6_sRGB    = 190,
   MTLPixelFormatASTC_8x5_sRGB    = 192,
   MTLPixelFormatASTC_8x6_sRGB    = 193,
   MTLPixelFormatASTC_8x8_sRGB    = 194,
   MTLPixelFormatASTC_10x5_sRGB   = 195,
   MTLPixelFormatASTC_10x6_sRGB   = 196,
   MTLPixelFormatASTC_10x8_sRGB   = 197,
   MTLPixelFormatASTC_10x10_sRGB  = 198,
   MTLPixelFormatASTC_12x10_sRGB  = 199,
   MTLPixelFormatASTC_12x12_sRGB  = 200,

   MTLPixelFormatASTC_4x4_LDR     = 204,
   MTLPixelFormatASTC_5x4_LDR     = 205,
   MTLPixelFormatASTC_5x5_LDR     = 206,
   MTLPixelFormatASTC_6x5_LDR     = 207,
   MTLPixelFormatASTC_6x6_LDR     = 208,
   MTLPixelFormatASTC_8x5_LDR     = 210,
   MTLPixelFormatASTC_8x6_LDR     = 211,
   MTLPixelFormatASTC_8x8_LDR     = 212,
   MTLPixelFormatASTC_10x5_LDR    = 213,
   MTLPixelFormatASTC_10x6_LDR    = 214,
   MTLPixelFormatASTC_10x8_LDR    = 215,
   MTLPixelFormatASTC_10x10_LDR   = 216,
   MTLPixelFormatASTC_12x10_LDR   = 217,
   MTLPixelFormatASTC_12x12_LDR   = 218,

   MTLPixelFormatGBGR422          = 240,
   MTLPixelFormatBGRG422          = 241,

   MTLPixelFormatDepth32Float     = 252,
   MTLPixelFormatStencil8         = 253,
} MTLPixelFormat;

The documentation says:

The organization of color, depth, or stencil data storage in individual pixels of a MTLTexture object. There are three varieties of pixel formats: ordinary, packed, and compressed. The name of the pixel format specifies the order of components (for example, R, RG, RGB, RGBA, BGRA), bit depth per component (such as 8, 16, 32), and data type for the component (such as Half, Float, Sint, Snorm, Uint, Unorm).

8, 16, and 32 are simply bit counts, but the name also carries type information: Half, Float, Sint, Snorm, Uint, and Unorm.
At first I did not know what norm meant, but it apparently means the following.

Quoted from this article (norm and unorm in C++ AMP):

norm and unorm are wrappers over “float” and provide clamping behavior. norm and unorm clamp a floating point value into the range [-1.0, 1.0] and [0.0, 1.0] respectively.

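So Snorm values fall in the range -1.0 to 1.0 and Unorm values in the range 0.0 to 1.0, with anything outside being clamped. In a pixel format the component is stored as an integer and read back as a float normalized to that range.
Graphics code often works with values in the -1.0 to 1.0 range, which is presumably why these types exist.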
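
As a small sketch of that normalization, the plain C functions below convert 8-bit components to floats using the commonly used definitions for normalized integer formats (unsigned: v / 255, signed: max(v / 127, -1)), plus the reverse direction for Unorm with clamping. The function names are made up for this example and are not part of Metal.

```c
#include <stdint.h>

/* Unorm: 0..255 maps linearly onto 0.0..1.0. */
double unorm8_to_float(uint8_t v) {
    return v / 255.0;
}

/* Snorm: -127..127 maps onto -1.0..1.0; -128 is clamped to -1.0 as well. */
double snorm8_to_float(int8_t v) {
    double f = v / 127.0;
    return f < -1.0 ? -1.0 : f;
}

/* Float to Unorm: clamp to 0.0..1.0, then scale and round to nearest. */
uint8_t float_to_unorm8(double f) {
    if (f < 0.0) f = 0.0;
    if (f > 1.0) f = 1.0;
    return (uint8_t)(f * 255.0 + 0.5);
}
```

This is why a texture created with MTLPixelFormatRGBA8Unorm from 8-bit image data shows up in the shader as float components between 0.0 and 1.0.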
