Getting the ARMeshClassification at a tapped location from ARMeshAnchor

Last updated at 2020-04-03 / Posted at 2020-03-29

ARKit 3.5 and the new LiDAR-equipped iPad Pro are out. Apple has published an official sample for trying them: "Visualizing and Interacting with a Reconstructed Scene".

The video in this tweet gives a good idea of what the sample looks like.

First look at the iPad Pro LiDAR Scanner pic.twitter.com/kwkl1YBy2n
— Tim Field (@nobbis) March 25, 2020

In this sample, the classification result for the tapped location is visualized as 3D text.

A classification indicates whether a piece of mesh is a wall, a floor, a table, and so on. It is defined by the ARMeshClassification enum.

public enum ARMeshClassification : Int {
   case none = 0
   case wall = 1
   case floor = 2
   case ceiling = 3
   case table = 4
   case seat = 5
   case window = 6
   case door = 7
}
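To show one of these values as text, a small mapping from enum case to label is enough. The following is a minimal sketch, not the sample's own code: the `label` property name is hypothetical, and `@unknown default` is there because the enum is imported from Objective-C and may gain cases later.

```swift
import ARKit

extension ARMeshClassification {
    /// Hypothetical helper (not part of ARKit): a human-readable label for each case.
    var label: String {
        switch self {
        case .none: return "None"
        case .wall: return "Wall"
        case .floor: return "Floor"
        case .ceiling: return "Ceiling"
        case .table: return "Table"
        case .seat: return "Seat"
        case .window: return "Window"
        case .door: return "Door"
        @unknown default: return "Unknown"
        }
    }
}
```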

In the official sample, tapping the screen (ARView) runs a raycast hit test (against .estimatedPlane) and displays the classification result as text:

if let result = arView.raycast(from: tapLocation, allowing: .estimatedPlane, alignment: .any).first {
   let resultAnchor = AnchorEntity(world: result.worldTransform)
   ...
}

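This part of the code is actually interesting. Enabling Scene Reconstruction and visualizing the mesh are extremely easy, so you might expect to read the classification result straight off a property of ARMeshAnchor and be done. There is no such property, however, and the sample ends up doing a surprisingly tedious amount of work.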
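For reference, the "easy" part might look roughly like this. It is a minimal sketch assuming a RealityKit ARView; the function name `startMeshClassification(on:)` is hypothetical, and the sample's actual setup may differ.

```swift
import ARKit
import RealityKit

/// Hypothetical helper: turns on scene reconstruction with classification
/// and the built-in mesh visualization. Returns false on unsupported devices.
@discardableResult
func startMeshClassification(on arView: ARView) -> Bool {
    // Scene reconstruction needs a LiDAR-equipped device.
    guard ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) else {
        return false
    }
    // We run our own configuration, so stop ARView from configuring the session itself.
    arView.automaticallyConfigureSession = false

    let configuration = ARWorldTrackingConfiguration()
    configuration.sceneReconstruction = .meshWithClassification

    // Draw the reconstructed mesh as a debug overlay.
    arView.debugOptions.insert(.showSceneUnderstanding)
    arView.session.run(configuration)
    return true
}
```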

Specifically, the sample reads it from the classification property of the ARMeshGeometry object held in ARMeshAnchor's geometry property, but that property is still not of type ARMeshClassification. It is an ARGeometrySource.

var classification: ARGeometrySource? { get }

ARGeometrySource is a somewhat cumbersome class: it holds its data in an MTLBuffer, that is, a Metal buffer.

open class ARGeometrySource : NSObject, NSSecureCoding {

   
   /**
    A Metal buffer containing per-vector data for the source.
    */
   open var buffer: MTLBuffer { get }

   
   ...
}
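Besides the buffer, an ARGeometrySource also exposes the layout information needed to interpret it (count, format, components per vector, offset, stride). As a rough sketch for inspecting that layout on a device, assuming an already running ARSession: `describe(_:)` and `logMeshSources(in:)` are hypothetical names introduced here for illustration.

```swift
import ARKit

/// Hypothetical helper: summarizes how an ARGeometrySource lays out its buffer.
func describe(_ source: ARGeometrySource) -> String {
    return "count: \(source.count), "
        + "format (raw value): \(source.format.rawValue), "
        + "componentsPerVector: \(source.componentsPerVector), "
        + "offset: \(source.offset), stride: \(source.stride), "
        + "buffer length: \(source.buffer.length) bytes"
}

/// Hypothetical helper: prints the layout of every mesh anchor in the current frame.
func logMeshSources(in session: ARSession) {
    guard let frame = session.currentFrame else { return }
    let meshAnchors = frame.anchors.compactMap { $0 as? ARMeshAnchor }
    for anchor in meshAnchors {
        print("vertices:", describe(anchor.geometry.vertices))
        // classification is optional; it is only present when classification is enabled.
        if let classification = anchor.geometry.classification {
            print("classification:", describe(classification))
        }
    }
}
```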

The sample digs into this buffer on the CPU (not the GPU) and initializes an ARMeshClassification from the raw value it extracts.

extension ARMeshGeometry {
    ...
    
    /// To get the mesh's classification, the sample app parses the classification's raw data and instantiates an
    /// `ARMeshClassification` object. For efficiency, ARKit stores classifications in a Metal buffer in `ARMeshGeometry`.
    func classificationOf(faceWithIndex index: Int) -> ARMeshClassification {
        guard let classification = classification else { return .none }
        assert(classification.format == MTLVertexFormat.uchar, "Expected one unsigned char (one byte) per classification")
        let classificationPointer = classification.buffer.contents().advanced(by: classification.offset + (classification.stride * index))
        let classificationValue = Int(classificationPointer.assumingMemoryBound(to: CUnsignedChar.self).pointee)
        return ARMeshClassification(rawValue: classificationValue) ?? .none
    }
}
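The pointer arithmetic can be tried without ARKit or Metal. The sketch below applies the same `offset + stride * index` addressing to a plain byte array. `SurfaceKind` is a stand-in enum invented here so the code runs anywhere, and the bytes, offset, and stride are made-up values, not what ARKit actually produces.

```swift
/// Stand-in for ARMeshClassification so the sketch runs without ARKit (hypothetical type).
enum SurfaceKind: Int {
    case none = 0, wall, floor, ceiling, table, seat, window, door
}

/// Reads element `index` from a raw buffer that stores one byte per element,
/// using the same `offset + stride * index` addressing as the sample.
func surfaceKind(at index: Int, in base: UnsafeRawPointer, offset: Int, stride: Int) -> SurfaceKind {
    let rawValue = base.advanced(by: offset + stride * index).load(as: UInt8.self)
    // Unknown raw values fall back to .none, as in the sample.
    return SurfaceKind(rawValue: Int(rawValue)) ?? SurfaceKind.none
}

// Made-up bytes: the data starts at offset 1 and elements are 2 bytes apart.
let bytes: [UInt8] = [0xFF, 1, 0, 2, 0, 4, 0]
let kinds = bytes.withUnsafeBytes { buffer -> [SurfaceKind] in
    guard let base = buffer.baseAddress else { return [] }
    return (0..<3).map { surfaceKind(at: $0, in: base, offset: 1, stride: 2) }
}
```

Here `kinds` holds the values read from byte positions 1, 3, and 5, which is the same thing the sample does with `classification.buffer.contents()`.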

Extracting the mesh faces near the tap location

The classificationOf(faceWithIndex:) method from the previous section takes a face index as its argument and returns the classification result for that face.

Of the many faces in a mesh, the sample picks out one that is near the tapped location and visualizes its classification result.

The implementation of "pick out a face near (within 5 cm of) the tapped location" looks like this.

for anchor in meshAnchors {
    for index in 0..<anchor.geometry.faces.count {
        // Get the center of the face so that we can compare it to the given location.
        let geometricCenterOfFace = anchor.geometry.centerOf(faceWithIndex: index)
        
        // Convert the face's center to world coordinates.
        var centerLocalTransform = matrix_identity_float4x4
        centerLocalTransform.columns.3 = SIMD4<Float>(geometricCenterOfFace.0, geometricCenterOfFace.1, geometricCenterOfFace.2, 1)
        let centerWorldPosition = (anchor.transform * centerLocalTransform).position
         
        // We're interested in a classification that is sufficiently close to the given location––within 5 cm.
        let distanceToFace = distance(centerWorldPosition, location)
        if distanceToFace <= 0.05 {
            ...
        }
    }
}
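The coordinate conversion and distance check in this loop can be isolated with simd alone. This is a minimal sketch under some assumptions: `worldPosition(ofLocal:anchorTransform:)` and `isNear(_:_:threshold:)` are hypothetical helpers, and the point is multiplied by the anchor transform directly rather than via an intermediate matrix, which should give the same position. Note that `position` on a 4x4 matrix is not built into simd, so the sample presumably defines it as an extension elsewhere.

```swift
import simd

/// Hypothetical helper: converts a point in an anchor's local space to world space.
func worldPosition(ofLocal point: SIMD3<Float>, anchorTransform: simd_float4x4) -> SIMD3<Float> {
    // w = 1 makes the translation part of the transform apply to the point.
    let world = anchorTransform * SIMD4<Float>(point, 1)
    return SIMD3<Float>(world.x, world.y, world.z)
}

/// Hypothetical helper: true when two world positions are within `threshold` meters.
func isNear(_ a: SIMD3<Float>, _ b: SIMD3<Float>, threshold: Float = 0.05) -> Bool {
    return simd_distance(a, b) <= threshold
}

// Made-up values: an anchor translated by (1, 2, 3) and a face center at (0.5, 0, 0) locally.
var anchorTransform = matrix_identity_float4x4
anchorTransform.columns.3 = SIMD4<Float>(1, 2, 3, 1)
let faceCenter = worldPosition(ofLocal: SIMD3<Float>(0.5, 0, 0), anchorTransform: anchorTransform)

// A tap location about 4 cm away from the face center.
let tapped = SIMD3<Float>(1.5, 2, 3.04)
let hit = isNear(faceCenter, tapped)
```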

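This calculation needs the vertices and faces properties of ARMeshGeometry. vertices is again an ARGeometrySource, and faces is an ARGeometryElement, which likewise keeps its data in a Metal buffer. To read the values out of those buffers, the sample implements the following methods as an extension of ARMeshGeometry.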

extension ARMeshGeometry {
    func vertex(at index: UInt32) -> (Float, Float, Float) {
        assert(vertices.format == MTLVertexFormat.float3, "Expected three floats (twelve bytes) per vertex.")
        let vertexPointer = vertices.buffer.contents().advanced(by: vertices.offset + (vertices.stride * Int(index)))
        let vertex = vertexPointer.assumingMemoryBound(to: (Float, Float, Float).self).pointee
        return vertex
    }

    ...
    
    func vertexIndicesOf(faceWithIndex faceIndex: Int) -> [UInt32] {
        assert(faces.bytesPerIndex == MemoryLayout<UInt32>.size, "Expected one UInt32 (four bytes) per vertex index")
        let vertexCountPerFace = faces.indexCountPerPrimitive
        let vertexIndicesPointer = faces.buffer.contents()
        var vertexIndices = [UInt32]()
        vertexIndices.reserveCapacity(vertexCountPerFace)
        for vertexOffset in 0..<vertexCountPerFace {
            let vertexIndexPointer = vertexIndicesPointer.advanced(by: (faceIndex * vertexCountPerFace + vertexOffset) * MemoryLayout<UInt32>.size)
            vertexIndices.append(vertexIndexPointer.assumingMemoryBound(to: UInt32.self).pointee)
        }
        return vertexIndices
    }
    
    func verticesOf(faceWithIndex index: Int) -> [(Float, Float, Float)] {
        let vertexIndices = vertexIndicesOf(faceWithIndex: index)
        let vertices = vertexIndices.map { vertex(at: $0) }
        return vertices
    }
    
    func centerOf(faceWithIndex index: Int) -> (Float, Float, Float) {
        let vertices = verticesOf(faceWithIndex: index)
        let sum = vertices.reduce((0, 0, 0)) { ($0.0 + $1.0, $0.1 + $1.1, $0.2 + $1.2) }
        let geometricCenter = (sum.0 / 3, sum.1 / 3, sum.2 / 3)
        return geometricCenter
    }
}
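The index-buffer lookup behind these methods can be illustrated with plain arrays instead of Metal buffers. The following is a sketch, not the sample's code: `centroid(ofFace:positions:indices:indicesPerFace:)` is a hypothetical function, the data is made up, and it divides by the number of indices per face rather than a hard-coded 3.

```swift
/// Hypothetical helper: averages the vertices of one face.
/// `indices` is a flat list in which face `i` owns the entries
/// `i * indicesPerFace ..< (i + 1) * indicesPerFace`, like the faces buffer above.
func centroid(ofFace faceIndex: Int,
              positions: [SIMD3<Float>],
              indices: [UInt32],
              indicesPerFace: Int = 3) -> SIMD3<Float> {
    let start = faceIndex * indicesPerFace
    let faceIndices = indices[start ..< start + indicesPerFace]
    // Look up each vertex through its index and accumulate the positions.
    let sum = faceIndices.reduce(SIMD3<Float>(repeating: 0)) { $0 + positions[Int($1)] }
    return sum / Float(indicesPerFace)
}

// Made-up mesh: a 3 x 3 square in the z = 0 plane, split into two triangles.
let positions: [SIMD3<Float>] = [
    SIMD3(0, 0, 0), SIMD3(3, 0, 0), SIMD3(0, 3, 0), SIMD3(3, 3, 0)
]
let indices: [UInt32] = [0, 1, 2, 1, 3, 2]

let firstCenter = centroid(ofFace: 0, positions: positions, indices: indices)
let secondCenter = centroid(ofFace: 1, positions: positions, indices: indices)
```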

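Many iOS developers are probably not used to working with Metal or 3D data, so this sample's implementation of reading the data stored in ARMeshGeometry's MTLBuffer objects from the CPU is a very valuable reference.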
