This article shows how to draw object detection results quickly.

Framework	Drawing time (seconds)
UIImage	0.5
CGImage	0.04

Drawing becomes a bigger bottleneck than detection

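Object detection models such as YOLO that have been converted for mobile run in roughly 0.02 seconds on an iPhone 11. Drawing the results onto a UIImage with the approach from one of my earlier articles, however, takes about 0.5 seconds.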

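In other words, drawing took 25 times as long as the detection itself.
Drawing the label text is the worst offender and accounts for about 80% of the drawing time.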

This is fatal when running object detection over a large number of frames, as with video.
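To make the cost concrete, here is a rough back-of-the-envelope calculation based on the timings quoted in this article (0.02 s for detection, 0.5 s for UIImage drawing, 0.04 s for CGImage drawing). The helper name `secondsPerFrame` and the one-minute, 30 fps clip are illustrative assumptions, not measurements from the article.

```swift
import Foundation

/// Time spent on one frame when detection and drawing run back to back.
/// (Hypothetical helper for illustration only.)
func secondsPerFrame(detection: Double, drawing: Double) -> Double {
    return detection + drawing
}

let detectionTime = 0.02  // seconds per frame, as quoted in the article
let uiImageDraw = 0.5     // seconds per frame, UIImage drawing
let cgImageDraw = 0.04    // seconds per frame, CGImage drawing

let uiImageFrame = secondsPerFrame(detection: detectionTime, drawing: uiImageDraw)
let cgImageFrame = secondsPerFrame(detection: detectionTime, drawing: cgImageDraw)

// Assumed example clip: 60 seconds at 30 fps = 1800 frames.
let frameCount = 30.0 * 60.0
let uiImageTotal = frameCount * uiImageFrame
let cgImageTotal = frameCount * cgImageFrame

print(String(format: "UIImage: %.1f fps, %.0f s for the clip", 1.0 / uiImageFrame, uiImageTotal))
print(String(format: "CGImage: %.1f fps, %.0f s for the clip", 1.0 / cgImageFrame, cgImageTotal))
```

Under these assumptions the clip takes about 936 seconds with UIImage drawing and about 108 seconds with CGImage drawing.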

Drawing with CGImage is 10x faster

Drawing into a CGImage through a CGContext takes 0.04 seconds in the same environment,
which is more than 10 times faster than drawing onto a UIImage.
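The article does not say how the timings were taken. One simple way to get comparable numbers on your own device is to wrap the drawing call in a small timing helper. The sketch below is an assumption, not the author's measurement code: `measure` is a hypothetical name, and the summing closure is only a stand-in for the real drawing work.

```swift
import Foundation

/// Runs `work` once and returns its result together with the elapsed seconds.
/// Uses the monotonic uptime clock so wall-clock adjustments do not affect it.
/// (Hypothetical helper; not part of the original article.)
func measure<T>(_ work: () -> T) -> (result: T, seconds: Double) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = work()
    let end = DispatchTime.now().uptimeNanoseconds
    return (result, Double(end - start) / 1_000_000_000)
}

// Stand-in workload; replace the closure body with your own drawing call.
let (sum, elapsed) = measure {
    (1...100_000).reduce(0, +)
}
print("result: \(sum), elapsed: \(elapsed) s")
```

A single run can be noisy, so averaging over many frames is likely to give a more reliable comparison between the UIImage and CGImage paths.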

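import UIKit
import CoreImage
import CoreText

// `ciContext` (a shared CIContext) and the `Detection` type are assumed to be defined elsewhere in the project.
func drawRectOnImage(_ detections: [Detection], _ image: CIImage) -> CIImage? {
    // Render the CIImage to a CGImage; fall back to the input image if rendering fails.
    guard let cgImage = ciContext.createCGImage(image, from: image.extent) else { return image }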
    let size = image.extent.size
    guard let cgContext = CGContext(data: nil,
                                    width: Int(size.width),
                                    height: Int(size.height),
                                    bitsPerComponent: 8,
                                    bytesPerRow: 4 * Int(size.width),
                                    space: CGColorSpaceCreateDeviceRGB(),
                                    bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue) else { return image }
    cgContext.draw(cgImage, in: CGRect(origin: .zero, size: size))
    for detection in detections {
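        // Core Graphics uses a bottom-left origin, so flip the box vertically (the box is assumed to use a top-left origin).
        let invertedBox = CGRect(x: detection.box.minX, y: size.height - detection.box.maxY, width: detection.box.width, height: detection.box.height)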
        if let labelText = detection.label {
            cgContext.textMatrix = .identity
                
            let text = "\(labelText) : \(round(detection.confidence*100))"
                
            let textRect  = CGRect(x: invertedBox.minX + size.width * 0.01, y: invertedBox.minY - size.width * 0.01, width: invertedBox.width, height: invertedBox.height)
            let textStyle = NSMutableParagraphStyle()
                
            let textFontAttributes = [
                NSAttributedString.Key.font: UIFont.systemFont(ofSize: textRect.width * 0.1, weight: .bold),
                NSAttributedString.Key.foregroundColor: detection.color,
                NSAttributedString.Key.paragraphStyle: textStyle
            ]
                
            cgContext.saveGState()
            defer { cgContext.restoreGState() }
            let astr = NSAttributedString(string: text, attributes: textFontAttributes)
            let setter = CTFramesetterCreateWithAttributedString(astr)
            let path = CGPath(rect: textRect, transform: nil)
                
            let frame = CTFramesetterCreateFrame(setter, CFRange(), path, nil)
            cgContext.textMatrix = CGAffineTransform.identity
            CTFrameDraw(frame, cgContext)
                
        }
        // Stroke the box for every detection, including those without a label.
        cgContext.setStrokeColor(detection.color.cgColor)
        cgContext.setLineWidth(9)
        cgContext.stroke(invertedBox)
    }
    guard let newImage = cgContext.makeImage() else { return image }
    return CIImage(cgImage: newImage)
}
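The function above relies on a `Detection` type and a shared `ciContext` that the listing does not show. The following is a minimal sketch of what they might look like. The field names are taken from how `drawRectOnImage` reads them, but the struct itself, the field types, and the sample values are assumptions for illustration.

```swift
import UIKit
import CoreImage

/// Hypothetical detection result; field names match what drawRectOnImage reads.
struct Detection {
    let box: CGRect        // pixel coordinates, top-left origin (assumed)
    let label: String?     // nil when there is no class name to show
    let confidence: Float  // assumed range 0.0 ... 1.0
    let color: UIColor     // used for both the box outline and the label text
}

// CIContext objects are heavyweight, so create one and reuse it for every frame.
let ciContext = CIContext()

// A blank 640x480 test image and one made-up detection to draw on it.
let input = CIImage(color: .black)
    .cropped(to: CGRect(x: 0, y: 0, width: 640, height: 480))
let detections = [
    Detection(box: CGRect(x: 100, y: 80, width: 200, height: 150),
              label: "dog", confidence: 0.91, color: .green)
]
```

With these definitions in place, calling `drawRectOnImage(detections, input)` should return a new CIImage with the box and label drawn on top of the input.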

Sample code for YOLOv5 detection and drawing

🐣

I'm a freelance engineer.
For work inquiries, contact me here:
rockyshikoku@gmail.com

I build apps using Core ML and ARKit.
I share information about machine learning and AR.

Twitter
Medium
GitHub
