
Comparing how rectangle data is handled in OpenCV and other libraries

Last updated at 2020-05-21. Posted at 2018-01-01.

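When you use several image-processing libraries together, you need to be careful, because each library represents rectangles differently.

Let's sort out the argument and return types of each library's functions.
Using variable names that cannot be misread makes the code easier to read.

To that end, this article organizes the conventions used for rectangle data.
Python is my everyday language, so most of the description covers the conventions seen from Python.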

type1: formats that correspond to (x, y, w, h)

Example: cv::Rect(x,y,w,h)
https://docs.opencv.org/3.4.0/dc/d84/group__core__basic.html#ga11d95de507098e90bad732b9345402e8
cv::Rect has four data members:
.x
.y
.width
.height
Depending on the type of the data members, the following variants exist:
cv::Rect2i (int)
cv::Rect2d (double)
cv::Rect2f (float)

A sub-image (ROI) is set as follows:
cv::Rect roi(x, y, w, h);
cv::Mat dst_img = img.clone();
cv::Mat s_roi = img(roi);

Python has no cv2.Rect. A sequence of [x, y, w, h] is used instead.

The return value of
cv2.CascadeClassifier.detectMultiScale()
is a list of [x, y, w, h] rectangles.
https://docs.opencv.org/3.0-beta/modules/objdetect/doc/cascade_classification.html#cascadeclassifier-detectmultiscale

x,y,w,h = cv2.boundingRect(cnt)

The drawing function, however, takes two corner points, as follows:
cv2.rectangle(img,(left,top),(right,bottom),(0,255,0),3)

rect = matplotlib.patches.Rectangle((d.left(), d.top()), d.width(), d.height())
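
Because detection results arrive as (x, y, w, h) while cv2.rectangle wants two corners, a small conversion helper is handy. The following is a minimal sketch; the function name xywh_to_corners is hypothetical and not part of OpenCV.

```python
def xywh_to_corners(x, y, w, h):
    """Convert (x, y, w, h) into ((left, top), (right, bottom))."""
    left, top = x, y
    right, bottom = x + w, y + h
    return (left, top), (right, bottom)


# Example values only; any (x, y, w, h) rectangle works the same way.
pt1, pt2 = xywh_to_corners(287, 23, 86, 320)
print(pt1, pt2)

# pt1 and pt2 can then be passed to a two-corner API, for example:
# cv2.rectangle(img, pt1, pt2, (0, 255, 0), 3)
```

Naming the intermediate values left, top, right, and bottom keeps the two conventions from being mixed up at the call site.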

In OpenCV, even when the term "bounding box" is used, it means [x, y, w, h].

When a rectangle is passed to a tracker class, it also means [x, y, w, h].

bbox = (287, 23, 86, 320)
tracker = cv2.TrackerKCF_create()
tracker.init(frame, bbox)

bool TrackerMIL::initImpl( const Mat& image, const Rect2d& boundingBox )

When setting a sub-image with
img[ytop:ybottom, xleft:xright,:]
the indices must be integers.
A value such as 1.0 is a floating-point number, so it cannot be used to specify the range.
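
The integer requirement can be handled in one place by converting the rectangle to slices before indexing. This is a minimal sketch; the helper name xywh_to_slices is an assumption made for illustration, and rounding to the nearest integer is only one possible policy.

```python
def xywh_to_slices(x, y, w, h):
    """Return (row_slice, col_slice) with integer bounds for img[rows, cols]."""
    # Round each value to an integer before building the slices.
    x, y, w, h = (int(round(v)) for v in (x, y, w, h))
    return slice(y, y + h), slice(x, x + w)


rows, cols = xywh_to_slices(10.0, 20.0, 30.0, 40.0)
# With a NumPy image this would be used as: roi = img[rows, cols, :]

# Plain Python lists also reject float slice bounds:
data = list(range(100))
try:
    data[0:1.0]
    float_slice_ok = True
except TypeError:
    float_slice_ok = False
print(rows, cols, float_slice_ok)
```

Note that rows come from y and h, and columns from x and w, which is the reverse of the (x, y) argument order.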

type2: (left, top, right, bottom)

PIL.Image.crop(box=None)
box – The crop rectangle, as a (left, upper, right, lower)-tuple.

from PIL import Image
im = Image.open('data/src/lena.jpg')
im_crop = im.crop((left, top, right, bottom))

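There are several ways to think about cropping an image.

One approach keeps the cropped image at the requested size.
- This is preferable when normalizing face images, and it suits producing images normalized by eye position. There are several ways to handle the area that lies outside the original image.
Another approach reinterprets the request as the range that actually exists in the original image.
- This guarantees that no pixel outside the original image is ever accessed.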
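
The second approach, restricting the box to the area that exists in the original image, can be sketched as follows. The function name clamp_box and its parameters are hypothetical; the box uses the (left, top, right, bottom) convention of this section.

```python
def clamp_box(left, top, right, bottom, image_width, image_height):
    """Clamp a (left, top, right, bottom) box to the image bounds."""
    left = max(0, min(left, image_width))
    top = max(0, min(top, image_height))
    # Keep right >= left and bottom >= top so the box never inverts.
    right = max(left, min(right, image_width))
    bottom = max(top, min(bottom, image_height))
    return left, top, right, bottom


# A box that sticks out on the left, right, and bottom of a 100x80 image.
box = clamp_box(-10, 5, 120, 90, image_width=100, image_height=80)
print(box)
```

The clamped box can be smaller than the requested one, so this approach does not preserve the output size. A box that lies completely outside the image collapses to an empty box.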

type3: the return value of dlib's object detectors

d.left()
d.top()
d.right()
d.bottom()

These can be converted to [x, y, w, h] with an expression like this:
[[d.left(), d.top(), d.right()-d.left(), d.bottom()-d.top()] for d in dets]

http://dlib.net/python/index.html#dlib.rectangle
http://dlib.net/python/index.html#dlib.rectangles
http://dlib.net/python/index.html#dlib.drectangle
http://dlib.net/python/index.html#dlib.full_object_detection
http://dlib.net/python/index.html#dlib.full_object_detections

A dlib rectangle object is created as follows (Python 3 has no long, so int is used):

dlib.rectangle(int(left), int(top), int(right), int(bottom))

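The dlib.rectangle format is required for dlib's face-position normalization and for deriving facial landmarks. Once face images are normalized, face matching also becomes easier to use.

When facial landmarks are computed from a face position obtained by some other method, such as the result of OpenCV face detection or of face tracking, the rectangle in the other format is converted to dlib.rectangle.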
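
Such a conversion can be sketched without depending on dlib by going through a named tuple first. RectLTRB, xywh_to_ltrb, and ltrb_to_xywh are hypothetical names introduced for illustration. The width is computed as right - left, following the expression used earlier in this article; dlib's own width() and height() may count edge pixels differently, so check the dlib documentation if exact sizes matter.

```python
from typing import NamedTuple


class RectLTRB(NamedTuple):
    """Rectangle in (left, top, right, bottom) form with integer fields."""
    left: int
    top: int
    right: int
    bottom: int


def xywh_to_ltrb(x, y, w, h):
    """Convert an (x, y, w, h) rectangle to RectLTRB with integer values."""
    return RectLTRB(int(x), int(y), int(x + w), int(y + h))


def ltrb_to_xywh(rect):
    """Convert a RectLTRB back to (x, y, w, h), using w = right - left."""
    return (rect.left, rect.top, rect.right - rect.left, rect.bottom - rect.top)


# Example detections in (x, y, w, h) form, e.g. from an OpenCV detector.
faces_xywh = [(287, 23, 86, 320)]
rects = [xywh_to_ltrb(*face) for face in faces_xywh]
print(rects)

# Each result can then be handed to dlib, for example:
# dlib.rectangle(r.left, r.top, r.right, r.bottom)
```

Named fields make it hard to pass a width where a right coordinate is expected.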

type4: ((left, top), (right, bottom))

Drawing functions
These take the top-left coordinate and the bottom-right coordinate.

rectangle(...)
    rectangle(img, pt1, pt2, color[, thickness[, lineType[, shift]]]) -> img
    .   @brief Draws a simple, thick, or filled up-right rectangle.
    .   
    .   The function rectangle draws a rectangle outline or a filled rectangle whose two opposite corners
    .   are pt1 and pt2.
    .   
    .   @param img Image.
    .   @param pt1 Vertex of the rectangle.
    .   @param pt2 Vertex of the rectangle opposite to pt1 .
    .   @param color Rectangle color or brightness (grayscale image).
    .   @param thickness Thickness of lines that make up the rectangle. Negative values, like CV_FILLED ,
    .   mean that the function has to draw a filled rectangle.
    .   @param lineType Type of the line. See the line description.
    .   @param shift Number of fractional bits in the point coordinates.

Note that the w and h returned by deep-learning face detectors are not necessarily equal.

PIL.ImageDraw.Draw.rectangle(xy, fill=None, outline=None)
xy – Four points to define the bounding box. Sequence of either [(x0, y0), (x1, y1)] or [x0, y0, x1, y1]. The second point is just outside the drawn rectangle.

cv::RotatedRect Class Reference

RotatedRect (const Point2f &center, const Size2f &size, float angle)

This format differs considerably from the other ways of specifying a rectangle in that it specifies the center.

type5: ((center_x, center_y), (width, height))

Regions in YOLO-format annotation files are specified with these four values.
They are given not in pixels but as ratios of the width and height of the whole image.
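
Converting such a normalized, center-based box to a pixel (x, y, w, h) rectangle can be sketched as follows. The function name yolo_to_xywh and the sample values are assumptions made for illustration, and the sketch assumes the four values have already been parsed from the annotation file.

```python
def yolo_to_xywh(cx, cy, w, h, image_width, image_height):
    """Convert normalized (center_x, center_y, width, height) to pixel (x, y, w, h)."""
    box_w = w * image_width
    box_h = h * image_height
    # Move from the center to the top-left corner.
    x = cx * image_width - box_w / 2
    y = cy * image_height - box_h / 2
    return x, y, box_w, box_h


# A box centered in a 640x480 image, a quarter as wide and half as tall.
x, y, w, h = yolo_to_xywh(0.5, 0.5, 0.25, 0.5, image_width=640, image_height=480)
print(x, y, w, h)
```

The results are floats, so they need to be converted to integers before being used for array slicing or passed to APIs that expect integer pixel coordinates.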
