Deep Learning Specialization (Coursera) Self-Study Notes (C4W3)

Posted at 2020-07-05

Introduction

These are my notes on Course 4, Week 3 (C4W3) of the Deep Learning Specialization.

(C4W3L01) Object localization

Contents

  • What are localization and detection?
    • classification ; what is it?
    • localization ; where is it? (1 object)
    • detection ; where are they? (multiple objects)
  • Output of localization
    • classification output (what is it?)
    • bounding box ($b_x, b_y, b_h, b_w$)
  • Output of localization in more detail
    • $p_c$ ; is there any object? (yes; 1, no; 0)
    • $b_x$
    • $b_y$
    • $b_h$
    • $b_w$
    • $c_1$ ; is it a pedestrian?
    • $c_2$ ; is it a car?
    • $c_3$ ; is it a motorcycle?
  • When $p_c = 0$ (no object), the remaining parameters are undefined ("don't care")
  • Loss function ; $L(\hat{y}, y)$
    • $\sum_{i=1}^{8} (\hat{y}_i - y_i)^2$ (if $y_1 = 1$)
    • $(\hat{y}_1 - y_1)^2$ (if $y_1 = 0$)
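
The two cases of the loss above can be sketched in a few lines of plain Python. This is only an illustration of the squared-error form written in these notes; the function name `localization_loss` and the sample vectors are made up for the example, and the vector layout is assumed to be $[p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]$.

```python
def localization_loss(y_hat, y):
    """Squared-error loss for one example.

    Assumed layout: [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3].
    """
    if len(y_hat) != len(y):
        raise ValueError("y_hat and y must have the same length")
    if y[0] == 1:
        # An object is present: all 8 components contribute.
        return sum((yh - yt) ** 2 for yh, yt in zip(y_hat, y))
    # No object: only p_c is penalized, the other components are "don't care".
    return (y_hat[0] - y[0]) ** 2


# Hypothetical label vectors and one hypothetical prediction.
y_car = [1, 0.5, 0.7, 0.3, 0.4, 0, 1, 0]
y_none = [0, 0, 0, 0, 0, 0, 0, 0]
pred = [0.9, 0.5, 0.6, 0.3, 0.4, 0.1, 0.8, 0.0]

print(localization_loss(pred, y_car))   # all components are compared
print(localization_loss(pred, y_none))  # only the first component is compared
```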

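(C4W3L02) Landmark detection

Contents

  • Detects landmarks such as the corners of the eyes and the outline of the face (for a face), or the positions of the arms and legs (for a body)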

(C4W3L03) Object detection

Contents

  • Explanation of sliding windows detection
  • Sliding the window over every position naively makes the computation cost large
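
To get a feel for that cost, here is a rough sketch that counts how many crops a classifier would have to process. The image size, window sizes, and stride are hypothetical numbers chosen for the example, not values from the lecture.

```python
def sliding_windows(image_h, image_w, window, stride):
    """Yield the (top, left) corner of every square window that fits in the image."""
    for top in range(0, image_h - window + 1, stride):
        for left in range(0, image_w - window + 1, stride):
            yield top, left


def count_crops(image_h, image_w, window_sizes, stride):
    """Number of crops the classifier must process, summed over all window sizes."""
    return sum(
        sum(1 for _ in sliding_windows(image_h, image_w, w, stride))
        for w in window_sizes
    )


# Hypothetical setting: a 608 x 608 image, three window sizes, stride 8.
print(count_crops(608, 608, [64, 128, 256], 8))
```

Each counted crop means one separate forward pass of the classifier, which is why the naive approach gets expensive quickly.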

(C4W3L04) Convolutional implementation of sliding window

Contents

  • How to implement sliding windows detection inside a convolutional neural network
  • The s (stride) of the MAX POOL layer along the way corresponds to the step by which the window slides (I think)
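
The stride observation can be checked with simple shape arithmetic. The layer stack below is an assumption for illustration (a 5 x 5 conv, a 2 x 2 max pool with stride 2, a 5 x 5 conv replacing the first fully connected layer, then two 1 x 1 convs, all without padding, trained on 14 x 14 inputs); it is not taken from these notes.

```python
def output_size(input_size, layers):
    """Spatial size after a stack of unpadded conv/pool layers given as (kernel, stride)."""
    size = input_size
    for kernel, stride in layers:
        size = (size - kernel) // stride + 1
    return size


def effective_stride(layers):
    """Step in input pixels between neighbouring output cells (product of strides)."""
    total = 1
    for _, stride in layers:
        total *= stride
    return total


# Assumed stack: conv 5x5, max pool 2x2 (s=2), conv 5x5 (was FC), conv 1x1, conv 1x1.
LAYERS = [(5, 1), (2, 2), (5, 1), (1, 1), (1, 1)]

for size in (14, 16, 28):
    n = output_size(size, LAYERS)
    print(f"{size} x {size} input -> {n} x {n} grid of window predictions")
print("window step in input pixels:", effective_stride(LAYERS))
```

Under these assumptions a 16 x 16 input yields a 2 x 2 output, the same number of positions as sliding a 14 x 14 window with step 2. The only layer with a stride other than 1 is the max pool, which is consistent with the note above.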

(C4W3L05) No video

  • The video was not available on YouTube

(C4W3L06) Intersection over union

Contents

  • When two bounding boxes overlap, IoU is the area of their intersection divided by the area of their union
  • By convention, a bounding box is judged correct if IoU $\ge$ 0.5
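
A minimal sketch of the IoU computation follows. It assumes boxes are given by their corners as `(x1, y1, x2, y2)` with `x1 < x2` and `y1 < y2`; that is a convenience chosen for this example and differs from the $(b_x, b_y, b_h, b_w)$ parameterization used earlier in these notes.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at 0 when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
score = iou(predicted, ground_truth)
print(score, "correct" if score >= 0.5 else "not correct")
```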

(C4W3L07) Non-max suppression

Contents

  • An algorithm that ensures each object is detected only once
  • discard all boxes with $p_c \le 0.6$
  • while there are any remaining boxes:
    • pick the box with the largest $p_c$, output that as a prediction
    • discard any remaining box with IoU $\ge 0.5$ with the box output in the previous step
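
The steps above can be sketched as follows for a single class. The representation of a detection as `(p_c, (x1, y1, x2, y2))` and the sample boxes are assumptions made for this example; the thresholds 0.6 and 0.5 are the ones listed in the notes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, score_threshold=0.6, iou_threshold=0.5):
    """detections: list of (p_c, box). Returns the detections kept as predictions."""
    # Discard all boxes with p_c <= score_threshold.
    remaining = [d for d in detections if d[0] > score_threshold]
    remaining.sort(key=lambda d: d[0], reverse=True)

    kept = []
    while remaining:
        # Pick the box with the largest p_c and output it as a prediction.
        best = remaining.pop(0)
        kept.append(best)
        # Discard remaining boxes that overlap too much with the box just output.
        remaining = [d for d in remaining if iou(best[1], d[1]) < iou_threshold]
    return kept


detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # overlaps heavily with the first box
    (0.7, (20, 20, 30, 30)),
    (0.5, (21, 21, 31, 31)),  # below the score threshold
]
print(non_max_suppression(detections))
```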

(C4W3L08) Anchor boxes

Contents

  • To detect multiple objects in a single grid cell, define two anchor boxes
  • Previously:
    • Each object in the training image is assigned to the grid cell that contains that object's mid-point
  • With two anchor boxes:
    • Each object in the training image is assigned to the grid cell that contains the object's mid-point and to the anchor box for that grid cell with the highest IoU
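
A rough sketch of this assignment rule follows. It makes several assumptions that are not in the notes: box coordinates are relative to the whole image in $[0, 1)$, anchors are `(height, width)` pairs, and the IoU between an object and an anchor is computed on shapes only, as if both were centered at the same point. All names here are hypothetical.

```python
def shape_iou(hw_a, hw_b):
    """IoU of two boxes that share the same center, given as (height, width)."""
    inter = min(hw_a[0], hw_b[0]) * min(hw_a[1], hw_b[1])
    union = hw_a[0] * hw_a[1] + hw_b[0] * hw_b[1] - inter
    return inter / union


def assign(obj, anchors, grid_size):
    """Return (row, col, anchor_index) for an object (b_x, b_y, b_h, b_w)."""
    b_x, b_y, b_h, b_w = obj
    # Grid cell that contains the object's mid-point.
    row = min(int(b_y * grid_size), grid_size - 1)
    col = min(int(b_x * grid_size), grid_size - 1)
    # Anchor box whose shape has the highest IoU with the object's shape.
    best = max(range(len(anchors)), key=lambda k: shape_iou((b_h, b_w), anchors[k]))
    return row, col, best


# Hypothetical anchors: index 0 is tall and narrow, index 1 is short and wide.
anchors = [(0.6, 0.2), (0.2, 0.6)]
pedestrian = (0.50, 0.50, 0.50, 0.15)
car = (0.52, 0.55, 0.20, 0.50)

print(assign(pedestrian, anchors, grid_size=3))
print(assign(car, anchors, grid_size=3))
```

In this example both objects fall into the same grid cell but are matched to different anchor boxes, which is the situation anchor boxes are meant to handle.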

(C4W3L09) Putting it together: YOLO algorithm

Contents

  • $y$ ; 3 x 3 x 2 (#anchor) x 8 (5 + #classes)
    • A 3 x 3 grid is assumed here, but in practice something like 19 x 19 is common
    • 5 + #classes ; $p_c$, $b_x$, $b_y$, $b_h$, $b_w$, $c_1$, $c_2$, $c_3$
  • Outputting the non-max suppression output
    • For each grid cell, get 2 predicted bounding boxes
    • Get rid of low probability predictions
    • For each class (pedestrian, car, motorcycle), use non-max suppression to generate final predictions
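
The layout of $y$ described above can be sketched with nested lists. The helper names, the chosen cell and anchor indices, and the box values are hypothetical; how an object is matched to a cell and an anchor is taken as given here.

```python
GRID, ANCHORS, CLASSES = 3, 2, 3  # 3 x 3 grid, 2 anchor boxes, 3 classes


def empty_target():
    """y of shape GRID x GRID x ANCHORS x (5 + CLASSES), filled with zeros (p_c = 0)."""
    return [
        [[[0.0] * (5 + CLASSES) for _ in range(ANCHORS)] for _ in range(GRID)]
        for _ in range(GRID)
    ]


def set_object(y, row, col, anchor, box, class_index):
    """Write [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3] into one (cell, anchor) slot."""
    one_hot = [1.0 if k == class_index else 0.0 for k in range(CLASSES)]
    y[row][col][anchor] = [1.0, *box, *one_hot]


y = empty_target()
# Hypothetical objects: a pedestrian (class 0) and a car (class 1) in the center cell.
set_object(y, row=1, col=1, anchor=0, box=(0.5, 0.5, 0.5, 0.15), class_index=0)
set_object(y, row=1, col=1, anchor=1, box=(0.52, 0.55, 0.2, 0.5), class_index=1)

print(len(y), len(y[0]), len(y[0][0]), len(y[0][0][0]))  # dimensions of y
print(y[1][1])  # the two anchor slots of the center cell
```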

(C4W3L10) Region Proposal (Optional)

Contents

  • Region Proposal : R-CNN
    • Do not run the sliding window over every position
    • Segment the image and run the classifier only on regions that look like they contain something
    • Segmentation algorithm ($\sim$ 2000 regions)
  • R-CNN ; Propose regions. Classify proposed regions one at a time. Output label + bounding box.
  • Fast R-CNN ; Propose regions. Use a convolutional implementation of sliding windows to classify all the proposed regions.
  • Faster R-CNN ; Use convolutional network to propose regions.
