
[Object Detection] A Look at Fast R-CNN

by kaizen_bh 2025. 12. 1.

 

 

 

 

 

 

 

1. Problem definition of the object detection task - the history of Object Detection



 

 

 

https://bh-kaizen.tistory.com/73

 

[Object Detection] A Look at R-CNN (+SPPNet)

 

 

Previously I looked at R-CNN and SPPNet.

Next up are Fast R-CNN and Faster R-CNN. Fast R-CNN is closer to a variant of SPPNet, a somewhat improved version of it, while Faster R-CNN is an end-to-end trainable detection model that fixes most of R-CNN's problems.

The overall flow of the R-CNN family is well summarized in the post below.

 

 

 

As before, rather than starting with the paper or the code, I first went through well-organized blog posts to get a sense of the overall structure, the important points, and the key concepts.

 

 

 

Of course.. even with those summaries it did not click on the first read.

So I handed the paper to Gemini and used its guided learning feature so that it would ask me questions about the main points in order. I answered them, built up my understanding bit by bit, then went back to the blog posts to check whether I had really understood.

Example of using Gemini guided learning.

 

 


 

 

 

 

 

 

Fast R-CNN

 

 

Limitations of earlier models (R-CNN, SPPNet)

 

 

On its first page, the paper starts by pointing out three problems with R-CNN and SPPNet, as shown above.

  • Training is a multi-stage pipeline in which each stage is trained separately (= no end-to-end training)
  • Training each of those stages costs a lot of resources (time and disk space)
  • Inference is slow

SPPNet fixed several of these problems with its SPP layer, but limitations remain.

  • Like R-CNN, it is a multi-stage pipeline: feature extraction, fine-tuning, SVM training, then bounding-box regressor training
  • Features still have to be written to disk.
  • The critical limitation: the proposed fine-tuning algorithm cannot update the convolutional layers that come before the SPP (Spatial Pyramid Pooling) layer.
  • These fixed convolutional layers limit the accuracy of very deep networks.

 

What R-CNN and SPPNet have in common is that training is disconnected between the feature extractor (CNN) and the classification/regression models.

An ordinary single model learns by adjusting its weights from the error between prediction and ground truth. Here, though, the errors produced by the regression model that predicts bounding boxes and by the classification model that predicts classes cannot be used to improve the CNN's features.

In other words, in this structure the CNN's weights are never updated to fit the classifier's optimization.

Only if image features are extracted well can the RoI vectors projected onto the CNN feature map be classified more accurately and have their coordinates adjusted well.

Putting this together, a side-by-side comparison of SPPNet and Fast R-CNN looks like this:

 

 

https://velog.io/@woojinn8/Object-Detection-2.-SPP-Net-FastFaster-R-CNN

  • In Fast R-CNN the region proposal step is still not integrated, but everything else has been merged into a single network
  • The two core ideas to look at are RoI Pooling, a variant of SPP, and the multi-task loss, which makes it possible to train classification and regression at the same time
  • Beyond those, I also briefly looked at the effect and performance gains of RoI Pooling and at how the FC layers are optimized

 

 

Below is the Fast R-CNN architecture presented in the paper.

  • Features pass through the RoI Pooling layer and are then fed to separate classification and regression layers. So what is RoI Pooling?

 

 

RoI Pooling

 

The RoI Pooling layer takes the candidates found by Selective Search, projects them onto the feature map that the CNN produces from the original image (RoI projection), and uses max pooling to convert each resulting RoI into a small feature map of fixed size H x W (e.g. 7 x 7).

Spatial pyramid pooling in SPPNet extracted feature maps at several bin sizes (1x1, 2x2, 4x4), whereas RoI Pooling uses a single fixed size.

 

 

How it works

  • Fixed-size conversion: RoI Pooling divides a variable-sized RoI region (h x w) on the input feature map into a grid of the fixed H x W size the network requires
  • Max pooling: max pooling is applied inside each cell (sub-window) of this H x W grid. That is, only the largest value in each cell is kept, producing the final H x W feature map
  • Result: regardless of the RoI size (h, w), this always yields a feature vector of the same length, which can then be fed into the FC (fully connected) layers
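The steps above can be sketched in a few lines of plain Python. This is only a minimal single-channel sketch, not the paper's implementation: the function name `roi_max_pool`, the `(top, left, height, width)` RoI layout, and the floor/ceil rounding of the cell boundaries are all assumptions made for illustration.

```python
import math

def roi_max_pool(feature_map, roi, out_h=7, out_w=7):
    """Max-pool one RoI of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows (single channel, for brevity)
    roi: (top, left, height, width) in feature-map cells (hypothetical layout)
    """
    top, left, h, w = roi
    pooled = []
    for i in range(out_h):
        # Row range of sub-window i: floor for the start, ceil for the end,
        # so every sub-window contains at least one cell.
        r0 = top + math.floor(i * h / out_h)
        r1 = top + math.ceil((i + 1) * h / out_h)
        row = []
        for j in range(out_w):
            c0 = left + math.floor(j * w / out_w)
            c1 = left + math.ceil((j + 1) * w / out_w)
            # Keep only the largest value inside the sub-window.
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# 49x49 feature map whose values encode their position, and a 14x14 RoI.
fmap = [[r * 49 + c for c in range(49)] for r in range(49)]
pooled = roi_max_pool(fmap, roi=(10, 20, 14, 14))
print(len(pooled), len(pooled[0]))  # 7 7
```

Whatever the RoI size, the output is always 7 x 7 here, which is what lets the following FC layers take a fixed-length input.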

 

https://frogbam07.tistory.com/28

 

  • As a simple example, say the feature map comes out as 49x49 and the projected RoI is 14x14
  • Then h, w are 14, 14, and H, W are set to 7x7
  • h/H, w/W => 14/7, 14/7, so each sub-window is 2x2
  • Max pooling over each sub-window produces a 7x7 feature map in total
  • This example divides cleanly, but real RoIs come in all sorts of sizes, so it is better to think of most of them as non-square rectangles
  • That is why the image example above is also drawn as a rectangle
  • Also, when computing the sub-windows for RoI Pooling, most cases do not divide cleanly (e.g. 32/7), and simple quantization (rounding or flooring) is reportedly used there
  • Handled this way, the edges of the RoI get trimmed and information is lost; RoIAlign is the later method that fixes this
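To make the information loss concrete, here is a small sketch that assumes the simplest floor-only scheme: every bin gets `floor(length / bins)` cells and whatever is left over at the edge is never pooled. Implementations differ in how they round, so the helper name `floor_quantized_bins` and this exact scheme are illustrative assumptions rather than the behavior of any particular library.

```python
def floor_quantized_bins(length, bins):
    """Split `length` cells into `bins` equal integer-sized bins using floor.

    Returns (bin_size, covered, dropped); cells past `covered` are ignored.
    """
    bin_size = length // bins      # floor(length / bins)
    covered = bin_size * bins      # cells that actually fall inside some bin
    return bin_size, covered, length - covered

print(floor_quantized_bins(14, 7))  # (2, 14, 0): divides evenly, nothing lost
print(floor_quantized_bins(32, 7))  # (4, 28, 4): the last 4 cells are never pooled
```

RoIAlign avoids this by not rounding at all and sampling the feature map at fractional positions instead.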

 

The post below covers RoI Pooling well, including how pixels are lost along the way, so I recommend giving it a read.

 

 

https://erdem.pl/2020/02/understanding-region-of-interest-ro-i-pooling

 

Understanding Region of Interest (RoI Pooling) - Blog by Kemal Erdem


 

 

 

 

Fine-tuning for detection (training scheme and backpropagation)

RoI Pooling is one of the main ideas that lets Fast R-CNN train end-to-end.

The most important capability of Fast R-CNN is training all network weights with back-propagation.

    • Root cause of SPPnet's failure: back-propagation through the SPP layer is highly inefficient when each training sample (RoI) comes from a different image, and that is exactly how R-CNN and SPPnet were trained
    • Fast R-CNN's fix (hierarchical sampling): Fast R-CNN proposes an efficient training method that takes advantage of feature sharing.
      Each mini-batch is sampled hierarchically: first N images are sampled, then R/N RoIs from each image
      • The problem with SPPNet/R-CNN:
        To get 128 RoIs, they took one RoI from each of 128 different images.
        Since every RoI came from a different image, the feature map computed by the ConvNet could not be shared at all, and every RoI required a fresh computation
      • Fast R-CNN's solution:
        If 64 RoIs are taken from each of 2 images, the 64 RoIs from one image share a single feature map.
      • Result: an already computed feature map can be reused 64 times, so as the paper notes, mini-batch computation is roughly 64x faster than with the R-CNN/SPPNet strategy
      • Efficiency: RoIs from the same image share computation and memory in the forward and backward passes
  • Back-propagation through the RoI Pooling layer:
    • Principle: back-propagation routes derivatives along the argmax switches
    • When computing CNN gradients, max pooling selects a value in the forward pass, so in the backward pass it simply passes the gradient on to the selected value
    • So the partial derivative of the loss $L$ with respect to a pooled output, $\partial L / \partial y_{rj}$, is accumulated only into the input feature-map pixel $x_i$ that was selected as the max during RoI Pooling; all other pixels receive 0
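The hierarchical sampling described above (N = 2 images, R = 128 RoIs) can be sketched as follows. The dataset layout, the `sample_minibatch` name, and the proposal ids are hypothetical; the point is only to count how many backbone forward passes each strategy needs.

```python
import random

def sample_minibatch(rois_per_image, n_images=2, batch_rois=128, seed=0):
    """Hierarchical sampling: pick N images, then R/N RoIs from each."""
    rng = random.Random(seed)
    per_image = batch_rois // n_images
    images = rng.sample(sorted(rois_per_image), n_images)
    return {img: rng.sample(rois_per_image[img], per_image) for img in images}

# Hypothetical dataset: 10 images with 2000 proposal ids each.
dataset = {f"img_{i}": list(range(2000)) for i in range(10)}
batch = sample_minibatch(dataset)

conv_passes_fast = len(batch)  # one backbone pass per sampled image
conv_passes_rcnn = 128         # one RoI per image means one pass per RoI
total_rois = sum(len(rois) for rois in batch.values())
print(conv_passes_fast, total_rois, conv_passes_rcnn // conv_passes_fast)  # 2 128 64
```

Both strategies train on 128 RoIs per mini-batch, but the hierarchical one runs the convolutional layers 2 times instead of 128, which is where the roughly 64x figure comes from.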

 

 

 

To recap, the backward pass:

How does the gradient of the H x W output ($\partial L / \partial y$) reach the input feature map ($x$)?

  • Core principle, the argmax switch: RoI Pooling is based on max pooling, so the position that produced the max in the forward pass is remembered like a 'switch' and reused during back-propagation

https://ratsgo.github.io/deep%20learning/2017/04/05/CNNbackprop/

 

  • Gradient routing: the gradient computed from the loss ($\partial L / \partial y_{rj}$) is accumulated only at the input feature-map pixel $x_i$ that was selected by max pooling
  • Completing end-to-end: thanks to these argmax switches, gradients pass through the RoI Pooling layer and reach the weights of the shared convolutional layers

This is the decisive mechanism that resolves the training disconnect SPPnet could not solve and makes end-to-end training possible.
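Written out, the routing rule is $\frac{\partial L}{\partial x_i} = \sum_r \sum_j [i = i^*(r, j)] \frac{\partial L}{\partial y_{rj}}$, where $i^*(r, j)$ is the input index that won the max for output $j$ of RoI $r$. Below is a minimal sketch of that rule, assuming the forward pass has already stored the winning positions; the `roi_pool_backward` name and the list-based data layout are made up for illustration.

```python
def roi_pool_backward(fmap_shape, argmax, grad_out):
    """Route each output gradient to the input cell that won the max.

    argmax[k]:   list of (row, col) winners for RoI k, one per pooled output
    grad_out[k]: matching list of dL/dy values for RoI k
    """
    rows, cols = fmap_shape
    grad_in = [[0.0] * cols for _ in range(rows)]  # non-winners keep gradient 0
    for winners, grads in zip(argmax, grad_out):
        for (r, c), g in zip(winners, grads):
            # Accumulate: overlapping RoIs may select the same input cell.
            grad_in[r][c] += g
    return grad_in

# Two RoIs with two pooled outputs each; both select cell (1, 1) once.
argmax = [[(0, 0), (1, 1)], [(1, 1), (2, 3)]]
grad_out = [[0.5, 1.0], [2.0, -1.0]]
grad_in = roi_pool_backward((3, 4), argmax, grad_out)
print(grad_in[1][1])  # 3.0
```

The accumulation matters because RoIs from the same image overlap on the shared feature map, so one pixel can receive gradient from several RoIs.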

 

 

The post below explains CNN backpropagation well.

Something to read carefully several times..

 

https://ratsgo.github.io/deep%20learning/2017/04/05/CNNbackprop/

 

CNN backpropagation · ratsgo's blog

 

 

 

 

Multi-task Loss

 

Next is the second main idea behind end-to-end training.

Put simply, the errors used for classification and regression are added together into a single loss.

 

 

 

Structure of the multi-task loss

A combination of the classification loss $L_{cls}$ and the regression loss $L_{loc}$

The loss works as follows.

  1. $L_{cls}$ (classification loss): classifies which object (or background) the RoI is (log loss)
  2. $L_{loc}$ (regression loss): adjusts the RoI's coordinates toward the ground truth $v$ (Smooth L1 loss)
  3. $[u \ge 1]$ (background excluded): when the RoI belongs to the background class ($u=0$), there is no box to adjust, so the $L_{loc}$ term is skipped

With this multi-task loss, the gradients from classification and regression flow together through the single RoI Pooling layer into the CNN's weights.

As a result, the CNN is jointly optimized to extract features that are "good for classification and good for localization".
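Combining the three pieces gives $L(p, u, t^u, v) = L_{cls}(p, u) + \lambda [u \ge 1] L_{loc}(t^u, v)$ with $L_{cls}(p, u) = -\log p_u$. The sketch below computes this for a single RoI in plain Python. The function names, the example probabilities and offsets, and the default `lam=1.0` are assumptions of this sketch, and a real implementation would work on batched tensors.

```python
import math

def smooth_l1(x):
    """Quadratic near zero, linear for |x| >= 1."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def multi_task_loss(probs, u, t_u, v, lam=1.0):
    """Loss for one RoI.

    probs: softmax class probabilities (index 0 = background)
    u:     ground-truth class
    t_u:   predicted box offsets for class u
    v:     ground-truth box offsets
    """
    l_cls = -math.log(probs[u])            # log loss on the true class
    if u == 0:
        return l_cls                       # [u >= 1] is 0: no box loss for background
    l_loc = sum(smooth_l1(t - g) for t, g in zip(t_u, v))
    return l_cls + lam * l_loc

# Foreground RoI: both terms contribute.
fg = multi_task_loss([0.1, 0.7, 0.2], u=1, t_u=[0.2, 0.1, 0.0, 2.0], v=[0.0, 0.1, 0.0, 0.0])
# Background RoI: the (deliberately bad) box prediction is ignored.
bg = multi_task_loss([0.8, 0.1, 0.1], u=0, t_u=[9.0, 9.0, 9.0, 9.0], v=[0.0, 0.0, 0.0, 0.0])
print(round(fg, 4), round(bg, 4))
```

Because both terms end up in one scalar, a single backward pass sends classification and localization gradients into the shared layers together.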

 

 

Smooth L1 Loss

 

 

Why Smooth L1 loss rather than L1 or L2?

 

 

 

 

A quick refresher first.

L1 loss uses the absolute value of the error between the target and the prediction. Being linear, it is less affected by outliers, but its V shape has a non-differentiable point (at zero).

L2 loss uses the squared error between the target and the prediction. It is differentiable everywhere but sensitive to outliers.

  • Smooth L1 loss is a robust L1 loss that is less sensitive to outliers than the L2 loss used in R-CNN and SPPnet
  • For |x| < 1 it behaves like L2 loss ($0.5x^2$), otherwise like L1 loss ($|x| - 0.5$)
  • When regression targets are unbounded, training with L2 loss requires careful tuning of the learning rate to prevent exploding gradients; Smooth L1 removes this sensitivity

 

https://www.researchgate.net/figure/Plots-of-the-L1-L2-and-smooth-L1-loss-functions_fig4_321180616

 

 

Conclusion: Smooth L1 loss takes the advantage of L2 (a smooth derivative) where the error is small and the advantage of L1 (a constant gradient) where the error is large. Even if the regressor's initial errors are large, this keeps the CNN's weights from swinging wildly or blowing up, which helps keep the unified end-to-end training pipeline stable.
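A quick numeric check of that claim, as a sketch: the piecewise form below ($0.5x^2$ for $|x| < 1$, $|x| - 0.5$ otherwise) is the usual definition of Smooth L1, and the L2 comparison uses $0.5x^2$ so that its gradient is simply $x$. The helper names are made up for this example.

```python
def smooth_l1(x):
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def smooth_l1_grad(x):
    """Derivative of smooth_l1: x near zero, clipped to +/-1 elsewhere."""
    if abs(x) < 1:
        return x
    return 1.0 if x > 0 else -1.0

def l2_grad(x):
    """Derivative of 0.5 * x^2, which grows without bound with the error."""
    return x

for err in (0.1, 1.0, 10.0, 100.0):
    print(err, smooth_l1(err), smooth_l1_grad(err), l2_grad(err))
```

For an error of 100 the L2 gradient is 100 while the Smooth L1 gradient stays at 1, so one badly regressed box cannot dominate the update.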

 

 

Once Fast R-CNN adopted end-to-end training, the gradient of the bounding-box regression error ($L_{loc}$) flows through RoI Pooling directly into all of the CNN's weights, so keeping training stable became very important.

 

 

https://hongl.tistory.com/345

 

Smooth L1 Loss vs Huber Loss


 

 

The loss function is explained well here.

 

https://woochan-autobiography.tistory.com/920#7.%20Fast%20RCNN%EC%9D%98%20Loss%20Function

 

CV - Fast RCNN


 

 

 

Besides the two main ideas above, there is also a technique that shortens the forward pass by applying truncated SVD (singular value decomposition) to the compute-heavy FC layers. Sadly, I did not understand SVD well enough to write it up properly.

As a brief addition: the weight matrix W of a single FC layer is factorized by SVD into two weight matrices.

In other words, one large, heavy FC layer is split into two smaller, lighter layers.
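As a rough sketch of why the split helps: with $W \approx U \Sigma_t V^T$ truncated to the top $t$ singular values, a $u \times v$ layer becomes one layer with weights $\Sigma_t V^T$ followed by one with weights $U$, and the parameter count drops from $uv$ to $t(u + v)$. The sizes below are illustrative assumptions (roughly the shape of a first FC layer in a VGG16-style network), not numbers taken from this post.

```python
def fc_params(u, v):
    """Weights in a single u x v fully connected layer."""
    return u * v

def svd_fc_params(u, v, t):
    """Weights after splitting into a (t x v) layer followed by a (u x t) layer."""
    return t * (u + v)

u, v, t = 4096, 25088, 1024   # illustrative sizes only
full = fc_params(u, v)
compressed = svd_fc_params(u, v, t)
print(full, compressed, round(full / compressed, 2))
```

The saving only appears when $t$ is much smaller than $\min(u, v)$; the price is that $W$ is only approximated, so a small accuracy drop is the expected trade-off.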

 

Time to study linear algebra hard as well...

 

https://holamundo.tistory.com/entry/prerequisite-Fast-R-CNN-Truncated-SVD

 

(prerequisite-Fast R-CNN) Truncated SVD


 

 

 

With everything so far understood, the architecture comparison below should make more sense.

https://bkshin.tistory.com/entry/%EC%BB%B4%ED%93%A8%ED%84%B0-%EB%B9%84%EC%A0%84-10-R-CNN-vs-SPP-net-vs-Fast-R-CNN-vs-Faster-R-CNN-%EA%B0%9C%EC%9A%94

 

 

 

 

 

 

 

Problems still to be solved

 

Fast R-CNN fixed many earlier problems, but it still has a speed bottleneck and cannot yet be called a fully end-to-end model.

The bottleneck is that the region proposal step, which extracts the RoIs (Regions of Interest), relies on Selective Search, an external CPU-based algorithm.

 

Selective Search takes about 2 seconds per image to generate RoIs.

Even though the computation inside the network (CNN feature extraction, classification, and regression) takes only 0.32 seconds, the system as a whole has to wait more than 2 seconds, so true real-time detection was impossible.
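Plugging in the two figures quoted above shows how lopsided the split is. This is just back-of-the-envelope arithmetic on those numbers, assuming the two stages run one after the other.

```python
proposal_s = 2.0   # Selective Search per image, as quoted above
network_s = 0.32   # CNN features + classification + regression, as quoted above

total_s = proposal_s + network_s
proposal_share = proposal_s / total_s
print(f"total {total_s:.2f}s, proposals {proposal_share:.0%}, about {1 / total_s:.2f} images/s")
```

Roughly 86% of the per-image time goes to proposals, so speeding up the network any further barely changes the end-to-end latency.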

 

So the remaining task was to run every operation in the detection pipeline inside the neural network, on the GPU. Faster R-CNN, the next model, removes this last bottleneck (Selective Search) and folds the proposal step into the network as an RPN (Region Proposal Network).

 

 

 

 

 

 

 


 

 

 

Up through R-CNN and SPPNet I feel I have roughly understood the main points in theory, yet following the code and the actual implementation is still really hard... Even on the theory side there is so much to dig into, from the math to all the related concepts.

 

https://herbwood.tistory.com/9

 

Fast R-CNN model implemented in Pytorch

 

Still, with reference material and help from AI, I will track down the parts I do not understand and fill them in little by little.

I meant to write this up together with Faster R-CNN, but with Faster R-CNN the material suddenly feels harder and much broader...

I would like to write it all up well, so I will keep at it steadily, starting with the parts I can understand.