ResNet

Paper Implementation: "Deep Residual Learning for Image Recognition" (2016)
Code Practice: You can see in detail how the model is implemented through the Colab and Git links below.

Description

ResNet features:
1. Residual Block:
- The defining feature of ResNet is the Residual Block. It mitigates the vanishing gradient problem that arises as convolutional layers are stacked deeper. A Residual Block adds the input x to the output of the convolutional layers. To show its effect, the paper compares two kinds of networks: a plain network and a residual network.
a.) Plain network: based on VGGNet
- Convolution layers: 3x3; downsampling is done by convolutions with stride = 2
- Global Average Pooling applied
- FC layer(1000), softmax
b.) Residual network: based on the plain network
- Shortcut connections are added
- When the dimensions are the same: the input is passed through directly (identity)
- When the dimensions increase: 2 options
  1. Match the dimensions with zero padding
  2. Match the dimensions with a 1x1 Conv layer (VGG or GoogLeNet)
- Shortcuts that cross feature maps of two different sizes use stride = 2
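
The shortcut rules above can be sketched in PyTorch as follows. This is a minimal illustration, not the code behind the Colab link: the class name `BasicBlock`, the argument names, and the choice to implement only the 1x1 Conv option (not zero padding) are assumptions made for the example.

```python
import torch
import torch.nn as nn


class BasicBlock(nn.Module):
    """Two 3x3 convolutions plus a shortcut connection (illustrative sketch)."""

    def __init__(self, in_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)

        if stride != 1 or in_channels != out_channels:
            # Dimensions change: match them with a 1x1 convolution (option 2).
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            # Dimensions are the same: pass x through unchanged.
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # add the input x to the conv output
        return self.relu(out)


same = BasicBlock(64, 64)            # identity shortcut
down = BasicBlock(64, 128, stride=2)  # projection shortcut with stride 2
x = torch.randn(2, 64, 56, 56)
print(same(x).shape)  # torch.Size([2, 64, 56, 56])
print(down(x).shape)  # torch.Size([2, 128, 28, 28])
```

With stride = 2 the spatial size halves while the channel count doubles, which is why the shortcut also needs stride 2 to stay addable to the main path.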

Figure 1. Residual Block Type

2. ResNet Projection(Bottleneck Building block):

The building block of a residual network comes in two forms: Identity and Projection. The Projection form uses a 1x1 convolution to scale the dimensions.

Figure 2. Residual Building Block Type
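
A bottleneck block could look roughly like the sketch below. The names `Bottleneck`, `mid_channels`, and `expansion` are assumptions for illustration. Placing the stride on the 3x3 convolution follows the torchvision variant rather than the original paper, which puts it on the first 1x1 convolution.

```python
import torch
import torch.nn as nn


class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, plus a shortcut (illustrative sketch)."""

    expansion = 4  # output channels = mid_channels * 4

    def __init__(self, in_channels: int, mid_channels: int, stride: int = 1):
        super().__init__()
        out_channels = mid_channels * self.expansion
        self.body = nn.Sequential(
            # 1x1 convolution reduces the channel dimension
            nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            # 3x3 convolution works on the reduced dimension
            nn.Conv2d(mid_channels, mid_channels, kernel_size=3,
                      stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            # 1x1 convolution expands the channel dimension again
            nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        if stride != 1 or in_channels != out_channels:
            # Projection shortcut: 1x1 convolution matches the dimensions
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            # Identity shortcut: dimensions already match
            self.shortcut = nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.body(x) + self.shortcut(x))


identity_block = Bottleneck(256, 64)               # 256 -> 64 -> 256
projection_block = Bottleneck(256, 128, stride=2)  # 256 -> 128 -> 512
x = torch.randn(1, 256, 56, 56)
print(identity_block(x).shape)    # torch.Size([1, 256, 56, 56])
print(projection_block(x).shape)  # torch.Size([1, 512, 28, 28])
```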

3. ResNet Architecture:

The structure of ResNet and its model variants are as follows.

Figure 3. ResNet Architecture
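
Since the figure is not reproduced here, the short sketch below records the number of blocks per stage for each variant, taken from Table 1 of the paper, and checks that they add up to the depth in each model name. The dictionary and function names are hypothetical helpers for this example.

```python
# (block type, number of blocks in each of the four stages), per Table 1 of the paper
CONFIGS = {
    "ResNet18": ("basic", [2, 2, 2, 2]),
    "ResNet34": ("basic", [3, 4, 6, 3]),
    "ResNet50": ("bottleneck", [3, 4, 6, 3]),
    "ResNet101": ("bottleneck", [3, 4, 23, 3]),
    "ResNet152": ("bottleneck", [3, 8, 36, 3]),
}

# A basic block has two 3x3 convs; a bottleneck block has 1x1, 3x3, 1x1 convs.
CONVS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def weighted_layers(name: str) -> int:
    """Count the first conv, every conv inside the blocks, and the final FC layer."""
    kind, blocks = CONFIGS[name]
    return 1 + CONVS_PER_BLOCK[kind] * sum(blocks) + 1


for name in CONFIGS:
    print(name, weighted_layers(name))
```

Shortcut projections are not counted, which matches how the paper names its models.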

4. Image Preprocessing

Image preprocessing applies mean subtraction to the RGB channels.
Input image shape = 224x224x3
Mean subtraction of RGB per channel
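
Per-channel mean subtraction can be sketched as below. The function name `subtract_channel_mean` is hypothetical. In practice the mean is usually computed once over the whole training set, so the random batch here is only a stand-in.

```python
import torch


def subtract_channel_mean(images: torch.Tensor) -> torch.Tensor:
    """Subtract the mean of each RGB channel.

    images: float tensor of shape (N, 3, H, W).
    """
    # Average over batch, height, and width; keep one value per channel.
    mean = images.mean(dim=(0, 2, 3), keepdim=True)  # shape (1, 3, 1, 1)
    return images - mean


batch = torch.rand(8, 3, 224, 224)  # stand-in for real 224x224 RGB images
centered = subtract_channel_mean(batch)
print(centered.shape)  # torch.Size([8, 3, 224, 224])
```

After this step each channel has a mean of approximately zero across the batch.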

5. Layer

Batch Normalization applied (between Conv and the activation function)
Dropout = None
ResNet18, ResNet34 = Basic Building Block
ResNet50, ResNet101, ResNet152 = Bottleneck Building Block
Max Pooling = 1 layer (kernel size = 3x3, stride = 2)
Global Average Pooling = 1 layer (stride = 1)
When the dimensions increase in the residual network:
a.) Zero padding is used to handle the increased dimensions.
b.) A 1x1 convolution scales the dimensions, which are then scaled back to the original dimensions.
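
To see how the pooling and stride-2 layers shrink a 224x224 input, the arithmetic can be traced with the standard output-size formula. The 7x7 stride-2 first convolution comes from Table 1 of the paper. The padding values are assumptions that follow the common torchvision choice.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output size of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


sizes = {}
size = 224
size = conv_out(size, kernel=7, stride=2, padding=3)  # first convolution
sizes["conv1"] = size
size = conv_out(size, kernel=3, stride=2, padding=1)  # the single max pooling layer
sizes["maxpool"] = size
sizes["stage2"] = size                                # no downsampling in the first stage
for stage in ("stage3", "stage4", "stage5"):
    # the first 3x3 convolution of each later stage uses stride 2
    size = conv_out(size, kernel=3, stride=2, padding=1)
    sizes[stage] = size
sizes["global_avg_pool"] = 1                          # averages each 7x7 map to 1x1

print(sizes)
```

The trace gives 112, 56, 56, 28, 14, 7, and finally 1 after global average pooling.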

6. Hyper-parameter

Each line reads "paper setting -> setting used in this implementation".
Optimizer = SGD -> changed to Adam
Momentum = 0.9 -> not applied
Weight Decay = 0.0001 -> not applied
Batch size = 256 -> 128
Learning rate = learning rate scheduler (starts at 0.1 and is divided by 10) -> fixed at 0.001
Epoch = not mentioned -> 10
Trained for up to 60 × 10^4 iterations (paper)
Dropout = None
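
The two setups can be written side by side in PyTorch as in the sketch below. The `nn.Linear` model is only a stand-in for the real network. Using `ReduceLROnPlateau` is one possible way to express "divide by 10 when the error plateaus" and is an assumption, not necessarily what the linked code does.

```python
import torch
import torch.nn as nn

model = nn.Linear(512, 10)  # stand-in for the ResNet model

# Paper setting: SGD with momentum and weight decay, learning rate 0.1
paper_optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4
)
# Divide the learning rate by 10 when the monitored loss stops improving
paper_scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    paper_optimizer, mode="min", factor=0.1
)

# Setting used in this implementation: Adam with a fixed learning rate,
# no momentum argument and no weight decay
impl_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

BATCH_SIZE = 128  # paper: 256
EPOCHS = 10       # paper: not mentioned
```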

7. Fully Connected Layer(FC Layer):

The network has a single FC layer, which takes 512 channels as input.
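
A classification head of this shape might be sketched as follows. The 10 output classes assume CIFAR-10 (the paper uses 1000 for ImageNet), and the 7x7 feature map size assumes a 224x224 input.

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),  # global average pooling: (N, 512, H, W) -> (N, 512, 1, 1)
    nn.Flatten(),             # (N, 512, 1, 1) -> (N, 512)
    nn.Linear(512, 10),       # single FC layer: 512 channels -> 10 class scores
)

features = torch.randn(4, 512, 7, 7)  # stand-in for the last residual stage output
logits = head(features)
print(logits.shape)  # torch.Size([4, 10])
```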

8. Test Results:

The test results can be checked through Accuracy, Loss, Classification Report, and Confusion Matrix.
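
To show how accuracy relates to the confusion matrix, here is a small dependency-free sketch. The labels are made-up toy values, not results from this implementation, and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions sit on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy labels for three classes (illustrative only)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)            # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(accuracy(cm))  # 4 correct out of 6
```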

9. Dataset:

Paper: ImageNet Large Scale Visual Recognition Challenge (ILSVRC)-2012, CIFAR-10
Implementation: CIFAR-10

10. System Environment:

Google Colab Pro Plus GPU: K80 (Kepler), T4 (Turing), and P100 (Pascal)
Jupyter Notebook, Visual Studio Code

Reference
[1] "[논문 구현] ResNet 파이토치로 구현하기", For a better world, last modified September 18, 2022, accessed January 20, 2023, https://roytravel.tistory.com/339.

[2] "[논문리뷰] ResNet(2015)설명", inhovation97, last modified January 19, 2022, accessed January 20, 2023, https://inhovation97.tistory.com/46?category=920765.

[3] "resnet", weiaicunzai, last modified July 30, 2020, accessed January 20, 2023, https://github.com/weiaicunzai/pytorch-cifar100/blob/master/models/resnet.py.

[4] "resnet", pmeier, last modified January 11, 2023, accessed January 20, 2023, https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py.

[5] "[논문리뷰] ResNet - Deep Residual Learning for Image Recognition", 척척학부생이될거야, last modified August 20, 2020, accessed January 23, 2023, https://blackchopin.github.io/imagerecognition/ResNet/.

[6] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770-778, doi: 10.1109/CVPR.2016.90. https://arxiv.org/pdf/1512.03385.pdf

[7] Krizhevsky, A., & Hinton, G. (2009). Learning multiple layers of features from tiny images. http://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf

[8] "The CIFAR-10 dataset", Alex Krizhevsky's home page, written 2009, accessed December 28, 2022, https://www.cs.toronto.edu/~kriz/cifar.html.
