U-stage day 7
1. Lecture Content
[DL Basic]Optimization
-
Bagging vs Boosting
Bagging: multiple models are trained on bootstrapped samples of the training data and their predictions are aggregated (an ensemble method).
Boosting: models are trained sequentially, each one focusing on the training samples that the previous models found hard to classify.
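As a minimal illustration of the contrast, a sketch using scikit-learn (my own example, not code from the lecture):

from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Bagging: independent trees fit on bootstrap samples; predictions are averaged/voted
bagging = BaggingClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=10)
# Boosting: trees fit sequentially, each reweighting the samples the previous ones misclassified
boosting = AdaBoostClassifier(n_estimators=10)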
-
Gradient Descent
First-order iterative optimization algorithm for finding a local minimum of a differentiable function.
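In update-rule form (standard notation, not copied from the slides), a single step is W_{t+1} = W_t - η ∇_W L(W_t), where η is the learning rate (step size).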
-
Gradient Descent Methods
Stochastic gradient descent, Momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, Adam
Looking more closely at Adam among these:
Adam (Adaptive Moment Estimation) leverages both past gradients and past squared gradients.
Adam effectively combines momentum with an adaptive learning-rate approach.
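As a rough sketch of that combination (my own minimal Python version with the commonly used default hyperparameters, not the lecture's implementation):

import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponential moving average of past gradients (the momentum part)
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: exponential moving average of past squared gradients (the adaptive part)
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moments (t is the step count, starting at 1)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Update: momentum direction scaled by a per-parameter adaptive step size
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v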
-
Regularization
1) Early Stopping
2) Parameter Norm Penalty
3) Data Augmentation
4) Noise robustness: add random noise to the inputs or weights.
5) Label smoothing
Mix-up constructs augmented training examples by mixing both the inputs and the labels of two randomly selected training examples (a sketch follows this list).
CutMix constructs augmented training examples by cutting and pasting patches between the inputs and mixing the labels (soft labels) of two randomly selected training examples.
6) Dropout: in each forward pass, randomly set some neurons to zero.
7) Batch normalization
Batch normalization computes the empirical mean and variance independently for each dimension and normalizes with them. There are several other variants of normalization (e.g., layer, instance, and group normalization).
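A minimal sketch of the mix-up idea mentioned above (the function name and alpha value are my own assumptions, not from the assignment):

import numpy as np
import torch

def mixup_batch(x, y, alpha=0.2):
    # Sample the mixing coefficient from Beta(alpha, alpha)
    lam = float(np.random.beta(alpha, alpha))
    # Pair each example with a randomly chosen partner from the same batch
    index = torch.randperm(x.size(0))
    # Mix the inputs; the two labels are mixed with the same coefficient in the loss
    mixed_x = lam * x + (1 - lam) * x[index]
    return mixed_x, y, y[index], lam

# Training usage (criterion is e.g. cross-entropy):
#   loss = lam * criterion(pred, y_a) + (1 - lam) * criterion(pred, y_b)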
- Practice
Contents of mandatory assignment 2
2. Assignment Progress and Results
- Define MLP model
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, name='mlp', xdim=1, hdims=[16, 16], ydim=1):
        super(Model, self).__init__()
        self.name = name
        self.xdim = xdim
        self.hdims = hdims
        self.ydim = ydim
        self.layers = []
        prev_hdim = self.xdim
        for hdim in self.hdims:
            self.layers.append(nn.Linear(prev_hdim, hdim, bias=True))
            self.layers.append(nn.Tanh())  # activation
            prev_hdim = hdim
        # Final layer (without activation)
        self.layers.append(nn.Linear(prev_hdim, self.ydim, bias=True))
        # Concatenate all layers
        self.net = nn.Sequential()
        for l_idx, layer in enumerate(self.layers):
            layer_name = "%s_%02d" % (type(layer).__name__.lower(), l_idx)
            self.net.add_module(layer_name, layer)
        self.init_param()  # initialize parameters

    def init_param(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):  # init conv
                nn.init.kaiming_normal_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):  # init dense
                nn.init.kaiming_normal_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        return self.net(x)
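A quick sanity check of the model above (batch size and dimensions here are arbitrary choices of mine):

import torch

model = Model(name='mlp', xdim=1, hdims=[16, 16], ydim=1)
x = torch.randn(4, 1)   # a batch of 4 one-dimensional inputs
out = model(x)
print(model.net)        # named layers: linear_00, tanh_01, linear_02, ...
print(out.shape)        # torch.Size([4, 1])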
3. Peer Session
Sharing what we learned
1. Assignment code review
- Skipped, given the nature of mandatory assignment 2
2. Discussion of lecture content and advanced topics
[DL Basic]Optimization
3. Paper review
4. Learning Reflection
I attended egoing (이고잉)'s special lecture on Git.
Git is certainly hard to handle, but listening to the lecture made it clear why it is an essential skill for developers.
Through the paper review I got a brief picture of the process that leads up to a model being published.
My paper presentation is scheduled for day 9; since it is my first, I plan to spend a lot of time preparing it.
Author and Source
Original post (U-stage day 7): https://velog.io/@tkrhdwls/U-stage-day-7 (copyright belongs to the original author).