Using the center-loss loss function in Keras / Custom loss functions in Keras

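Contents

1. Standing on the shoulders of giants
2. Loss functions in Keras
3. Implementing the center-loss loss function in Keras
3.1 Importing libraries and defining constants
3.2 Implementing the multi-class softmax loss function
3.3 Implementing the center-loss loss function
3.4 Adding softmax-loss and center-loss
4. Using the new loss function in a model
5. Complete code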
1. Standing on the shoulders of giants

This article builds on the following two write-ups:

Implementing center loss in TensorFlow

Center Loss: an improved loss function
2. Loss functions in Keras
As you probably know, when compiling a model (model.compile) you must specify the loss argument, i.e. the loss function. Keras already implements many loss functions, for example:

# Multi-class (categorical) loss; labels are one-hot encoded
def categorical_crossentropy(y_true, y_pred):
    return K.categorical_crossentropy(y_true, y_pred)
# Sparse multi-class loss; labels are integer class indices
def sparse_categorical_crossentropy(y_true, y_pred):
    return K.sparse_categorical_crossentropy(y_true, y_pred)
# Binary classification loss
def binary_crossentropy(y_true, y_pred):
    return K.mean(K.binary_crossentropy(y_true, y_pred), axis=-1)

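Looking at the loss functions above, the parameters are always y_true and y_pred. y_true is the one-hot encoded label, with shape (batch_size, NUM_CLASSES); y_pred is the output of the model's last layer (usually a softmax layer), also with shape (batch_size, NUM_CLASSES); and the loss function returns a tensor of shape (batch_size,). Here NUM_CLASSES is the number of classes in the multi-class problem. Writing a custom loss function is therefore simple: define a single function of the following form.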

def my_loss(y_true, y_pred):
    m_loss=...
    return m_loss
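
The shape contract described above can be checked without building a model. The following is a minimal sketch that uses NumPy arrays as stand-ins for tensors; `per_sample_squared_error` is a hypothetical name chosen for illustration, not a Keras function. In an actual Keras loss the same arithmetic would be written with keras.backend ops such as K.sum and K.square.

```python
import numpy as np


def per_sample_squared_error(y_true, y_pred):
    """Hypothetical custom loss: squared error summed over classes.

    Both inputs have shape (batch_size, NUM_CLASSES); the result has
    shape (batch_size,), i.e. one loss value per sample.
    """
    return np.sum((y_true - y_pred) ** 2, axis=-1)


# Two samples, three classes (made-up values)
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
y_pred = np.array([[0.5, 0.25, 0.25], [0.0, 1.0, 0.0]])

losses = per_sample_squared_error(y_true, y_pred)
print(losses.shape)  # one value per sample
print(losses)
```

The class axis is reduced and the batch axis is kept, which is the shape the built-in losses above return as well.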

3. Implementing the center-loss loss function in Keras
There are three steps. The first is to remove the final softmax layer from the original model and take the output of the last fc layer directly, because center-loss needs the fc-layer output as its input. The second is to implement the multi-class softmax loss function. The third is to implement the center-loss loss function. One additional step then adds the softmax loss from step 2 to the center-loss from step 3. My Keras uses TensorFlow as its backend, so the loss functions can be written in TensorFlow code. For portability, however, implementing loss functions with keras.backend methods is recommended. All the code in this part lives in cl.py.
3.1 Importing libraries and defining constants
Removing the softmax layer from the model is simple, so I will not go into it. Here are the library imports and the constant definitions:

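# coding=utf-8
# cl.py
# Uses Python 3 and the TensorFlow 1.x graph-mode API (tf.get_variable, tf.Session)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow import keras
from tensorflow.python.keras import backend as K
import tensorflow as tf
import numpy as np
import random

# Number of classes, i.e. the number of output units of the last fc layer
NUM_CLASSES = 240
# Learning rate used to update the centers
ALPHA = 0.6
# Weight of the center-loss term
LAMBDA = 0.0005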

ALPHA is the learning rate used to update the centers and is usually set to 0.6 or 0.5. LAMBDA is the coefficient of the center-loss term in the paper; it has to be tuned carefully to get good results.
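
For reference, the roles of the two constants can be written out as equations. This is a sketch of the usual center-loss formulation, matching what the code in section 3.3 computes. The symbols are assumptions introduced here: $x_i$ is the feature of sample $i$, $y_i$ its integer label, $c_j$ the center of class $j$, $m$ the mini-batch size, $\mathcal{L}_S$ the softmax loss, $\lambda$ is LAMBDA and $\alpha$ is ALPHA.

```latex
\begin{aligned}
\mathcal{L} &= \mathcal{L}_S + \lambda\,\mathcal{L}_C, &
\mathcal{L}_C &= \frac{1}{2}\sum_{i=1}^{m} \lVert x_i - c_{y_i} \rVert_2^2 \\
\Delta c_j &= \frac{\sum_{i=1}^{m} \delta(y_i = j)\,(c_j - x_i)}{1 + \sum_{i=1}^{m} \delta(y_i = j)}, &
c_j &\leftarrow c_j - \alpha\,\Delta c_j
\end{aligned}
```

Here $\delta(\cdot)$ is 1 when its condition holds and 0 otherwise, so a class that does not appear in the mini-batch keeps its center unchanged.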
3.2 Implementing the multi-class softmax loss function
The multi-class softmax loss function meant here is categorical_crossentropy (the multi-class loss whose labels must be one-hot encoded). It is very simple:

def softmax_loss(labels, features):
    """
    Compute the softmax-loss.
    :param labels: corresponds to y_true, one-hot encoded, shape (batch_size, NUM_CLASSES)
    :param features: corresponds to y_pred, the output of the model's last fc layer (not a softmax layer), shape (batch_size, NUM_CLASSES)
    :return: the multi-class softmax-loss, shape (batch_size,)
    """
    return K.categorical_crossentropy(labels, K.softmax(features, axis=-1))

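Why reimplement this instead of using keras.losses.categorical_crossentropy directly? Because the model's softmax layer has been removed, the output of the last layer, i.e. features, is no longer a probability distribution. It must first be passed through softmax, and only then used as the argument to K.categorical_crossentropy.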
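
To make the computation concrete, here is a NumPy sketch of softmax followed by cross-entropy on raw fc outputs. It is a reference calculation under simplifying assumptions, not the Keras implementation: `softmax` and `softmax_loss_np` are names introduced here, and the inputs are made-up values.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def softmax_loss_np(labels, features):
    """Per-sample cross-entropy between one-hot labels and softmax(features)."""
    probs = softmax(features)
    return -np.sum(labels * np.log(probs), axis=-1)


# Two samples, three classes: raw fc outputs, not probabilities
features = np.array([[2.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

losses = softmax_loss_np(labels, features)
print(losses.shape)
print(losses)
```

The second sample has equal outputs for all three classes, so its loss is log(3). K.categorical_crossentropy also accepts a from_logits=True argument, which may be an alternative to calling K.softmax explicitly; that variant is not what the article uses.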
3.3 Implementing the center-loss loss function
Here I referred to the center-loss code that another expert wrote for TensorFlow. Since my Keras uses TensorFlow as its backend, that code should in principle be usable directly. Here is the code that computes center-loss:

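def center_loss(labels, features, alpha=ALPHA, num_classes=NUM_CLASSES):
    """
    Compute the center loss and update the centers.
    :param labels: Tensor, the sample labels, NOT one-hot encoded, shape (batch_size,).
    :param features: Tensor, the sample features, here the output of the last fc layer, shape (batch_size, num_classes).
    :param alpha: a number between 0 and 1 that controls the learning rate of the class centers; see the paper for details.
    :param num_classes: integer, the total number of classes, i.e. the number of output units of the classifier.
    :return: Tensor, the center-loss as a scalar (tf.nn.l2_loss sums over the whole batch)
    """
    # Get the feature dimension, e.g. 256
    len_features = features.get_shape()[1]
    # Create a Variable of shape [num_classes, len_features] that stores the centers of all classes.
    # trainable=False because the centers are not updated through the optimizer's gradients.
    centers = tf.get_variable('centers', [num_classes, len_features], dtype=tf.float32,
                              initializer=tf.constant_initializer(0), trainable=False)
    # Flatten labels to 1-D; this has no effect if labels is already 1-D
    labels = tf.reshape(labels, [-1])

    # Look up the class center of every sample in the mini-batch by its label
    centers_batch = tf.gather(centers, labels)

    # Difference between each sample's class center and its feature
    diff = centers_batch - features

    # Count how often each class appears in the mini-batch; see equation (4) of the paper
    unique_label, unique_idx, unique_count = tf.unique_with_counts(labels)
    appear_times = tf.gather(unique_count, unique_idx)
    appear_times = tf.reshape(appear_times, [-1, 1])

    diff = diff / tf.cast((1 + appear_times), tf.float32)
    diff = alpha * diff

    # Op that updates the centers
    centers_update_op = tf.scatter_sub(centers, labels, diff)

    # Use tf.control_dependencies so the centers are updated whenever the loss is evaluated
    with tf.control_dependencies([centers_update_op]):
        # Compute the center-loss
        c_loss = tf.nn.l2_loss(features - centers_batch)

    return c_loss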

Let me describe the problem I ran into. The original code returns c_loss, centers and centers_update_op, and then uses sess.run to update the centers. A Keras loss function, however, can only return the loss term c_loss; it cannot return centers_update_op, so the centers were never updated. The solution I eventually came up with is tf.control_dependencies, which forces the centers to be updated before c_loss is computed, so the centers do get updated.
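
The update rule is easier to verify on a tiny example outside the graph. The following NumPy sketch mirrors the steps of center_loss (gather, per-class counts, scaled difference, scatter-subtract); `center_loss_np` is a hypothetical helper written for this check, and the inputs are made-up values.

```python
import numpy as np


def center_loss_np(labels, features, centers, alpha=0.6):
    """Return (loss, updated_centers) for one mini-batch.

    labels: int array, shape (batch,)
    features: float array, shape (batch, dim)
    centers: float array, shape (num_classes, dim); not modified in place
    """
    centers_batch = centers[labels]                        # like tf.gather
    loss = 0.5 * np.sum((features - centers_batch) ** 2)   # like tf.nn.l2_loss
    # How many times each sample's class occurs in this mini-batch
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    appear_times = counts[inverse].reshape(-1, 1)
    diff = alpha * (centers_batch - features) / (1.0 + appear_times)
    updated = centers.copy()
    # like tf.scatter_sub: contributions of repeated labels accumulate
    np.subtract.at(updated, labels, diff)
    return loss, updated


labels = np.array([0, 0, 1])
features = np.array([[1.0, 1.0], [3.0, 3.0], [2.0, 0.0]])
centers = np.zeros((2, 2))

loss, new_centers = center_loss_np(labels, features, centers, alpha=0.6)
print(loss)
print(new_centers)
```

With all centers starting at zero, the center of class 0 moves to (0.8, 0.8), a step toward the two class-0 features at (1, 1) and (3, 3), and the center of class 1 moves to (0.6, 0.0).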
3.4 Adding softmax-loss and center-loss

def loss(labels, features):
    """
    Loss function passed to the model: softmax-loss plus the weighted center-loss.
    :param labels: Tensor, corresponds to y_true, one-hot encoded, shape (batch_size, NUM_CLASSES)
    :param features: Tensor, corresponds to y_pred, the output of the model's last fc layer (not a softmax layer), shape (batch_size, NUM_CLASSES)
    :return: softmax-loss plus LAMBDA times center-loss
    """
    labels = K.cast(labels, dtype=tf.float32)
    # Compute the softmax-loss
    sf_loss = softmax_loss(labels, features)
    # Compute the center-loss; labels are one-hot encoded, so argmax recovers the integer class indices
    c_loss = center_loss(K.argmax(labels, axis=-1), features)
    return sf_loss + LAMBDA * c_loss
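
One detail of this sum is worth checking: softmax_loss returns one value per sample, while tf.nn.l2_loss returns a single scalar summed over the batch, so the addition relies on broadcasting. The NumPy sketch below illustrates this with made-up numbers; `sf_loss` and `c_loss` here are hypothetical values, not outputs of the functions above.

```python
import numpy as np

LAMBDA = 0.0005  # same value as the constant in cl.py

# Hypothetical per-sample softmax losses, shape (3,)
sf_loss = np.array([0.2, 1.1, 0.7])
# Hypothetical scalar center-loss for the whole mini-batch
c_loss = 12.0

# The scalar term is broadcast, so every sample receives the same offset
total = sf_loss + LAMBDA * c_loss
print(total.shape)
print(total.mean() - sf_loss.mean())
```

Assuming the per-sample values are averaged afterwards, the batch mean grows by exactly LAMBDA * c_loss. Because c_loss is a sum over the batch, the effective weight of the center term depends on the batch size, which may be one reason LAMBDA needs retuning when the batch size changes.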

LAMBDA is a parameter that needs careful tuning; for my model, different values had a large effect on training. The following code tests the new softmax-loss with center-loss:

if __name__ == '__main__':
    # Randomly generate test labels and features
    test_features = np.random.randn(32, NUM_CLASSES).astype(dtype=np.float32)
    test_labels = np.array(random.sample(range(0, NUM_CLASSES - 1), 32))
    test_labels[0] = 0
    # One-hot encode the labels
    test_labels = keras.utils.to_categorical(test_labels, NUM_CLASSES)

    print(test_features.shape, test_labels.shape)

    # Convert to tensors
    test_features = tf.constant(test_features)
    test_labels = tf.constant(test_labels)
    # Build the loss op
    loss_op = loss(test_labels, test_features)
    with tf.Session() as sess:
        # Initialize the variables
        sess.run(tf.global_variables_initializer())
        # Compute the loss
        result = sess.run(loss_op)
        print(result.shape)
        print(result)
        # Fetch the centers to check that they were updated
        centers = sess.graph.get_tensor_by_name('centers:0')
        print(centers.eval().shape)
        print(centers.eval())

4. Using the new loss function in a model
Before switching to the new loss function, the model's last layer is a softmax layer, and part of the code that compiles and trains the model looks like this:

    # Training data: train_dataset is a tf.data.Dataset object, train_steps is the number of steps per epoch
    train_dataset, train_steps = ...
    # Validation data
    val_dataset, val_steps = ...
    # Build the model
    model = ...
    # Compile the model
    model.compile(optimizer=keras.optimizers.Adam(lr=0.0001, decay=1e-5),
                  loss=keras.losses.categorical_crossentropy, # loss function
                  metrics=[keras.metrics.categorical_accuracy]) # metrics
    # Train the model
    model.fit(train_dataset, epochs=10, steps_per_epoch=train_steps,
              validation_data=val_dataset, validation_steps=val_steps)

With center-loss, the code that compiles and trains the model becomes:

    import cl
    
    # Training data: train_dataset is a tf.data.Dataset object, train_steps is the number of steps per epoch
    train_dataset, train_steps = ...
    # Validation data
    val_dataset, val_steps = ...
    # Build the model
    model = ...
    # Compile the model
    model.compile(optimizer=keras.optimizers.Adam(lr=0.0001, decay=1e-5),
                  loss=cl.loss, # loss function
                  metrics=[cl.categorical_accuracy]) # metrics
    # Initialize the variables, including the centers defined in center-loss
    sess = K.get_session()
    sess.run(tf.global_variables_initializer())
    # Train the model
    model.fit(train_dataset, epochs=10, steps_per_epoch=train_steps,
              validation_data=val_dataset, validation_steps=val_steps)

Note in particular that after compiling the model (compile) and before training (fit) or evaluating (evaluate), you must run tf.global_variables_initializer to initialize the centers variable defined inside the center-loss function. Otherwise you get an error about attempting to use an uninitialized variable. Note also that the metric function in the code above, cl.categorical_accuracy, is reimplemented in cl.py as well:

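def categorical_accuracy(y_true, y_pred):
    """
    Reimplementation of categorical_accuracy that applies softmax to the model output first.
    :param y_true: same as labels
    :param y_pred: same as features
    :return: the accuracy
    """
    # Apply softmax to y_pred
    sm_y_pred = K.softmax(y_pred, axis=-1)
    # Return the accuracy: 1.0 where the predicted class matches the label, 0.0 otherwise
    return K.cast(K.equal(K.argmax(y_true, axis=-1), K.argmax(sm_y_pred, axis=-1)), K.floatx())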

Compared with the original keras.metrics.categorical_accuracy, this version passes y_pred through softmax, because the model's final softmax layer has been removed.
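
As a side observation, softmax preserves the ordering of values within a row, so the arg-max class is the same before and after it. The NumPy sketch below checks this on made-up logits; `softmax` and `categorical_accuracy_np` are helper names introduced here for illustration only.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def categorical_accuracy_np(y_true, y_pred):
    """Per-sample accuracy: 1.0 where the arg-max classes agree, else 0.0."""
    matches = np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)
    return matches.astype(np.float32)


# Raw fc outputs for three samples and their one-hot labels (made-up values)
logits = np.array([[2.0, -1.0, 0.5], [0.1, 0.3, 0.2], [-3.0, -2.0, -4.0]])
y_true = np.array([[1, 0, 0], [0, 0, 1], [0, 1, 0]])

acc_from_logits = categorical_accuracy_np(y_true, logits)
acc_from_probs = categorical_accuracy_np(y_true, softmax(logits))
print(acc_from_logits)
print(acc_from_probs)
```

Both calls give the same per-sample result, which suggests the extra softmax in categorical_accuracy is harmless but may not be strictly required for the accuracy metric; it does matter for the loss, where the actual probabilities are used.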
5. Complete code
For convenience, here is the complete code.

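# coding=utf-8
# Uses Python 3 and the TensorFlow 1.x graph-mode API (tf.get_variable, tf.Session)
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow import keras
from tensorflow.python.keras import backend as K
import tensorflow as tf
import numpy as np
import random

# Number of classes, i.e. the number of output units of the last fc layer
NUM_CLASSES = 240
# Learning rate used to update the centers
ALPHA = 0.6
# Weight of the center-loss term
LAMBDA = 0.0005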


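def center_loss(labels, features, alpha=ALPHA, num_classes=NUM_CLASSES):
    """
    Compute the center loss and update the centers.
    :param labels: Tensor, the sample labels, NOT one-hot encoded, shape (batch_size,).
    :param features: Tensor, the sample features, here the output of the last fc layer, shape (batch_size, num_classes).
    :param alpha: a number between 0 and 1 that controls the learning rate of the class centers; see the paper for details.
    :param num_classes: integer, the total number of classes, i.e. the number of output units of the classifier.
    :return: Tensor, the center-loss as a scalar (tf.nn.l2_loss sums over the whole batch)
    """
    # Get the feature dimension, e.g. 256
    len_features = features.get_shape()[1]
    # Create a Variable of shape [num_classes, len_features] that stores the centers of all classes.
    # trainable=False because the centers are not updated through the optimizer's gradients.
    centers = tf.get_variable('centers', [num_classes, len_features], dtype=tf.float32,
                              initializer=tf.constant_initializer(0), trainable=False)
    # Flatten labels to 1-D; this has no effect if labels is already 1-D
    labels = tf.reshape(labels, [-1])

    # Look up the class center of every sample in the mini-batch by its label
    centers_batch = tf.gather(centers, labels)

    # Difference between each sample's class center and its feature
    diff = centers_batch - features

    # Count how often each class appears in the mini-batch; see equation (4) of the paper
    unique_label, unique_idx, unique_count = tf.unique_with_counts(labels)
    appear_times = tf.gather(unique_count, unique_idx)
    appear_times = tf.reshape(appear_times, [-1, 1])

    diff = diff / tf.cast((1 + appear_times), tf.float32)
    diff = alpha * diff

    # Op that updates the centers
    centers_update_op = tf.scatter_sub(centers, labels, diff)

    # Use tf.control_dependencies so the centers are updated whenever the loss is evaluated
    with tf.control_dependencies([centers_update_op]):
        # Compute the center-loss
        c_loss = tf.nn.l2_loss(features - centers_batch)

    return c_loss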


def softmax_loss(labels, features):
    """
    Compute the softmax-loss.
    :param labels: corresponds to y_true, one-hot encoded, shape (batch_size, NUM_CLASSES)
    :param features: corresponds to y_pred, the output of the model's last fc layer (not a softmax layer), shape (batch_size, NUM_CLASSES)
    :return: the multi-class softmax-loss, shape (batch_size,)
    """
    return K.categorical_crossentropy(labels, K.softmax(features, axis=-1))


def loss(labels, features):
    """
    Loss function passed to the model: softmax-loss plus the weighted center-loss.
    :param labels: Tensor, corresponds to y_true, one-hot encoded, shape (batch_size, NUM_CLASSES)
    :param features: Tensor, corresponds to y_pred, the output of the model's last fc layer (not a softmax layer), shape (batch_size, NUM_CLASSES)
    :return: softmax-loss plus LAMBDA times center-loss
    """
    labels = K.cast(labels, dtype=tf.float32)
    # Compute the softmax-loss
    sf_loss = softmax_loss(labels, features)
    # Compute the center-loss; labels are one-hot encoded, so argmax recovers the integer class indices
    c_loss = center_loss(K.argmax(labels, axis=-1), features)
    return sf_loss + LAMBDA * c_loss


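def categorical_accuracy(y_true, y_pred):
    """
    Reimplementation of categorical_accuracy that applies softmax to the model output first.
    :param y_true: same as labels
    :param y_pred: same as features
    :return: the accuracy
    """
    # Apply softmax to y_pred
    sm_y_pred = K.softmax(y_pred, axis=-1)
    # Return the accuracy: 1.0 where the predicted class matches the label, 0.0 otherwise
    return K.cast(K.equal(K.argmax(y_true, axis=-1), K.argmax(sm_y_pred, axis=-1)), K.floatx())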


if __name__ == '__main__':
    # Randomly generate test labels and features
    test_features = np.random.randn(32, NUM_CLASSES).astype(dtype=np.float32)
    test_labels = np.array(random.sample(range(0, NUM_CLASSES - 1), 32))
    test_labels[0] = 0
    # One-hot encode the labels
    test_labels = keras.utils.to_categorical(test_labels, NUM_CLASSES)

    print(test_features.shape, test_labels.shape)

    # Convert to tensors
    test_features = tf.constant(test_features)
    test_labels = tf.constant(test_labels)
    # Build the loss op
    loss_op = loss(test_labels, test_features)
    with tf.Session() as sess:
        # Initialize the variables
        sess.run(tf.global_variables_initializer())
        # Compute the loss
        result = sess.run(loss_op)
        print(result.shape)
        print(result)
        # Fetch the centers to check that they were updated
        updated_centers = sess.graph.get_tensor_by_name('centers:0')
        print(updated_centers.eval().shape)
        print(updated_centers.eval())


This text may be freely shared or copied, but please keep the URL of this document as the reference URL.

Licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
