Forcing Deep Photo Style Transfer to Run in a Low-Budget Environment

Tags: CUDA, Torch, Lua, DeepLearning

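In late March 2017, a new paper on style transfer appeared out of nowhere and drew attention for the quality of its results.

GIGAZINE: "Deep Photo Style Transfer" uses deep learning to transfer the "visual characteristics of a photo" onto another photo

Replication environment published on GitHub

Paper

I definitely wanted to try this myself. But a big problem stood in my way.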

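Problem: the replication environment is expensive to obtain

In short, the official replication environment costs too much to put together (I do not have Matlab, and my GPU is a GeForce 1050Ti with 4GB).

It does not work properly with Octave

The original code assumes Matlab (it appears not to have been tested with Octave), so if you follow the official steps, it dies right as the Lua script starts running. Matlab is expensive.

Example of the Lua script crashing after preprocessing with Octave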

gpu, idx =  0   1   
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
loading matting laplacian...    gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat 
File could not be opened: gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat   
/home/kuni/work/misc/torch/install/bin/luajit: deepmatting_seg.lua.org:119: attempt to index a nil value
stack traceback:
    deepmatting_seg.lua.org:119: in function 'main'
    deepmatting_seg.lua.org:606: in main chunk
    [C]: in function 'dofile'
    ...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50

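The model does not fit in GPU memory

On top of that, this model does not fit on a 4GB GPU, so it has to be run on an expensive GPU. GPUs are expensive. Torch also has no simple way to switch between CPU and GPU (there is no built-in mechanism for it, so it is left to each implementer), and the authors of this paper seem to have skipped structuring their code for that switch.

Example of the out-of-memory error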

gpu, idx =  0   1   
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
loading matting laplacian...    gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat 
Exp serial: examples/final_results  
Setting up style layer      2   :   relu1_1 
Setting up style layer      7   :   relu2_1 
Setting up style layer      12  :   relu3_1 
THCudaCheck FAIL file=/home/kuni/work/misc/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/kuni/work/misc/torch/install/bin/luajit: ...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 11 module of nn.Sequential:
...e/kuni/work/misc/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /home/kuni/work/misc/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
    [C]: in function 'v'
    ...e/kuni/work/misc/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
    ...sc/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...sc/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
    [C]: in function 'xpcall'
    ...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    .../work/misc/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    deepmatting_seg.lua.org:162: in function 'main'
    deepmatting_seg.lua.org:606: in main chunk
    [C]: in function 'dofile'
    ...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    ...i/work/misc/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    .../work/misc/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    deepmatting_seg.lua.org:162: in function 'main'
    deepmatting_seg.lua.org:606: in main chunk
    [C]: in function 'dofile'
    ...misc/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
    [C]: at 0x00405d50
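
As a rough sanity check on the numbers in these logs, the weight shapes printed while the caffemodel loads can be multiplied out. The sketch below is in Python. The function name and the assumption that every weight is a 4-byte float32 are mine, not taken from the original code, and biases are ignored because the log does not print them.

```python
import re
from math import prod

# Shape lines copied from the log above ("name: d1 d2 d3 d4").
LOG = """\
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
"""

def count_weights(log_text):
    """Sum the product of the dimensions on every 'name: d1 d2 ...' line."""
    total = 0
    for line in log_text.splitlines():
        m = re.match(r"^(\w+):((?:\s+\d+)+)\s*$", line)
        if m:
            total += prod(int(d) for d in m.group(2).split())
    return total

weights = count_weights(LOG)
bytes_fp32 = weights * 4  # assumption: float32, 4 bytes per weight
print(weights, "weights,", bytes_fp32, "bytes")
```

Under these assumptions the weights come to a little less than the 574671192 bytes that libprotobuf reports reading, which is roughly 0.57 GB and well under 4GB. That suggests, though the log alone does not prove it, that most of the memory pressure comes from something other than the weights, such as intermediate activations during the forward pass.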

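So, to make it run in a low-budget environment, I made some forcible modifications that resolve these two problems.

Repositories

Original

Low-budget version

As a prerequisite, I assume that working Octave, Lua, and Torch environments are already set up (my environment is Ubuntu 16.04 Xenial Xerus).

Overview of changes

Use Octave instead of Matlab (diff)

Load the network into main memory and compute on the CPU (with some parts left on the GPU) (diff)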

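1. Use Octave instead of Matlab

Implementation diff

The Laplacian of the input image has to be computed as a preprocessing step in Matlab or Octave. Doing it in Octave required two fixes:

Run pkg load image so that im2double can be used

The generated file ends up in Octave's default format (not Matlab-compatible), so force the format with -mat-binary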
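
The "File could not be opened" failure in the earlier log is consistent with the .mat file having been written in Octave's default text format. One quick way to tell the two formats apart is to look at the first bytes of the file. This is a minimal sketch in Python, assuming that Matlab-style binary files begin with the text "MATLAB 5.0 MAT-file" and that Octave text files begin with "# Created by Octave". The function name is only illustrative.

```python
MATLAB_MAGIC = b"MATLAB 5.0 MAT-file"   # assumed start of a Matlab binary header
OCTAVE_MAGIC = b"# Created by Octave"   # assumed start of Octave's text format

def sniff_mat_format(path):
    """Guess how a .mat file was saved by inspecting its first bytes."""
    with open(path, "rb") as f:
        head = f.read(32)
    if head.startswith(MATLAB_MAGIC):
        return "matlab-binary"
    if head.startswith(OCTAVE_MAGIC):
        return "octave-text"
    return "unknown"
```

Calling it on a file such as gen_laplacian/Input_Laplacian_3x3_1e-7_CSR1.mat before launching the Lua script should report "matlab-binary" once the save call uses -mat-binary. A result of "octave-text" would mean the format option was not applied.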

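2. Load the network into main memory and compute on the CPU (with some parts left on the GPU)

Implementation diff

Every place that uses CudaTensor is changed to use FloatTensor instead. The one exception is cuda_utils: its dependencies run deep, so I left it running on the GPU as implemented.
The key to the modification is to swap the types consistently, without contradictions (mechanical work).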
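
Because the replacement is mechanical, part of it could in principle be scripted. The following Python sketch rewrites Lua source text under two assumed substitution rules: torch.CudaTensor becomes torch.FloatTensor, and :cuda() becomes :float(). These rules and the function name are illustrative assumptions. The actual diff may cover more cases and was not necessarily produced this way.

```python
import re

# Assumed substitution rules; not taken from the actual diff.
RULES = [
    (re.compile(r"\btorch\.CudaTensor\b"), "torch.FloatTensor"),
    (re.compile(r":cuda\(\)"), ":float()"),
]

def cuda_to_float(lua_source):
    """Return (rewritten source, number of replacements made)."""
    count = 0
    for pattern, replacement in RULES:
        lua_source, n = pattern.subn(replacement, lua_source)
        count += n
    return lua_source, count

sample = "local x = torch.CudaTensor(3, 4):zero()\nlocal net = model:cuda()\n"
rewritten, n = cuda_to_float(sample)
print(rewritten)
print(n, "replacements")
```

A script like this only handles the textual part. The code that calls into cuda_utils stays on the GPU, so tensors crossing that boundary still need to be converted by hand in both directions. That is where the types are most likely to end up inconsistent.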

Results

Input image (the image to be transformed)

Input image (style image)

Output

I added a photo I took myself to the end of the numbered examples as number 61 and ran it. The style transfer came out splendidly.

iteration : 100

iteration : 500

iteration : 1000

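Summary

I was able to replicate a beautiful style transfer on a low budget.
Buying a GPU, or renting a GPU cluster in the cloud, is still expensive for ordinary people. When you just want to try something out, you would rather run it on the CPU and pay in time instead of money. I would be glad if Torch gained a mechanism (or convention) for casually switching between CPU and GPU, as Chainer and TensorFlow have.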

Reference

Source article (Forcing Deep Photo Style Transfer to Run in a Low-Budget Environment): https://qiita.com/colspan/items/ee7c53ff0f7ced9afdbb
