Generative Adversarial Networks: An Overview (Notes)

Generative Adversarial Networks: An Overview
Authors: Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A. Bharath
GANs were first proposed by Ian Goodfellow; as deep learning has advanced, they have gradually become a hot research topic. They play a major role in image synthesis, semantic image editing, style transfer, image super-resolution, classification, and more.
Abstract
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this through deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
Index Terms—neural networks, unsupervised learning, semi-supervised learning.
Introduction
Generative adversarial networks (GANs) are an emerging technique for both semi-supervised and unsupervised learning.
A GAN comprises a generator and a discriminator. A popular informal explanation describes the generator as a counterfeiter producing fake bank notes, and the discriminator as the police trying to detect them. As the police become better and better at spotting fakes, the generator's ability to imitate keeps improving, until it can produce counterfeit notes that are (almost) indistinguishable from real ones.
Preliminaries
A. Terminology
Generative models learn to capture the statistical distribution of training data, allowing us to synthesize samples from the learned distribution. In all cases, the network weights are learned through backpropagation.
B. Notation
  • z denotes a vector drawn from the latent space;
  • pdata(x) denotes the probability density function of the real data;
  • pg(x) denotes the distribution of the vectors produced by the generator network of the GAN;
  • G and D denote the generator and discriminator networks, respectively;
  • JG(ΘG; ΘD) and JD(ΘD; ΘG) refer to the objective functions of the generator and discriminator, respectively.
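In the original formulation, these objectives define a two-player minimax game. Using the notation above (and writing pz(z) for the prior over latent vectors, a symbol not listed explicitly in these notes), the value function can be written as:

\min_{G}\,\max_{D}\; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]

JD is then obtained by maximizing V with respect to ΘD, and JG by minimizing it with respect to ΘG.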

  C. Capturing Data Distributions
    GANs learn by implicitly computing a measure of similarity between the distribution of a candidate model and the distribution of the real data.
    D. Related Work
  • Fourier-based and wavelet representations
  • Principal Components Analysis (PCA)
  • Independent Components Analysis (ICA)
  • Noise-contrastive estimation (NCE)
  • GANs

    GAN architectures
    A. Fully Connected GANs
    The first GAN architectures used fully connected neural networks for both the generator and discriminator.
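As a rough illustration only (the layer sizes below are assumptions, not taken from the paper), a fully connected generator and discriminator for flattened 28x28 images might look like the following PyTorch sketch:

import torch.nn as nn

latent_dim, data_dim = 100, 28 * 28  # assumed sizes for illustration

# Generator: maps a latent vector z to a flattened image in [-1, 1].
generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),
)

# Discriminator: maps a flattened image to the probability that it is real.
discriminator = nn.Sequential(
    nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)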
    B. Convolutional GANs
  • The Laplacian pyramid of adversarial networks (LAPGAN) decomposes the generation process across multiple scales.
  • DCGAN (for “deep convolutional GAN”) allows training a pair of deep convolutional generator and discriminator networks; a rough generator sketch follows this list.
  • GANs have also been used to synthesize 3D data samples using volumetric convolutions.
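A minimal sketch of a DCGAN-style generator, assuming a 100-dimensional latent vector shaped as a 1x1 feature map and upsampled to 64x64 RGB images (the channel counts are illustrative, not taken from the paper):

import torch.nn as nn

# DCGAN-style generator: strided transposed convolutions with batch norm and
# ReLU activations, and a Tanh output layer.
dcgan_generator = nn.Sequential(
    nn.ConvTranspose2d(100, 512, 4, 1, 0), nn.BatchNorm2d(512), nn.ReLU(),  # 1x1 -> 4x4
    nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.BatchNorm2d(256), nn.ReLU(),  # 4x4 -> 8x8
    nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),    # 16x16 -> 32x32
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),                          # 32x32 -> 64x64
)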

  C. Conditional GANs
    A parallel can be drawn between conditional GANs and InfoGAN, which decomposes the noise source into an incompressible source and a “latent code”, attempting to discover latent factors of variation by maximizing the mutual information between the latent code and the generator’s output.
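A common way to condition both networks, following the conditional GAN of Mirza and Osindero, is to concatenate a one-hot label vector to the generator's noise input and to the discriminator's data input. A sketch with all sizes chosen purely for illustration:

import torch
import torch.nn as nn

latent_dim, num_classes, data_dim = 100, 10, 28 * 28  # assumed sizes

cond_generator = nn.Sequential(
    nn.Linear(latent_dim + num_classes, 256), nn.ReLU(),
    nn.Linear(256, data_dim), nn.Tanh(),
)
cond_discriminator = nn.Sequential(
    nn.Linear(data_dim + num_classes, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

z = torch.randn(32, latent_dim)
y = torch.eye(num_classes)[torch.randint(0, num_classes, (32,))]  # one-hot class labels
fake = cond_generator(torch.cat([z, y], dim=1))           # generator sees noise + label
score = cond_discriminator(torch.cat([fake, y], dim=1))   # discriminator sees data + label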
    D. GANs with Inference Models
    In this formulation, the generator consists of two networks: the “encoder” (inference network) and the “decoder”. They are jointly trained to fool the discriminator.
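In models of this kind, such as ALI and BiGAN, the discriminator operates on joint pairs of data and latent code: either (x, E(x)) from the encoder path or (G(z), z) from the decoder path. A minimal sketch of that arrangement, with all architectures and sizes assumed for illustration:

import torch
import torch.nn as nn

data_dim, latent_dim = 28 * 28, 64  # assumed sizes

enc = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())

# The discriminator receives a (data, code) pair and guesses which path produced it.
joint_disc = nn.Sequential(
    nn.Linear(data_dim + latent_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

x = torch.randn(32, data_dim)                 # stand-in batch of real data
z = torch.randn(32, latent_dim)               # latent vectors from the prior
enc_pair = torch.cat([x, enc(x)], dim=1)      # encoder path: (x, E(x))
dec_pair = torch.cat([dec(z), z], dim=1)      # decoder path: (G(z), z)
d_enc, d_dec = joint_disc(enc_pair), joint_disc(dec_pair)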
    E. Adversarial Autoencoders (AAE)
    Autoencoders are networks, composed of an “encoder” and “decoder”, that learn to map data to an internal latent representation and out again. That is, they learn a deterministic mapping (via the encoder) from a data space into a latent or representation space, and a mapping (via the decoder) from the latent space back to data space.
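In the adversarial autoencoder, the adversarial game is played over the latent codes rather than over the data: a discriminator tries to tell encoder outputs apart from samples drawn from a chosen prior, while the encoder-decoder pair is also trained with an ordinary reconstruction loss. A minimal sketch, with sizes and architectures assumed for illustration:

import torch
import torch.nn as nn

data_dim, code_dim = 28 * 28, 8  # assumed sizes

encoder = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, code_dim))
decoder = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))

# This discriminator sees latent codes, not images.
code_disc = nn.Sequential(nn.Linear(code_dim, 64), nn.LeakyReLU(0.2),
                          nn.Linear(64, 1), nn.Sigmoid())

x = torch.randn(32, data_dim)           # stand-in batch of data
fake_code = encoder(x)                  # "fake" sample for the code discriminator
real_code = torch.randn(32, code_dim)   # sample from a Gaussian prior over codes
recon_loss = ((decoder(fake_code) - x) ** 2).mean()  # ordinary reconstruction term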
    Training GANs
    A. Introduction
    Training of GANs involves both finding the parameters of a discriminator that maximize its classification accuracy, and finding the parameters of a generator which maximally confuse the discriminator.
    One approach to improving GAN training is to assess the empirical “symptoms” that might be experienced during training. These symptoms include:
  • Difficulties in getting the pair of models to converge;
  • The generative model “collapsing”, i.e. generating very similar samples for different inputs;
  • The discriminator loss converging quickly to zero, providing no reliable path for gradient updates to the generator.
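Concretely, training alternates gradient steps on the two objectives. The following bare-bones loop is only a sketch: it assumes the fully connected generator and discriminator defined earlier, a hypothetical data_loader yielding batches of flattened real images, and the standard binary cross-entropy losses:

import torch
import torch.nn as nn

bce = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)

for real in data_loader:                       # real: (batch, data_dim) tensor
    b = real.size(0)
    fake = generator(torch.randn(b, latent_dim))

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    loss_d = bce(discriminator(real), torch.ones(b, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step (non-saturating form): push D(G(z)) toward 1.
    loss_g = bce(discriminator(fake), torch.ones(b, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()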
    B. Training Tricks
  • One of the first major improvements in the training of GANs for generating images was the DCGAN architecture.
  • Further heuristic approaches have been proposed for stabilizing GAN training: the first is feature matching; the second, mini-batch discrimination; the third, heuristic averaging; the fourth, virtual batch normalization.
  • Finally, one-sided label smoothing and adding noise to the samples; a short label-smoothing sketch follows this list.
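Of these, one-sided label smoothing is especially simple to apply: the discriminator's targets for real samples are lowered from 1 to, say, 0.9, while the targets for generated samples stay at 0. A sketch against the training loop above (0.9 is the commonly used value, not a requirement):

# One-sided label smoothing inside the discriminator step.
loss_d = bce(discriminator(real), torch.full((b, 1), 0.9)) + \
         bce(discriminator(fake.detach()), torch.zeros(b, 1))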

  C. Alternative Formulations
  • Generalisations of the GAN cost function;
  • Alternative cost functions to prevent vanishing gradients (see the formulation below).
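The best-known example of the latter is the non-saturating generator loss from the original GAN paper: instead of minimizing the saturating objective, the generator maximizes the log-probability of fooling the discriminator,

J_G^{\mathrm{sat}} = \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
\qquad\rightarrow\qquad
J_G^{\mathrm{non\text{-}sat}} = -\,\mathbb{E}_{z \sim p_z(z)}\big[\log D(G(z))\big]

Early in training, when D(G(z)) is close to zero, the saturating form provides almost no gradient to the generator, whereas the non-saturating form still gives a strong learning signal.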

  D. A Brief Comparison of GAN Variants
    Applications of GANs
  • A. Classification and Regression
  • B. Image Synthesis
  • C. Image-to-image translation
  • D. Super-resolution

  Discussion
    A. Open Questions
  • Mode Collapse;
  • Training instability – saddle points;
  • Evaluating Generative Models.

  B. Conclusions
    The explosion of interest in GANs is driven not only by their potential to learn deep, highly non-linear mappings from a latent space into a data space and back, but also by their potential to make use of the vast quantities of unlabelled image data that remain closed to deep representation learning.
    류리 2017-10-24
