Generative Adversarial Networks: An Overview (Notes)
6,474 words · GAN
Authors: Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, Anil A. Bharath
GANs were first proposed by Ian Goodfellow and, as deep learning has advanced, have gradually become a hot research topic. They play a major role in image synthesis, semantic image editing, style transfer, image super-resolution, classification, and more.
Abstract
Generative adversarial networks (GANs) provide a way to learn deep representations without extensively annotated training data. They achieve this by deriving backpropagation signals through a competitive process involving a pair of networks. The representations that can be learned by GANs may be used in a variety of applications, including image synthesis, semantic image editing, style transfer, image super-resolution and classification. The aim of this review paper is to provide an overview of GANs for the signal processing community, drawing on familiar analogies and concepts where possible. In addition to identifying different methods for training and constructing GANs, we also point to remaining challenges in their theory and application.
Index Terms: neural networks, unsupervised learning, semi-supervised learning.
Introduction
Generative adversarial networks (GANs) are an emerging technique for both semi-supervised and unsupervised learning.
A GAN consists of a generator and a discriminator. A popular informal analogy casts the generator as a counterfeiter producing fake money, and the discriminator as a police officer trying to detect it. As the police officer's ability to spot fakes improves, the counterfeiter's skill grows in turn, until the generator can produce fake money that is nearly indistinguishable from the real thing.
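This adversarial game is formalized by the minimax objective from Goodfellow et al.'s original paper: the discriminator D is trained to maximize the value function below, while the generator G is trained to minimize it:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```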
Preliminaries
A. Terminology
Generative models learn to capture the statistical distribution of training data, allowing us to synthesize samples from the learned distribution. In all cases, the network weights are learned through backpropagation.
B. Notation
C. Capturing Data Distributions
GANs learn through implicitly computing some sort of similarity between the distribution of a candidate model and the distribution corresponding to real data.
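This implicit similarity can be made precise. For a fixed generator, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)), and substituting D* back into the original minimax value function shows that the generator is, up to an additive constant, minimizing the Jensen-Shannon divergence between the data distribution and the model distribution:

```latex
C(G) = \max_D V(D, G) = -\log 4 + 2 \cdot \mathrm{JSD}\!\left(p_{\mathrm{data}} \,\middle\|\, p_g\right)
```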
D. Related Work
--> Fourier-based and wavelet representations
--> Principal Components Analysis (PCA)
--> Independent Components Analysis (ICA)
--> Noise-Contrastive Estimation (NCE)
--> GANs
GAN architectures
A. Fully Connected GANs
The first GAN architectures used fully connected neural networks for both the generator and discriminator.
B. Convolutional GANs
C. Conditional GANs
A parallel can be drawn between conditional GANs and InfoGAN, which decomposes the noise source into an incompressible source and a “latent code”, attempting to discover latent factors of variation by maximizing the mutual information between the latent code and the generator’s output.
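In the InfoGAN paper this idea is expressed as a regularized minimax game: an auxiliary network Q is trained alongside the discriminator to provide a variational lower bound L_I(G, Q) on the mutual information between the latent code c and the output G(z, c), weighted by a hyperparameter λ:

```latex
\min_{G, Q} \max_D V_{\mathrm{InfoGAN}}(D, G, Q) = V(D, G) - \lambda \, L_I(G, Q)
```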
D. GANs with Inference Models
In this formulation, the generator consists of two networks: the “encoder” (inference network) and the “decoder”. They are jointly trained to fool the discriminator.
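Concretely, in models of this kind (e.g. ALI/BiGAN), the discriminator operates on joint pairs rather than on samples alone: it must distinguish pairs (x, E(x)), obtained by running real data through the encoder, from pairs (G(z), z), obtained by running prior samples through the decoder:

```latex
\min_{G, E} \max_D \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x, E(x))\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z), z)\big)\big]
```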
E. Adversarial Autoencoders (AAE)
Autoencoders are networks, composed of an “encoder” and “decoder”, that learn to map data to an internal latent representation and out again. That is, they learn a deterministic mapping (via the encoder) from a data space into a latent or representation space, and a mapping (via the decoder) from the latent space back to data space.
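As a minimal sketch of this encode-decode structure (an illustration, not code from the paper): a linear autoencoder with a 1-D latent space, trained by hand-written gradient descent on synthetic 2-D data that is approximately rank one.

```python
import numpy as np

# Synthetic data: points near the line spanned by [2, 1], plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 1)) @ np.array([[2.0, 1.0]])  # shape (500, 2)
X += 0.05 * rng.normal(size=X.shape)

We = rng.normal(scale=0.1, size=(1, 2))  # encoder: x -> h = We @ x
Wd = rng.normal(scale=0.1, size=(2, 1))  # decoder: h -> x_hat = Wd @ h
lr = 0.01

def recon_error(We, Wd):
    """Mean squared reconstruction error over the dataset."""
    return float(np.mean((X - X @ We.T @ Wd.T) ** 2))

err0 = recon_error(We, Wd)
for _ in range(500):
    H = X @ We.T              # latent codes, shape (500, 1)
    R = H @ Wd.T - X          # reconstruction residuals, shape (500, 2)
    # Hand-derived gradients of the mean squared error
    gWd = 2 * R.T @ H / len(X)
    gWe = 2 * (R @ Wd).T @ X / len(X)
    Wd -= lr * gWd
    We -= lr * gWe
err1 = recon_error(We, Wd)
print(err0, err1)  # reconstruction error drops as training proceeds
```

Because both maps are linear, this autoencoder can at best recover the top principal subspace of the data; the nonlinear encoders and decoders used in AAEs generalize the same encode-decode idea.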
Training GANs
A. Introduction
Training of GANs involves both finding the parameters of a discriminator that maximize its classification accuracy, and finding the parameters of a generator which maximally confuse the discriminator.
One approach to improving GAN training is to assess the empirical “symptoms” that might be experienced during training. These symptoms include:
- Difficulties in getting the pair of models to converge;
- The generative model “collapsing”, generating very similar samples for different inputs;
- The discriminator loss converging quickly to zero, providing no reliable path for gradient updates to the generator.
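The alternating optimization itself can be sketched in a toy setting (illustrative only; the distributions and hyperparameters here are arbitrary choices, not the paper's): a 1-D GAN with an affine generator and a logistic-regression discriminator, trained by alternating stochastic gradient steps with hand-derived gradients and the non-saturating generator loss.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(1)
a, b = 1.0, 0.0        # generator: g(z) = a*z + b, with z ~ N(0, 1)
w, c = 0.1, 0.0        # discriminator: d(x) = sigmoid(w*x + c)
lr, batch = 0.05, 64

for step in range(2000):
    x_real = rng.normal(4.0, 0.5, batch)   # "data": samples from N(4, 0.5)
    z = rng.normal(size=batch)
    x_fake = a * z + b

    # Discriminator step: ascend log d(real) + log(1 - d(fake))
    d_real = sigmoid(w * x_real + c)
    d_fake = sigmoid(w * x_fake + c)
    gw = np.mean((1 - d_real) * x_real) - np.mean(d_fake * x_fake)
    gc = np.mean(1 - d_real) - np.mean(d_fake)
    w += lr * gw
    c += lr * gc

    # Generator step: ascend log d(fake)  (non-saturating loss)
    d_fake = sigmoid(w * x_fake + c)
    gx = (1 - d_fake) * w        # d/dx of log d(x), per fake sample
    a += lr * np.mean(gx * z)    # chain rule through x_fake = a*z + b
    b += lr * np.mean(gx)

fakes = a * rng.normal(size=5000) + b
print(np.mean(fakes))  # the generated mean drifts toward the data mean, 4.0
```

Even in this toy problem the symptoms above can appear: if the discriminator is updated too aggressively, d(fake) saturates near zero and the generator's gradient vanishes, which is exactly why the non-saturating loss is used here.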
B. Training Tricks
C. Alternative formulations
D. A Brief Comparison of GAN Variants
Applications of GANs
Discussion
A. Open Questions
B. Conclusions
The explosion of interest in GANs is driven not only by their potential to learn deep, highly non-linear mappings from a latent space into a data space and back, but also by their potential to make use of the vast quantities of unlabelled image data that remain closed to deep representation learning.
류리 2017-10-24