A Generative Adversarial Network (GAN) is a generative model with implicit density estimation. It belongs to unsupervised learning and uses two neural networks, hence the terms “generative” and “network” in “generative adversarial network”.
2.1) Principle: Generator and Discriminator
The principle is a game between two players: a neural network called the generator and a neural network called the discriminator. The generator tries to fool the discriminator by generating images that look real. The discriminator tries to distinguish between real and fake images, hence the term “adversarial” in “generative adversarial networks” (see Figure 2).
At the bottom left of Figure 2, we can see that the generator takes as input a sample from a simple distribution, i.e. random noise. We can interpret the generator as an artist and the discriminator as an art critic (see Figure 3).
During training, the generator gradually gets better at creating real-looking images, while the discriminator gets better at telling real images from fake ones. The process reaches equilibrium when the discriminator can no longer distinguish between real and fake images (see Figure 4). So, if the discriminator is trained properly and the generator still produces lifelike images that fool it, then we have a good generative model: one that generates images similar to those of the training set.
After this training phase, only the generator is needed to sample new, realistic (fake) data; the discriminator is no longer needed. Note that the random noise ensures that the generator does not always produce the same image (even one that would fool the discriminator).
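To make the role of the noise concrete, here is a minimal sketch in plain Python. The `generator` function is a hypothetical stand-in for a trained network (its weights are made up for illustration), and `sample_noise` assumes a standard normal noise distribution. The point is only that the generator is a deterministic map, so all the variety in its outputs comes from the noise it receives.

```python
import math
import random


def sample_noise(rng, dim=2):
    """Draw a noise vector z from a standard normal distribution (assumed)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def generator(z):
    """Toy stand-in for a trained generator: a fixed, deterministic map of z.

    The weights below are hypothetical, not taken from any real model.
    """
    return [math.tanh(0.8 * z[0] - 0.3 * z[1]), math.tanh(0.5 * z[1] + 0.1)]


rng = random.Random(0)
z1, z2 = sample_noise(rng), sample_noise(rng)

# Two different noise vectors give two different generated samples.
sample_a, sample_b = generator(z1), generator(z2)

# Feeding the same noise vector again reproduces the same sample.
sample_a_again = generator(z1)
```

Without the noise input, such a generator would return the same output on every call.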
At the beginning of training (see Figure 4), the generator only produces random noise that does not resemble the training data.
2.2) Mathematically: 2-player minimax game
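The generator G and the discriminator D are trained jointly, in a two-player minimax game formulation. The minimax objective function is:
\min_{θ_g} \max_{θ_d} \left[ \mathbb{E}_{x ∼ p_{data}} \log D_{θ_d}(x) + \mathbb{E}_{z ∼ p(z)} \log\left(1 - D_{θ_d}(G_{θ_g}(z))\right) \right]    (1)
where θ_g are the parameters of G, θ_d are the parameters of D, and z is the random noise fed to the generator, drawn from a simple distribution p(z).
In the following, we simply write D_{θ_d} as D and G_{θ_g} as G.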
By definition, D outputs the probability that its input is a real image, in the interval [0, 1]:
• D(x) equals 1 (or is close to 1) if D believes that x is real data.
• D(x) equals 0 (or is close to 0) if D believes that x is fake data (e.g., generated data).
At equilibrium, D outputs 1/2 everywhere, because D can no longer tell fake data from real data.
Since x ∼ p_{data}, x is real data. Since G is the generator, G(z) is fake (generated) data. For example, x could be a real photograph of a cat and G(z) a generated fake image of a cat. D(x) is the output of the discriminator for the real input x, and D(G(z)) is the output of the discriminator for the fake data G(z).
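As a small illustration of how D(x) and D(G(z)) enter the objective, the sketch below estimates it from lists of discriminator outputs, assuming the standard form from [1]: the average of log D(x) over real samples plus the average of log(1 - D(G(z))) over generated samples. The function name `value` and all the numbers are hypothetical, chosen by hand.

```python
import math


def value(d_real, d_fake):
    """Estimate the objective from discriminator outputs.

    d_real: outputs D(x) on real samples, each strictly between 0 and 1
    d_fake: outputs D(G(z)) on generated samples, each strictly between 0 and 1
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical outputs: a discriminator that is confident and correct.
confident = value(d_real=[0.99, 0.98, 0.97], d_fake=[0.01, 0.02, 0.03])

# A discriminator that outputs 1/2 everywhere, as at equilibrium.
undecided = value(d_real=[0.5, 0.5, 0.5], d_fake=[0.5, 0.5, 0.5])

# A discriminator that is fooled by the generated samples.
fooled = value(d_real=[0.6, 0.5, 0.4], d_fake=[0.9, 0.95, 0.9])
```

The confident discriminator gets a value close to 0 (the maximum), the undecided one gets exactly -log 4 (about -1.386), and the fooled one scores lower still. This is the sense in which the discriminator wants the objective high and the generator wants it low.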
Following [1], the two-player minimax game of Formula (1) is written so that θ_g and θ_d evolve in such a way that the following points from subsection 2.1) hold:
• The discriminator D tries to distinguish real data x from fake data G(z).
More precisely, the discriminator D plays on θ_d (with θ_g fixed) to maximize the objective function, so that D(x) is close to 1 (x is real data) and D(G(z)) is close to 0 (generated data is detected as fake).
• The generator G tries to fool the discriminator D into believing that its generated data is real.
More precisely, the generator G plays on θ_g (with θ_d fixed) to minimize the objective function, so that D(G(z)) is close to 1 (generated fake data is detected as real by the discriminator).
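The two moves above can be sketched on a deliberately tiny 1-D example. Everything here is an assumption made for illustration: the discriminator is a logistic model D(u) = sigmoid(w*u + b) with θ_d = (w, b), the generator is a simple shift G(z) = z + c with θ_g = c, the gradients are written by hand, and the data, learning rate and initial values are arbitrary. Real GANs use deep networks and automatic differentiation.

```python
import math
import random


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def D(u, w, b):
    """Toy discriminator: probability that u is real."""
    return sigmoid(w * u + b)


def G(z, c):
    """Toy generator: shifts the noise z by c."""
    return z + c


def value(xs, zs, w, b, c):
    """Batch estimate of mean log D(x) + mean log(1 - D(G(z)))."""
    real = sum(math.log(D(x, w, b)) for x in xs) / len(xs)
    fake = sum(math.log(1.0 - D(G(z, c), w, b)) for z in zs) / len(zs)
    return real + fake


def d_step(xs, zs, w, b, c, lr):
    """Gradient ASCENT on (w, b) with c fixed: the discriminator maximizes."""
    gs = [G(z, c) for z in zs]
    grad_w = (sum((1.0 - D(x, w, b)) * x for x in xs) / len(xs)
              - sum(D(g, w, b) * g for g in gs) / len(gs))
    grad_b = (sum(1.0 - D(x, w, b) for x in xs) / len(xs)
              - sum(D(g, w, b) for g in gs) / len(gs))
    return w + lr * grad_w, b + lr * grad_b


def g_step(zs, w, b, c, lr):
    """Gradient DESCENT on c with (w, b) fixed: the generator minimizes."""
    grad_c = -w * sum(D(G(z, c), w, b) for z in zs) / len(zs)
    return c - lr * grad_c


rng = random.Random(0)
xs = [rng.gauss(4.0, 1.0) for _ in range(256)]  # "real" data, centered at 4
zs = [rng.gauss(0.0, 1.0) for _ in range(256)]  # noise fed to the generator

w, b, c = 0.5, 0.0, 0.0
v0 = value(xs, zs, w, b, c)
w1, b1 = d_step(xs, zs, w, b, c, lr=0.01)   # discriminator move
v1 = value(xs, zs, w1, b1, c)
c1 = g_step(zs, w1, b1, c, lr=0.01)         # generator move
v2 = value(xs, zs, w1, b1, c1)
```

On this fixed batch, the discriminator step raises the objective (v1 > v0) and the generator step lowers it again (v2 < v1) by shifting the generated samples toward the region that D considers real. Training alternates these two steps many times.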
Although this is unsupervised learning (the data is not labeled), the data generated by G is given the label 0 for fake (regardless of what the discriminator returns) and the real training data is given the label 1 for real, which allows us to define a loss function.
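One common way to turn these 0/1 labels into a loss is binary cross-entropy. The sketch below is an illustration under that assumption (the function names and the sample outputs are hypothetical): with label 1 on real data and label 0 on generated data, the discriminator's cross-entropy loss is the negative of the sample estimate of the objective, so minimizing this loss is the same as maximizing the objective.

```python
import math


def bce(p, label):
    """Binary cross-entropy for one predicted probability p and a 0/1 label."""
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Average BCE with label 1 on real samples and label 0 on generated ones."""
    real_loss = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_loss = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_loss + fake_loss


# Hypothetical discriminator outputs D(x) and D(G(z)).
d_real = [0.9, 0.8, 0.7]
d_fake = [0.2, 0.1, 0.3]

loss = discriminator_loss(d_real, d_fake)

# Sample estimate of the objective on the same outputs.
objective = (sum(math.log(p) for p in d_real) / len(d_real)
             + sum(math.log(1.0 - p) for p in d_fake) / len(d_fake))
```

Here `loss` and `-objective` agree up to floating-point rounding.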
Paper [1] proves that the minimax game has a global (and unique) optimum at p_g = p_{data}, where p_g is the generative distribution and p_{data} is the real data distribution. In practice, however, getting p_g to converge to p_{data} is not easy.
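For reference, the argument in [1] can be summarized in three steps. The notation D^*_G (optimal discriminator for a fixed G), C(G) (value of the objective at that optimal discriminator) and JSD (Jensen-Shannon divergence) follows that paper and is not used elsewhere in this section.

```latex
% Step 1: for a fixed generator G, the optimal discriminator is
D^*_G(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}

% Step 2: substituting D^*_G into the objective gives
C(G) = -\log 4 + 2 \, \mathrm{JSD}\left(p_{data} \,\|\, p_g\right)

% Step 3: JSD >= 0, with equality if and only if p_g = p_{data}, so
\min_G C(G) = -\log 4,
\qquad D^*_G(x) = \tfrac{1}{2} \ \text{ when } p_g = p_{data}
```

This is consistent with the equilibrium described earlier: when the generated distribution matches the data distribution, the best the discriminator can do is output 1/2 everywhere.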