# Adversarial Feature Learning
1 Introduction
- Bidirectional Generative Adversarial Networks (BiGANs) are an unsupervised feature learning framework that learns an inverse mapping, projecting data back to the latent input z.
- The BiGAN encoder may serve as a useful feature representation for related semantic tasks. The latent z can be thought of as a label for x that comes for free, without the need for supervision.
- BiGANs are a robust and highly generic approach to unsupervised feature learning, making no assumptions about the structure or type of data to which they are applied.
BiGANs train not only a generator G but also an encoder E, which maps data x to a latent representation z. The discriminator D judges joint pairs, (x, E(x)) versus (G(z), z), rather than data alone.
The training objective can be defined as a minimax objective: $\min_{G,E} \max_{D} V(D,E,G)$
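As a rough illustration, the value function V can be estimated by Monte Carlo from the discriminator's outputs on the two kinds of joint pairs. The sketch below assumes the usual GAN log-loss form, V = E[log D(x, E(x))] + E[log(1 - D(G(z), z))]. The function name `bigan_value` and the toy probabilities are hypothetical.

```python
import math

def bigan_value(d_enc, d_gen):
    """Monte Carlo estimate of V(D, E, G) under the assumed log-loss form.

    d_enc: discriminator outputs D(x, E(x)) on encoder pairs, each in (0, 1)
    d_gen: discriminator outputs D(G(z), z) on generator pairs, each in (0, 1)
    """
    enc_term = sum(math.log(d) for d in d_enc) / len(d_enc)
    gen_term = sum(math.log(1.0 - d) for d in d_gen) / len(d_gen)
    return enc_term + gen_term

# A discriminator that outputs 0.5 everywhere cannot tell the pairs apart.
v_confused = bigan_value([0.5, 0.5], [0.5, 0.5])
# A confident, correct discriminator pushes V up toward 0.
v_confident = bigan_value([0.99, 0.98], [0.01, 0.02])
```

D tries to raise this quantity, while E and G try to lower it.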
1. Optimal discriminator, generator & encoder
The optimal discriminator allows us to reformulate the objective and show that it reduces to the Jensen-Shannon divergence between the joint distributions $P_{EX}$ and $P_{GZ}$
Proposition 1 For any E and G, the optimal discriminator $D_{EG}^\ast = \arg\max_D V(D,E,G)$ is the Radon-Nikodym derivative $f_{EG} = \frac{dP_{EX}}{d(P_{EX}+P_{GZ})}$ of measure $P_{EX}$ with respect to measure $P_{EX}+P_{GZ}$
Proposition 2 The encoder and generator's objective for an optimal discriminator can be rewritten in terms of the Jensen-Shannon divergence between the measures $P_{EX}$ and $P_{GZ}$ as:
$C(E,G) = 2 D_{JS}(P_{EX} \,\Vert\, P_{GZ}) - \log 4$
When $P_{EX} = P_{GZ}$, the divergence vanishes and $C(E,G) = -\log 4$, which is its global minimum
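The two propositions can be checked numerically on small discrete distributions, where the Radon-Nikodym derivative is simply the pointwise ratio p / (p + q). This is a minimal sketch with made-up probability vectors `p` and `q` standing in for $P_{EX}$ and $P_{GZ}$, and it assumes the standard log-loss form of V.

```python
import math

def kl(p, q):
    """KL divergence between discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def value_at_optimal_d(p, q):
    """V evaluated at the optimal discriminator D* = p / (p + q)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi + qi == 0:
            continue  # outcome has no mass under either distribution
        d = pi / (pi + qi)
        if pi > 0:
            total += pi * math.log(d)
        if qi > 0:
            total += qi * math.log(1.0 - d)
    return total

p = [0.5, 0.3, 0.2]  # toy stand-in for P_EX
q = [0.2, 0.3, 0.5]  # toy stand-in for P_GZ

c_mismatch = value_at_optimal_d(p, q)  # equals 2 * JS(p, q) - log 4
c_match = value_at_optimal_d(p, p)     # equals -log 4, since JS(p, p) = 0
```

Because the Jensen-Shannon divergence is non-negative, `c_match` is the smallest value the objective can take.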
2. Optimal generator & encoder are inverses
In order to fool a perfect discriminator, the optimal generator and encoder invert one another almost everywhere
Theorem 2 If E and G are an optimal encoder and generator, then $E = G^{-1}$ almost everywhere, which means $G(E(x)) = x$ and $E(G(z)) = z$
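To make the inverse relationship concrete, here is a toy pair of scalar maps that invert one another exactly. These affine functions are hypothetical stand-ins chosen for illustration, not trained networks. The sketch only shows what the two round-trip conditions look like.

```python
def generator(z):
    """Toy invertible generator G(z) = 2z + 1 (hypothetical)."""
    return 2.0 * z + 1.0

def encoder(x):
    """Its exact inverse E(x) = (x - 1) / 2."""
    return (x - 1.0) / 2.0

zs = [-1.5, 0.0, 0.25, 3.0]
xs = [generator(z) for z in zs]

# Round trips in both directions: E(G(z)) should return z, G(E(x)) should return x.
max_latent_error = max(abs(encoder(generator(z)) - z) for z in zs)
max_data_error = max(abs(generator(encoder(x)) - x) for x in xs)
```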
3. Relationship to Autoencoders
The encoder and generator objective given an optimal discriminator $C(E,G)$ can be rewritten as an autoencoder loss function
An objective in which the real and generated labels Y are swapped provides a stronger gradient signal to G and E. For efficiency, the authors update all modules D, G, and E simultaneously at each iteration.
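One way to see why swapping labels could help is to compare gradients on a single generator pair. The sketch assumes D ends in a sigmoid applied to a logit `a`, which the text above does not state. It looks at the case where D confidently rejects the pair. The function names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_minimax(a):
    """d/da of log(1 - sigmoid(a)): the generator-pair term of the original objective."""
    return -sigmoid(a)

def grad_swapped(a):
    """d/da of -log(sigmoid(a)): the same pair after its label is swapped."""
    return -(1.0 - sigmoid(a))

a = -6.0  # logit of a pair that D confidently rejects
g_minimax = grad_minimax(a)  # tiny: the original term saturates
g_swapped = grad_swapped(a)  # close to -1: the signal stays strong
```

Both gradients point the same way (raise the logit), but the swapped-label version does not vanish when D is winning. The encoder-pair term behaves symmetrically.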
5. Generalized BiGAN
It is often useful to parametrize the output of the generator G and encoder E in a different, usually smaller, space rather than the original one. For example, generating high-resolution images remains difficult for current generative models, so the encoder may take higher-resolution input while the generator output and discriminator input remain low resolution.
# Efficient GAN-Based Anomaly Detection
In this work, the authors develop a GAN method based on the BiGAN structure, which learns an encoder jointly with the generator during training, and use it for anomaly detection
The authors apply this method to an image dataset (MNIST) and a network intrusion dataset (KDD99)
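A minimal sketch of how a learned encoder could be used for scoring at test time. It assumes a score that mixes reconstruction error with a discriminator-based term. The exact score is not given in these notes, so the form of `anomaly_score`, the weight `alpha=0.9`, and the toy stand-in models are all hypothetical.

```python
def l1_distance(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

def anomaly_score(x, encoder, generator, disc_term, alpha=0.9):
    """Hypothetical score: alpha * reconstruction error + (1 - alpha) * discriminator term."""
    z = encoder(x)        # one forward pass, no per-sample search for z
    x_rec = generator(z)  # reconstruction G(E(x))
    return alpha * l1_distance(x, x_rec) + (1.0 - alpha) * disc_term(x, z)

# Toy stand-ins (not trained models): the "generator" can only produce points
# on the line x2 = 2 * x1, which plays the role of normal data here.
def toy_encoder(x):
    return x[0]

def toy_generator(z):
    return [z, 2.0 * z]

def toy_disc_term(x, z):
    return 0.0  # placeholder so the example isolates the reconstruction term

normal_score = anomaly_score([1.0, 2.0], toy_encoder, toy_generator, toy_disc_term)
anomalous_score = anomaly_score([1.0, 5.0], toy_encoder, toy_generator, toy_disc_term)
```

Points the generator cannot reproduce receive a higher score. Because z comes from a single encoder pass, scoring needs no iterative optimization per test sample.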
MNIST: Ten different datasets are generated from MNIST by successively making each digit class an anomaly and treating the remaining 9 digits as normal examples.
KDD99: The model also performs well on high-dimensional, non-image data. Because of the high proportion of outliers in the dataset, "normal" data are treated as anomalies in this task
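The MNIST protocol can be sketched as a relabeling step that yields one binary labeling per digit class. The helper name `make_anomaly_splits` and the short label list are hypothetical. Loading the actual dataset is left out.

```python
def make_anomaly_splits(labels, num_classes=10):
    """For each class c, mark samples of class c as anomalous (1) and all others as normal (0)."""
    return {
        c: [1 if y == c else 0 for y in labels]
        for c in range(num_classes)
    }

# Toy digit labels standing in for the MNIST label vector.
labels = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 3, 3]
splits = make_anomaly_splits(labels)
```

Each entry of `splits` corresponds to one of the ten generated datasets, with the held-out digit as the anomaly class.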
The following table shows this model outperforms AnoGAN and has a faster inference time
Recent GAN models can thus achieve state-of-the-art performance for anomaly detection on high-dimensional, complex datasets while remaining efficient at test time.
Donahue, J., Krähenbühl, P., and Darrell, T. Adversarial Feature Learning. arXiv:1605.09782, 2016.
Zenati, H., Foo, C. S., Lecouat, B., Manek, G., and Chandrasekhar, V. R. Efficient GAN-Based Anomaly Detection. arXiv:1802.06222, 2018.