# Adversarial Feature Learning
1 Introduction
- Unsupervised feature learning framework: Bidirectional Generative Adversarial Networks (BiGANs), which learn not only a generator but also an inverse mapping, projecting data back into the latent input space z.
- The BiGAN encoder may serve as a useful feature representation for related semantic tasks: the latent code z can be thought of as a label for x that comes for free, without the need for supervision.
- BiGANs are a robust and highly generic approach to unsupervised feature learning, making no assumptions about the structure or type of data to which they are applied.
2 BiGAN
Bidirectional Generative Adversarial Networks not only train a generator G, but additionally train an encoder E that maps data x back into the latent space.
The training objective is a minimax objective over the encoder, generator, and discriminator:
$$
\min_{G,E}\,\max_{D}\; V(D,E,G) = \mathbb{E}_{x \sim p_X}\big[\log D(x, E(x))\big] + \mathbb{E}_{z \sim p_Z}\big[\log\big(1 - D(G(z), z)\big)\big]
$$
where the discriminator D receives joint (x, z) pairs and must distinguish encoder pairs $(x, E(x))$ from generator pairs $(G(z), z)$.
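A minimal PyTorch-style sketch of this value function, assuming `D` ends in a sigmoid and outputs a probability for joint (x, z) pairs (all module definitions here are placeholders, not the paper's code):

```python
import torch

def bigan_value(D, G, E, x_real, z_prior):
    """Estimate V(D, E, G) on one minibatch.

    D scores an (x, z) pair; G maps z -> x; E maps x -> z.
    x_real ~ p_X (data), z_prior ~ p_Z (e.g., Gaussian noise).
    """
    # "Real" joint pairs (x, E(x)) and "fake" joint pairs (G(z), z).
    d_real = D(x_real, E(x_real))      # D should push this toward 1
    d_fake = D(G(z_prior), z_prior)    # D should push this toward 0

    # V(D,E,G) = E[log D(x, E(x))] + E[log(1 - D(G(z), z))]
    return torch.log(d_real).mean() + torch.log(1.0 - d_fake).mean()
```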
1. Optimal discriminator, generator & encoder
The optimal discriminator allows us to reformulate the objective and show that it reduces to the Jensen-Shannon divergence between the joint distributions $P_{EX}$ and $P_{GZ}$.
Proposition 1. For any E and G, the optimal discriminator $D_{EG}^{\ast} = \arg\max_D V(D,E,G)$ is the Radon-Nikodym derivative $f_{EG} = \frac{dP_{EX}}{d(P_{EX}+P_{GZ})}$ of the measure $P_{EX}$ with respect to the measure $P_{EX}+P_{GZ}$.
Proposition 2. The encoder and generator's objective for an optimal discriminator can be rewritten in terms of the Jensen-Shannon divergence between the measures $P_{EX}$ and $P_{GZ}$:
$$
C(E,G) = 2\,D_{\mathrm{JS}}\big(P_{EX} \,\|\, P_{GZ}\big) - \log 4
$$
When $P_{EX} = P_{GZ}$, the Jensen-Shannon divergence vanishes and $C(E,G)$ attains its global minimum $-\log 4$.
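This expression follows from substituting the optimal discriminator $f_{EG}$ of Proposition 1 into the value function; writing $M = \tfrac{1}{2}(P_{EX}+P_{GZ})$, the standard GAN-style computation is:

$$
\begin{aligned}
C(E,G) &:= \max_D V(D,E,G)
        = \mathbb{E}_{P_{EX}}\big[\log f_{EG}\big] + \mathbb{E}_{P_{GZ}}\big[\log(1 - f_{EG})\big] \\
       &= D_{\mathrm{KL}}\big(P_{EX}\,\|\,M\big) + D_{\mathrm{KL}}\big(P_{GZ}\,\|\,M\big) - \log 4
        = 2\,D_{\mathrm{JS}}\big(P_{EX}\,\|\,P_{GZ}\big) - \log 4 .
\end{aligned}
$$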
2. Optimal generator & encoder are inverses
In order to fool a perfect discriminator, the optimal generator and encoder must invert one another almost everywhere.
Theorem 2. If E and G are an optimal encoder and generator, then $E = G^{-1}$ almost everywhere; that is, $G(E(x)) = x$ for almost every x, and $E(G(z)) = z$ for almost every z.
3. Relationship to Autoencoders
The encoder and generator objective given an optimal discriminator, $C(E,G)$, can be rewritten as an autoencoder-style loss function.
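As a rough empirical handle on this autoencoder view, one can measure how well a trained (E, G) pair reconstructs data; a minimal sketch, assuming trained PyTorch modules:

```python
import torch

@torch.no_grad()
def reconstruction_error(G, E, x):
    """Autoencoder-style diagnostic for a trained BiGAN: if E and G
    approximately invert each other, ||x - G(E(x))|| should be small."""
    x_rec = G(E(x))                          # round trip x -> z -> x
    return (x - x_rec).flatten(1).norm(dim=1)  # per-example L2 error
```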
4. Learning
An objective in which the real and generated labels Y are swapped provides a stronger gradient signal to G and E. For efficiency, the authors update all modules D, G, and E simultaneously at each iteration, as sketched below.
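A sketch of one such training iteration, assuming D ends in a sigmoid and that `opt_d` and `opt_ge` are separate optimizers over D's and (G, E)'s parameters; note the swapped labels in the G/E loss:

```python
import torch
import torch.nn.functional as F

def train_step(D, G, E, opt_d, opt_ge, x_real, z_prior):
    # Discriminator update: label (x, E(x)) as real (1), (G(z), z) as fake (0).
    d_real = D(x_real, E(x_real).detach())
    d_fake = D(G(z_prior).detach(), z_prior)
    loss_d = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) + \
             F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator/encoder update in the same iteration, with the labels
    # swapped -- stronger gradients than the saturating minimax loss.
    d_real = D(x_real, E(x_real))
    d_fake = D(G(z_prior), z_prior)
    loss_ge = F.binary_cross_entropy(d_real, torch.zeros_like(d_real)) + \
              F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
    opt_ge.zero_grad(); loss_ge.backward(); opt_ge.step()
```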
5. Generalized BiGAN
It is often useful to parametrize the output of the generator G and the encoder E in a different, smaller space rather than the original data space. In particular, generating high-resolution images remains difficult for current generative models, so the encoder can take higher-resolution input while the generator output and discriminator input remain at low resolution.
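A minimal sketch of this asymmetric-resolution setup, assuming real images are downsampled so the discriminator's x-input matches the generator's low-resolution output (sizes and names here are illustrative, not the paper's exact construction):

```python
import torch.nn.functional as F

def generalized_pairs(G, E, x_hires, z_prior, lo_res=(64, 64)):
    """Encoder consumes high-resolution x; the discriminator compares
    low-resolution (x_lo, E(x)) pairs against (G(z), z) pairs."""
    z_hat = E(x_hires)  # encode at full resolution
    x_lo = F.interpolate(x_hires, size=lo_res,
                         mode='bilinear', align_corners=False)
    real_pair = (x_lo, z_hat)          # "real" joint sample, downsampled x
    fake_pair = (G(z_prior), z_prior)  # "fake" joint sample, low-res output
    return real_pair, fake_pair
```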
# Efficient GAN-Based Anomaly Detection
In this work, the authors develop a GAN-based anomaly detection method built on the BiGAN structure, which simultaneously learns an encoder during training; the encoder makes anomaly scoring efficient at test time.
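A sketch of the kind of anomaly score used in this line of work, combining reconstruction error with a discriminator-based term; the weighting `alpha` and the `D.features` hook are illustrative assumptions, not the paper's exact API:

```python
import torch

@torch.no_grad()
def anomaly_score(D, G, E, x, alpha=0.9):
    """Higher scores indicate likely anomalies."""
    z_hat = E(x)
    x_rec = G(z_hat)
    # Reconstruction term: how well the BiGAN reconstructs x via E then G.
    l_g = (x - x_rec).abs().flatten(1).sum(dim=1)
    # Discriminator term: distance between intermediate discriminator
    # features of the real and reconstructed pairs (D.features is an
    # assumed hook exposing such a layer).
    f_real = D.features(x, z_hat)
    f_rec = D.features(x_rec, z_hat)
    l_d = (f_real - f_rec).abs().flatten(1).sum(dim=1)
    return alpha * l_g + (1.0 - alpha) * l_d
```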
1. Experiments
The authors apply the method to an image dataset (MNIST) and a network intrusion dataset (KDD99).
MNIST: Ten different datasets are generated from MNIST by successively making each digit class the anomaly and treating the remaining nine digits as normal examples (a construction sketched in the code after these dataset descriptions).
KDD99: The model also performs well on high-dimensional, non-image data. Due to the high proportion of outliers in this dataset, "normal" data are treated as anomalies in this task.
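A minimal sketch of how the ten MNIST splits could be constructed (torchvision-based; names and paths are illustrative assumptions, not the authors' code):

```python
import numpy as np
from torchvision.datasets import MNIST

def make_anomaly_split(root, anomalous_digit):
    """Treat one digit class as anomalous and the other nine as normal.
    Training would use only normal examples; evaluation uses both."""
    train = MNIST(root, train=True, download=True)
    labels = train.targets.numpy()
    normal_idx = np.where(labels != anomalous_digit)[0]
    anomaly_idx = np.where(labels == anomalous_digit)[0]
    return normal_idx, anomaly_idx

# One split per digit class, ten in total.
splits = [make_anomaly_split("./data", d) for d in range(10)]
```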
The reported results show that this model outperforms AnoGAN and has a much faster inference time.
2. Conclusion
Recent GAN models can be used to achieve state-of-the-art performance for anomaly detection on high-dimensional, complex datasets while remaining efficient at test time.
References
Donahue, J., Krähenbühl, P., and Darrell, T. Adversarial Feature Learning. arXiv:1605.09782, 2016.
Zenati, H., and Chandrasekhar, V. R. Efficient GAN-Based Anomaly Detection. arXiv:1802.06222, 2018.