StyleGAN — A very popular GAN, on Colab

Reproducing the paper “A Style-Based Generator Architecture for Generative Adversarial Networks” presented at CVPR 2019


Ever since the Generative Adversarial Network (GAN) was introduced by Ian Goodfellow and his colleagues, it has proved to be an effective solution to many unsupervised learning problems. Although GANs achieve better results than many other deep learning models, the generator network remains a black box: it takes an input from the latent space and generates an image, and there is little control over what the generated sample will look like. This lack of control limits the usefulness of GANs. The StyleGAN model proposed by Karras et al. addresses this issue.

StyleGAN Model

StyleGAN is a type of generative adversarial network that gives us control over the generator: it lets us adjust certain features of an image by changing the style inputs fed to each layer of the generator. The StyleGAN architecture modifies the generator so that it no longer takes a point from the latent space directly as its input. Instead, it uses two new sources: a standalone mapping network and noise layers. The mapping network, as seen in the figure, passes the latent vector through several fully connected layers to map it to an intermediate latent space. This intermediate latent vector then controls the features of the generated image through adaptive instance normalization (AdaIN) layers.
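To make the data flow concrete, here is a minimal NumPy sketch of a mapping network followed by one AdaIN step. It is an illustration under stated assumptions, not the code from the notebook: the layer sizes, the random (untrained) weights, and the names `mapping_network`, `adain`, `A_scale` and `A_bias` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def mapping_network(z, weights, biases, slope=0.2):
    """Map a latent vector z to an intermediate latent w with fully connected layers."""
    h = z
    for W, b in zip(weights, biases):
        h = h @ W + b
        h = np.where(h > 0, h, slope * h)  # leaky ReLU activation
    return h

def adain(x, w, A_scale, A_bias, eps=1e-8):
    """Adaptive instance normalization.

    x: feature maps of shape (N, C, H, W)
    w: intermediate latents of shape (N, D)
    A_scale, A_bias: affine maps (D, C) that turn w into per-channel styles
    """
    y_scale = w @ A_scale + 1.0  # per-channel scale, centred at 1
    y_bias = w @ A_bias          # per-channel bias
    # Normalize every feature map of every sample separately
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = x.std(axis=(2, 3), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    # Re-style the normalized maps with the values derived from w
    return y_scale[:, :, None, None] * x_norm + y_bias[:, :, None, None]

# Hypothetical sizes chosen only for the example
latent_dim, n_layers, channels = 16, 4, 8
weights = [rng.normal(0, 0.1, (latent_dim, latent_dim)) for _ in range(n_layers)]
biases = [np.zeros(latent_dim) for _ in range(n_layers)]
A_scale = rng.normal(0, 0.1, (latent_dim, channels))
A_bias = rng.normal(0, 0.1, (latent_dim, channels))

z = rng.normal(size=(2, latent_dim))       # points from the latent space
w = mapping_network(z, weights, biases)    # intermediate latent vectors
x = rng.normal(size=(2, channels, 4, 4))   # stand-in for convolution outputs
styled = adain(x, w, A_scale, A_bias)
print(styled.shape)
```

After AdaIN, the mean and spread of each feature map are set by `w` rather than by the incoming activations, which is what lets the intermediate latent vector steer the generated image.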

Adaptive instance normalization is one of the key components of the StyleGAN model: it normalizes each feature map and then scales and shifts it using values derived from the intermediate latent vector. Another reason for the strong performance of StyleGAN is progressive growing, a method introduced by Karras et al. in the ProGAN paper. The model initially generates low-resolution samples and then progressively increases the resolution to the required value. This ensures that the generator first learns coarse, high-level features before focusing on finer details, which results in more stable training. StyleGAN also adds noise after each convolution layer, which increases the ‘stochastic variation’ in the generated sample and lets the fine details of the image change without affecting its overall structure.
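The two ideas in this paragraph can be sketched in a few lines of NumPy. This is a simplified illustration, not the notebook's implementation: `fade_in` shows the blending commonly used when a new resolution is introduced during progressive growing, and `add_noise` shows per-channel scaled noise injection. The function names, tensor sizes, and the fixed `noise_strength` (a learned parameter in a real model) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of feature maps shaped (N, C, H, W)."""
    return x.repeat(factor, axis=2).repeat(factor, axis=3)

def fade_in(low_res, high_res, alpha):
    """Blend the upsampled low-resolution output with the new high-resolution output.

    alpha grows from 0 to 1 during training, so the new layers are phased in smoothly.
    """
    return (1.0 - alpha) * upsample_nearest(low_res) + alpha * high_res

def add_noise(x, noise_strength, rng):
    """Add one Gaussian noise map per sample, scaled separately for each channel."""
    n, c, h, w = x.shape
    noise = rng.normal(size=(n, 1, h, w))  # shared across channels
    return x + noise_strength[None, :, None, None] * noise

low = rng.normal(size=(1, 3, 4, 4))    # output of the 4x4 stage
high = rng.normal(size=(1, 3, 8, 8))   # output of the newly added stage

blended_start = fade_in(low, high, alpha=0.0)  # only the old, upsampled output
blended_end = fade_in(low, high, alpha=1.0)    # only the new output
noisy = add_noise(high, noise_strength=np.full(3, 0.1), rng=rng)
silent = add_noise(high, noise_strength=np.zeros(3), rng=rng)  # zero strength: unchanged
print(blended_start.shape, noisy.shape)
```

Because the noise only perturbs individual pixels while the styles act on whole feature maps, changing the noise alters small details and leaves the overall structure of the sample in place.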


A new clean version[6] of the StyleGAN model was implemented and trained on the popular MNIST dataset. The generator starts by generating 4×4 images and then progressively increases the resolution to 28×28. MNIST does not do justice to the stunning performance of StyleGAN, but it was chosen because it is computationally less demanding. The original authors trained StyleGAN on high-resolution human face images, but this was difficult to do in a Colab notebook: satisfactory results were not obtained even after several hours of training. MNIST, on the other hand, contains 28×28 images, which makes the generator’s job easier and the training process faster. The generator and discriminator architectures are exactly the same as in the original implementation[6], and the training parameters were adjusted for the best performance.

View Code in Github/Colab (Please consider starring the repo, if you find it useful)


After training:

The second figure shows samples generated with a truncation value of 0.9, and the third figure shows samples generated with a truncation value of 0.5. The truncation value controls how far each intermediate latent vector is allowed to stray from the average latent vector: the lower the truncation value, the closer the generated samples are to the “average” sample, which makes them look more alike and more typical of the training data at the cost of variety. This is the magic of StyleGANs! On human face datasets, varying the truncation value changes attributes of the image such as skin color and hair color.
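The truncation step itself is a one-line interpolation. The sketch below assumes the usual formulation of the truncation trick, in which each intermediate latent vector is pulled toward the average latent vector. The random vectors stand in for real mapping-network outputs, and the names `truncate`, `w_avg` and `psi` are chosen for the example rather than taken from the notebook.

```python
import numpy as np

def truncate(w, w_avg, psi):
    """Pull latent vectors w toward the average latent w_avg.

    psi = 1.0 leaves w unchanged; psi = 0.0 collapses every sample onto w_avg.
    """
    return w_avg + psi * (w - w_avg)

rng = np.random.default_rng(0)

# Stand-in for many mapping-network outputs (assumption: 16-dimensional latents)
w_samples = rng.normal(size=(1000, 16))
w_avg = w_samples.mean(axis=0)
w = w_samples[:4]

for psi in (1.0, 0.9, 0.5, 0.0):
    w_trunc = truncate(w, w_avg, psi)
    dist = np.linalg.norm(w_trunc - w_avg, axis=1).mean()
    print(f"psi={psi}: mean distance to the average latent = {dist:.3f}")
```

The distance to the average latent shrinks in proportion to the truncation value, which matches the behaviour in the figures: samples at 0.5 look more uniform than samples at 0.9.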
