Generative AI now plays a part in creating photographs of human faces, powering applications such as face swapping, deepfakes, and artistic style transfer. Beyond that, the natural voice generation you hear in Google Duplex, AI-enabled music synthesis, and technologies like Smart Reply and Smart Compose are all built with the help of generative AI.
Enter GAN, the AI technology behind the curtain
GAN stands for Generative Adversarial Network. It was introduced by Ian J. Goodfellow and his colleagues in 2014. A GAN takes a different approach to learning than other neural networks: it uses two neural networks, a Generator and a Discriminator, that are trained against each other to produce the desired result. The Generator creates fake images that look realistic, while the Discriminator learns to distinguish fake images from real ones. When training goes well, the generated images can be hard to tell apart from real-life photos.
What was the purpose behind the development of GANs?
Many AI and ML researchers observed that neural networks could easily be misled by adding a small amount of noise to the input data, and the resulting wrong predictions were often made with even higher confidence than the correct ones. Two factors contribute to this. First, a limited amount of training data leads to a problem called overfitting, where the model memorises the training set instead of generalising from it. Second, the mapping many models learn from input to output is largely linear, so a small, carefully chosen perturbation can push an example across a decision boundary and cause misclassification.
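As a rough illustration of that "small noise, confident mistake" problem, here is a minimal sketch of one well-known perturbation technique, the fast gradient sign method, written in PyTorch (a framework choice assumed for this article; `model`, `loss_fn`, `x`, and `y` are placeholders for an already-trained classifier, its loss function, and one input/label pair):

```python
import torch

# Minimal sketch: nudge an input by a tiny, targeted amount of "noise"
# (the fast gradient sign method) so a trained classifier gets it wrong.
def fgsm_perturb(model, loss_fn, x, y, epsilon=0.01):
    x = x.clone().requires_grad_(True)
    loss = loss_fn(model(x), y)   # how wrong the model currently is
    loss.backward()               # gradient of the loss w.r.t. the pixels
    # Move every pixel a small step in the direction that increases the loss.
    return (x + epsilon * x.grad.sign()).detach()
```

Even though the perturbed image looks unchanged to a human eye, the classifier's prediction can flip, often with high confidence.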
How do GANs work?
GANs learn the probability distribution of the training data by pitting two neural networks against each other. As already mentioned, the first network is the Generator and the second is the Discriminator. The Generator's job is to create new samples that resemble the original dataset. The Discriminator reviews each sample and judges whether it came from the actual training data or from the Generator; in other words, it tries to work out whether an image is real or fake.
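To make the Discriminator's role concrete, here is a minimal sketch (again assuming PyTorch, which the article does not specify): a small convolutional classifier that maps a 64x64 image to a single probability of being real.

```python
import torch.nn as nn

# A minimal sketch of a Discriminator for 64x64 RGB images.
# It is an ordinary binary classifier: image in, probability of "real" out.
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1),    # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), # 16x16 -> 8x8
            nn.BatchNorm2d(256),
            nn.LeakyReLU(0.2),
            nn.Conv2d(256, 1, 8),                        # 8x8 -> single score
            nn.Sigmoid(),                                # squash to [0, 1]
        )

    def forward(self, image):
        return self.net(image).view(-1)  # one "probability of real" per image
```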
The Generator creates photographs of human faces
It is the Generator that produces the new human faces. The fake images are generated using a technique called transposed convolution, sometimes loosely described as the inverse of convolution. In this process, the Generator starts from a 100-dimensional noise vector whose values are drawn from a uniform distribution between -1.0 and 1.0 and progressively upsamples it into a full facial image.
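As a rough sketch of that upsampling path (assuming PyTorch and a DCGAN-style architecture, neither of which the article specifies), a generator might expand the 100-dimensional noise vector into a 64x64 image through a stack of transposed convolutions:

```python
import torch
import torch.nn as nn

# A minimal sketch of a DCGAN-style Generator: 100-dim noise -> 64x64 RGB image.
class Generator(nn.Module):
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 512, 4),                     # 1x1 -> 4x4
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1),  # 4x4 -> 8x8
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),     # 32x32 -> 64x64
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        # z has shape (batch, 100); reshape to (batch, 100, 1, 1) for the conv stack
        return self.net(z.view(z.size(0), -1, 1, 1))

# Usage: sample noise uniformly in [-1, 1] and produce a batch of fake faces
z = torch.rand(16, 100) * 2 - 1
fake_images = Generator()(z)  # shape: (16, 3, 64, 64)
```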
What's unique about this is that the generated face will not completely resemble any human face in this world. In other words, its dataset will not match the dataset of any face that already exists. This seems like a cool feature, but imagine if it fell into the wrong hands; what could be done with it? It could become the number-one technology for forgery in the future. It could also be used to create a seemingly authentic replica of an expensive piece of art. But is that really possible?
That brings us to the next aspect of generative AI and GANs. While creating a new, fake face, the Generator tries to ensure that its output can clear the Discriminator's test. In other words, it tries to fool the Discriminator into deeming the image authentic even though it is genuinely fake; it tries to lie without being caught by the other neural network.
But how do things work behind the scenes?
The Generator is coupled to the Discriminator through a feedback loop. The Discriminator takes in both fake and real images and, for each one, outputs a probability between 0 and 1 that the image is real. Because this is a binary classification (real versus fake), both networks are trained with a loss function called Binary Cross-Entropy, and the Discriminator's feedback is what the Generator uses to improve its fakes.
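Put together, one training step might look roughly like the sketch below (again assuming PyTorch; `G` and `D` stand for instances of the Generator and Discriminator sketched earlier, `opt_g` and `opt_d` are their optimizers, and `real_images` is a batch from the training set):

```python
import torch
import torch.nn.functional as F

def gan_training_step(G, D, opt_g, opt_d, real_images, z_dim=100):
    batch = real_images.size(0)
    real_labels = torch.ones(batch)   # "1" means real
    fake_labels = torch.zeros(batch)  # "0" means fake

    # --- Train the Discriminator: push real images toward 1 and fakes toward 0 ---
    z = torch.rand(batch, z_dim) * 2 - 1          # uniform noise in [-1, 1]
    fake_images = G(z).detach()                   # don't backprop into G here
    d_loss = (F.binary_cross_entropy(D(real_images), real_labels)
              + F.binary_cross_entropy(D(fake_images), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # --- Train the Generator: try to make D label its fakes as real ---
    z = torch.rand(batch, z_dim) * 2 - 1
    g_loss = F.binary_cross_entropy(D(G(z)), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    return d_loss.item(), g_loss.item()
```

Each step, the Discriminator gets slightly better at spotting fakes and the Generator gets slightly better at producing them, which is the adversarial feedback loop described above.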
So, to sum up, this is an overview of how photographs of human faces can be generated through artificial intelligence. If you want to learn more about this and related topics, visit the blog section of the E2E Networks website.