
GENERATING FACES USING AI

Updated: Sep 10, 2020





Can you make completely new faces using AI? If you saw the image above, you know the answer is obviously yes. In this article, I will explain some of the process of how I used a machine learning model to generate the faces above.

In short, this generative model (known as a decoder) takes a small set of numbers – in this case, an array (list) of 30 numbers – and, by sending them through many layers of multiplication and other math, transforms (decodes) them into a much larger set of numbers that looks like a face – in this case, a 192x192 image with 3 color channels, for a total of 110,592 numbers. The following image shows how the decoder transforms the numbers. Just think of ‘Dense’ as a layer that outputs another array of numbers, and ‘Conv2DTranspose’ as a layer that outputs a stack of 2D images. The final output is a stack of 3 2D images corresponding to the 3 RGB colors.

We can’t just hand the decoder 30 numbers and tell it to turn them into a fully-fledged face without first teaching it how to properly transform those numbers. So the question arises: how would one train it to decode the numbers?
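To make the Dense and Conv2DTranspose layers more concrete, here is a minimal sketch of a decoder like the one described above, written in Keras. The 30-number input and the 192x192x3 output match the article; the number of layers, filter counts, and activations are illustrative guesses, not the exact architecture from my repository.

```python
# Minimal decoder sketch (assumed shapes; not the exact model from the repo)
import tensorflow as tf
from tensorflow.keras import layers, Model

LATENT_DIM = 30  # the 30 numbers the decoder receives

latent = layers.Input(shape=(LATENT_DIM,))
# A Dense layer expands the 30 numbers into enough values for a small image stack
x = layers.Dense(12 * 12 * 128, activation="relu")(latent)
x = layers.Reshape((12, 12, 128))(x)
# Each Conv2DTranspose layer doubles the spatial size: 12 -> 24 -> 48 -> 96 -> 192
x = layers.Conv2DTranspose(128, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
# Final layer outputs 3 channels (RGB): 192 x 192 x 3 = 110,592 numbers in [0, 1]
face = layers.Conv2DTranspose(3, 3, strides=2, padding="same", activation="sigmoid")(x)

decoder = Model(latent, face, name="decoder")
decoder.summary()
```

Running `decoder.summary()` prints the chain of layers and shows how 30 numbers grow, layer by layer, into a full 192x192x3 image.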

Training this type of model can’t really be done without an objective. What do you want an autoencoder to achieve? In this case, to reach our goal of generating completely new faces, the objective is to compress an image of a face into 30 numbers and then decompress those numbers to produce an image that is as close to the original as possible. Think of it as squishing a playdough face and then un-squishing it back to exactly what it was before. Over many attempts at un-squishing, you would get very good at turning squished playdough back into the original thing. The same concept is used to train an autoencoder.

After leaving the autoencoder to learn how to encode and decode images as efficiently as possible – in my case, training for over a dozen hours on my computer with over 16,000 images – it has learned to turn a face into 30 numbers, and 30 numbers back into something that closely resembles that face. Keep in mind an autoencoder cannot recreate the exact image, since 30 numbers can only hold a limited amount of information.

The process of reconstructing images might seem useless on the surface; what’s the use of producing the exact same face you feed in? How can you generate completely new faces? This is where you separate the decoder from the encoder. Instead of turning an image into 30 numbers, you make up your own 30 numbers and feed them into the now-trained decoder to create a brand-new face from scratch. The faces at the start of this article are examples of feeding randomly generated numbers to the decoder. You can also tie the numbers to sliders in an interface and use them to personalize a face to your specifications.

This isn’t the only attempt at generating faces. There are also YouTube videos that show an autoencoder in action, such as this and this. Just a few months ago, the famous tech company Nvidia had an AI team make hyper-realistic faces using a different type of decoder-based model known as a Generative Adversarial Network. They trained it for a full week on hardware costing tens of thousands of US dollars. The website thispersondoesnotexist.com generates a brand-new, completely artificial face with the Nvidia model every time you reload the page. If you want to see more generated examples from my model, or the architecture of the combined autoencoder, you can view them along with the code for the project at my GitHub repository, github.com/Hzaakk/FaceGenKerasVae
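To show the training objective and the trick of separating the decoder, here is a short sketch that pairs an encoder with the `decoder` from the sketch above, trains the pair to reconstruct faces, and then feeds made-up numbers to the decoder alone. The encoder layers and the `faces` array are placeholders I made up for illustration, and this uses a plain autoencoder with a mean-squared-error loss for simplicity; the actual project uses a variational autoencoder, so the details differ.

```python
# Sketch of training an autoencoder and sampling new faces
# (assumes the `decoder` model from the previous sketch; layer sizes are illustrative)
import numpy as np
from tensorflow.keras import layers, Model

# Encoder: compress a 192x192x3 image down to 30 numbers
image_in = layers.Input(shape=(192, 192, 3))
h = layers.Conv2D(32, 3, strides=2, padding="same", activation="relu")(image_in)
h = layers.Conv2D(64, 3, strides=2, padding="same", activation="relu")(h)
h = layers.Flatten()(h)
code = layers.Dense(30)(h)
encoder = Model(image_in, code, name="encoder")

# Autoencoder: encoder followed by the decoder
autoencoder = Model(image_in, decoder(encoder(image_in)), name="autoencoder")

# Objective: make the reconstructed image as close to the original as possible
autoencoder.compile(optimizer="adam", loss="mse")
# faces: hypothetical numpy array of shape (num_images, 192, 192, 3), values in [0, 1]
# autoencoder.fit(faces, faces, epochs=50, batch_size=32)

# After training, discard the encoder and feed made-up numbers to the decoder
random_codes = np.random.normal(size=(16, 30))
new_faces = decoder.predict(random_codes)  # shape (16, 192, 192, 3)
```

Note that the model is trained to reproduce its own input (`fit(faces, faces)`); the 30-number bottleneck in the middle is what forces it to learn a compact representation that can later be sampled at random.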




