Face Generation

In this project, we will define and train a DCGAN on a dataset of faces. Our goal is to get a generator network to generate new images of faces that look as realistic as possible! The project will be broken down into a series of tasks from loading in data to defining and training adversarial networks. At the end of the notebook, we will be able to visualize the results of our trained Generator to see how it performs; our generated samples should look like fairly realistic faces with small amounts of noise.

Get the Data

We will use the CelebFaces Attributes Dataset (CelebA) to train our adversarial networks.

Pre-processed Data

Since the project’s main focus is on building the GANs, we’ve done some of the pre-processing for you. Each of the CelebA images has been cropped to remove parts of the image that don’t include a face, then resized down to 64x64x3 NumPy images.

Visualize the CelebA Data

The CelebA dataset contains over 200,000 celebrity images with annotations. Since we are going to be generating faces, we won't need the annotations; we only need the images. Note that these are color images with 3 color channels (RGB) each.

Load data
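
The notebook's import cell isn't reproduced in this excerpt; the code below assumes roughly the following imports (a minimal set inferred from usage, so treat it as a sketch):

# imports inferred from the code used throughout this notebook
import os
import pickle as pkl
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import problem_unittests as tests  # assumed name of the project's test module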

def get_dataloader(batch_size, image_size, data_dir='processed_celeba_small/'):
    """
    Batch the neural network data using DataLoader
    :param batch_size: The size of each batch; the number of images in a batch
    :param image_size: The square size of the image data (x, y)
    :param data_dir: Directory where image data is located
    :return: DataLoader with batched data
    """
    transform = transforms.Compose([ transforms.Resize(image_size),
                                    transforms.ToTensor()])
    
    train_path = os.path.join(data_dir, "celeba/")
    my_dataset = datasets.ImageFolder(train_path, transform = transform)
    
    # TODO: Implement function and return a dataloader
    data_loader = torch.utils.data.DataLoader(dataset = my_dataset, 
                                              batch_size = batch_size,
                                              shuffle = True, num_workers = 4)
    
    return data_loader

Create a DataLoader

# Define the batch size and square image size; the generator in this
# project outputs 32x32 images, so img_size is 32. The batch size here
# is an assumption (any reasonable value works).
batch_size = 128
img_size = 32

# Call your function and get a dataloader
celeba_train_loader = get_dataloader(batch_size, img_size)

Next, we can view some images! We should see square images of somewhat-centered faces.

# obtain one batch of training images
dataiter = iter(celeba_train_loader)
images, _ = next(dataiter) # _ for no labels

# plot some of the images in the batch
fig = plt.figure(figsize=(20, 4))
plot_size = 20
for idx in np.arange(plot_size):
    ax = fig.add_subplot(2, plot_size // 2, idx + 1, xticks=[], yticks=[])
    imshow(images[idx])
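
imshow is a small helper defined earlier in the notebook (not shown in this excerpt); a minimal sketch consistent with its usage on channels-first image tensors:

def imshow(img):
    """Display a single image tensor in CHW format."""
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib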

[Image: a batch of square, somewhat-centered CelebA faces]

Pre-process your image data and scale it to a pixel range of -1 to 1

We need to do a bit of pre-processing; we know that the output of a tanh-activated generator will contain pixel values in a range from -1 to 1, so we need to rescale our training images to the same range. (Right now, they are in a range from 0 to 1.)

def scale(x, feature_range=(-1, 1)):
    ''' Scale takes in an image x and returns that image, scaled
       with a feature_range of pixel values from -1 to 1. 
       This function assumes that the input x is already scaled from 0-1.'''
    # scale from (0, 1) to feature_range and return scaled x
    # (avoid shadowing the built-in min/max names)
    range_min, range_max = feature_range
    x = x * (range_max - range_min) + range_min
    return x
# check scaled range
# should be close to -1 to 1
img = images[0]
scaled_img = scale(img)

print('Min: ', scaled_img.min())
print('Max: ', scaled_img.max())
Min:  tensor(-0.9294)
Max:  tensor(1.)

Define the Model

A GAN is comprised of two adversarial networks, a discriminator and a generator.

Discriminator

Our first task will be to define the discriminator. This is a convolutional classifier without any maxpooling layers. To deal with this complex data, it’s suggested we use a deep network with normalization.
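
The Discriminator class definition isn't shown in this excerpt, but the architecture printed by build_network further down pins it down. Below is a minimal sketch consistent with that printout; the conv helper is a convenience function introduced here, not something from the original notebook (it relies on the imports listed above):

def conv(in_channels, out_channels, kernel_size=4, stride=2, padding=1, batch_norm=True):
    """A strided convolution, optionally followed by batch normalization."""
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=False)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_channels))
    return nn.Sequential(*layers)

class Discriminator(nn.Module):
    def __init__(self, conv_dim):
        super(Discriminator, self).__init__()
        self.conv_dim = conv_dim
        # 32x32x3 -> 16x16x32 -> 8x8x64 -> 4x4x128 (for conv_dim=32)
        self.conv1 = conv(3, conv_dim, batch_norm=False)  # no batch norm on the first layer
        self.conv2 = conv(conv_dim, conv_dim * 2)
        self.conv3 = conv(conv_dim * 2, conv_dim * 4)
        # final classification layer: 4*4*128 = 2048 features -> 1 logit
        self.fc = nn.Linear(conv_dim * 4 * 4 * 4, 1)

    def forward(self, x):
        out = F.leaky_relu(self.conv1(x), 0.2)
        out = F.leaky_relu(self.conv2(out), 0.2)
        out = F.leaky_relu(self.conv3(out), 0.2)
        out = out.view(-1, self.conv_dim * 4 * 4 * 4)
        return self.fc(out)  # raw logit; the loss function applies the sigmoid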

tests.test_discriminator(Discriminator)
Tests Passed

Generator

The generator should upsample an input and generate a new image of the same size as our training data, 32x32x3. This should be mostly transpose convolutional layers with normalization applied to the outputs.
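
As with the discriminator, the Generator class isn't shown here; this is a minimal sketch matching the printed architecture below. The deconv helper mirrors the conv helper above and is likewise an assumption:

def deconv(in_channels, out_channels, kernel_size=4, stride=2, padding=1, batch_norm=True):
    """A transpose convolution, optionally followed by batch normalization."""
    layers = [nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride, padding, bias=False)]
    if batch_norm:
        layers.append(nn.BatchNorm2d(out_channels))
    return nn.Sequential(*layers)

class Generator(nn.Module):
    def __init__(self, z_size, conv_dim):
        super(Generator, self).__init__()
        self.conv_dim = conv_dim
        # project z into a 4x4x(conv_dim*4) feature map
        self.fc = nn.Linear(z_size, conv_dim * 4 * 4 * 4)
        # 4x4x128 -> 8x8x64 -> 16x16x32 -> 32x32x3 (for conv_dim=32)
        self.t_conv1 = deconv(conv_dim * 4, conv_dim * 2)
        self.t_conv2 = deconv(conv_dim * 2, conv_dim)
        self.t_conv3 = deconv(conv_dim, 3, batch_norm=False)  # no batch norm on the output layer

    def forward(self, x):
        out = self.fc(x)
        out = out.view(-1, self.conv_dim * 4, 4, 4)
        out = F.relu(self.t_conv1(out))
        out = F.relu(self.t_conv2(out))
        return torch.tanh(self.t_conv3(out))  # pixel values in (-1, 1)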

tests.test_generator(Generator)
Tests Passed

Initialize the weights of your networks

To help our models converge, we should initialize the weights of the convolutional and linear layers in our model. The original DCGAN paper says:

All weights were initialized from a zero-centered Normal distribution with standard deviation 0.02.

So, our next task will be to define a weight initialization function that does just this!

def weights_init_normal(m):
    """
    Applies initial weights to certain layers in a model.
    The weights are taken from a normal distribution 
    with mean = 0, std dev = 0.02.
    :param m: A module or layer in a network    
    """
    # classname will be something like:
    # `Conv`, `BatchNorm2d`, `Linear`, etc.
    classname = m.__class__.__name__
    
    # Apply initial weights to convolutional and linear layers
    # ('Conv' matches both Conv2d and ConvTranspose2d)
    if classname.find('Conv') != -1 or classname.find('Linear') != -1:
        # zero-centered normal distribution with std dev 0.02
        m.weight.data.normal_(0.0, 0.02)
        # zero the bias, if the layer has one
        if m.bias is not None:
            m.bias.data.fill_(0)

Build complete network

Define our models’ hyperparameters and instantiate the discriminator and generator.

def build_network(d_conv_dim, g_conv_dim, z_size):
    # define discriminator and generator
    D = Discriminator(d_conv_dim)
    G = Generator(z_size=z_size, conv_dim=g_conv_dim)

    # initialize model weights
    D.apply(weights_init_normal)
    G.apply(weights_init_normal)

    print(D)
    print()
    print(G)
    
    return D, G

# Define model hyperparams
d_conv_dim = 32
g_conv_dim = 32
z_size = 100

D, G = build_network(d_conv_dim, g_conv_dim, z_size)
Discriminator(
  (conv1): Sequential(
    (0): Conv2d(3, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  )
  (conv2): Sequential(
    (0): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (conv3): Sequential(
    (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (fc): Linear(in_features=2048, out_features=1, bias=True)
)

Generator(
  (fc): Linear(in_features=100, out_features=2048, bias=True)
  (t_conv1): Sequential(
    (0): ConvTranspose2d(128, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (t_conv2): Sequential(
    (0): ConvTranspose2d(64, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (t_conv3): Sequential(
    (0): ConvTranspose2d(32, 3, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1), bias=False)
  )
)

Training on GPU

Check if we can train on GPU. Here, we’ll set this as a boolean variable train_on_gpu.

# Check for a GPU
train_on_gpu = torch.cuda.is_available()
if not train_on_gpu:
    print('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Training on GPU!')
Training on GPU!

Discriminator and Generator Losses

Now we need to calculate the losses for both types of adversarial networks.

Discriminator Losses

  • For the discriminator, the total loss is the sum of the losses for real and fake images, d_loss = d_real_loss + d_fake_loss.
  • Remember that we want the discriminator to output 1 for real images and 0 for fake images, so we need to set up the losses to reflect that.

Generator Loss

The generator loss will look similar only with flipped labels. The generator’s goal is to get the discriminator to think its generated images are real.

def real_loss(D_out):
    '''Calculates how close discriminator outputs are to being real.
       param, D_out: discriminator logits
       return: real loss'''
    batch_size = D_out.size(0)
    labels = torch.ones(batch_size)
    
    if train_on_gpu:
        labels = labels.cuda()

    criterion = nn.BCEWithLogitsLoss()
    loss = criterion(D_out.squeeze(), labels)
    
    return loss

def fake_loss(D_out):
    '''Calculates how close discriminator outputs are to being fake.
       param, D_out: discriminator logits
       return: fake loss'''
    batch_size = D_out.size(0)
    labels = torch.zeros(batch_size)
    
    if train_on_gpu:
        labels = labels.cuda()
        
    criterion = nn.BCEWithLogitsLoss()
    loss = criterion(D_out.squeeze(), labels) 
    return loss
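
As a quick sanity check with made-up logits (hypothetical values, not part of the notebook), confident "real" predictions should yield a small real_loss and a large fake_loss:

# made-up logits: the discriminator is confident these images are real
sample_logits = torch.tensor([3.0, 2.5, 4.0])
if train_on_gpu:
    sample_logits = sample_logits.cuda()  # match the device of the labels

print(real_loss(sample_logits))  # small: the outputs match the "real" label of 1
print(fake_loss(sample_logits))  # large: the same outputs are wrong for the "fake" label of 0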

Optimizers

Define optimizers for our Discriminator (D) and Generator (G)

# params
lr = 0.0004
beta1 = 0.5   # the DCGAN paper recommends 0.5; the default 0.9 can destabilize training
beta2 = 0.999 # default value

# Create optimizers for the discriminator D and generator G
d_optimizer = optim.Adam(D.parameters(), lr, betas=(beta1, beta2))
g_optimizer = optim.Adam(G.parameters(), lr, betas=(beta1, beta2))

Training

Training will involve alternating between training the discriminator and the generator. We will use our functions real_loss and fake_loss to help us calculate the discriminator losses.

  • We should train the discriminator by alternating on real and fake images
  • Then the generator, which tries to trick the discriminator and should have an opposing loss function

Saving Samples

def train(D, G, n_epochs, print_every=50):
    '''Trains adversarial networks for some number of epochs
       param, D: the discriminator network
       param, G: the generator network
       param, n_epochs: number of epochs to train for
       param, print_every: when to print and record the models' losses
       return: D and G losses'''
    
    # move models to GPU
    if train_on_gpu:
        D.cuda()
        G.cuda()

    # keep track of loss and generated, "fake" samples
    samples = []
    losses = []

    # Get some fixed data for sampling. These are images that are held
    # constant throughout training, and allow us to inspect the model's performance
    sample_size = 16
    fixed_z = np.random.uniform(-1, 1, size=(sample_size, z_size))
    fixed_z = torch.from_numpy(fixed_z).float()
    # move z to GPU if available
    if train_on_gpu:
        fixed_z = fixed_z.cuda()

    # epoch training loop
    for epoch in range(n_epochs):

        # batch training loop
        for batch_i, (real_images, _) in enumerate(celeba_train_loader):

            batch_size = real_images.size(0)
            real_images = scale(real_images)
            
            # 1. Train the discriminator on real and fake images
            d_optimizer.zero_grad()
            
            # Compute the discriminator losses on real images 
            if train_on_gpu:
                real_images = real_images.cuda()
                
            D_real = D(real_images)
            d_real_loss = real_loss(D_real)

            # Generate fake images
            z = np.random.uniform(-1, 1, size=(batch_size, z_size))
            z = torch.from_numpy(z).float()

            # move z to GPU, if available
            if train_on_gpu:
                z = z.cuda()
            fake_images = G(z)
            
            D_fake = D(fake_images)
            d_fake_loss = fake_loss(D_fake)
            
            # add up loss and perform backprop
            d_loss = d_real_loss + d_fake_loss
            d_loss.backward()
            d_optimizer.step()            
            
            # 2. Train the generator with an adversarial loss
            g_optimizer.zero_grad()
            
            # Generate fake images
            z = np.random.uniform(-1, 1, size=(batch_size, z_size))
            z = torch.from_numpy(z).float()
            if train_on_gpu:
                z = z.cuda()
            fake_images = G(z)
            
            
            # Compute the discriminator losses on fake images 
            # using flipped labels!
            D_fake = D(fake_images)
            g_loss = real_loss(D_fake) # use real loss to flip labels
        
            # perform backprop
            g_loss.backward()
            g_optimizer.step()
            

            # Print some loss stats
            if batch_i % print_every == 0:
                # append discriminator loss and generator loss
                losses.append((d_loss.item(), g_loss.item()))
                # print discriminator and generator loss
                print('Epoch [{:5d}/{:5d}] | d_loss: {:6.4f} | g_loss: {:6.4f}'.format(
                        epoch+1, n_epochs, d_loss.item(), g_loss.item()))


        ## AFTER EACH EPOCH##    
        # this code assumes your generator is named G, feel free to change the name
        # generate and save sample, fake images
        G.eval() # for generating samples
        samples_z = G(fixed_z).detach().cpu()  # detach from the graph and move to CPU before saving
        samples.append(samples_z)
        G.train() # back to training mode

    # Save training generator samples
    with open('train_samples.pkl', 'wb') as f:
        pkl.dump(samples, f)
    
    # finally return losses
    return losses
# call training function
losses = train(D, G, n_epochs=50)
Epoch [    1/   50] | d_loss: 1.4756 | g_loss: 0.7581
Epoch [    1/   50] | d_loss: 0.6835 | g_loss: 4.3863
Epoch [    1/   50] | d_loss: 0.5454 | g_loss: 2.2342
Epoch [    1/   50] | d_loss: 0.7611 | g_loss: 1.7250
Epoch [    1/   50] | d_loss: 0.8612 | g_loss: 1.9986
Epoch [    1/   50] | d_loss: 1.0474 | g_loss: 1.2333
Epoch [    1/   50] | d_loss: 1.0420 | g_loss: 1.4665
Epoch [    1/   50] | d_loss: 1.0175 | g_loss: 1.2827
Epoch [    1/   50] | d_loss: 1.0866 | g_loss: 1.4246
Epoch [    2/   50] | d_loss: 1.1408 | g_loss: 1.3269
Epoch [    2/   50] | d_loss: 1.1453 | g_loss: 1.3151
Epoch [    2/   50] | d_loss: 1.1817 | g_loss: 1.5866
Epoch [    2/   50] | d_loss: 1.1339 | g_loss: 0.9222
Epoch [    2/   50] | d_loss: 0.8745 | g_loss: 1.8895
.........
Epoch [   24/   50] | d_loss: 3.0976 | g_loss: 6.6547
Epoch [   24/   50] | d_loss: 0.7348 | g_loss: 1.2033
Epoch [   24/   50] | d_loss: 0.6164 | g_loss: 2.5696
Epoch [   24/   50] | d_loss: 0.6439 | g_loss: 1.2263
Epoch [   24/   50] | d_loss: 0.6570 | g_loss: 2.5571
Epoch [   25/   50] | d_loss: 0.6189 | g_loss: 1.2975
Epoch [   25/   50] | d_loss: 0.5116 | g_loss: 1.7852
Epoch [   25/   50] | d_loss: 0.3487 | g_loss: 2.4798
Epoch [   25/   50] | d_loss: 0.7311 | g_loss: 1.1017
Epoch [   50/   50] | d_loss: 0.3821 | g_loss: 2.6701
Epoch [   50/   50] | d_loss: 0.4035 | g_loss: 2.1759
Epoch [   50/   50] | d_loss: 0.4827 | g_loss: 2.8167
Epoch [   50/   50] | d_loss: 0.2443 | g_loss: 2.5251

Training loss

Plot the training losses for the generator and discriminator, recorded every print_every batches during training.

fig, ax = plt.subplots()
losses = np.array(losses)
plt.plot(losses.T[0], label='Discriminator', alpha=0.5)
plt.plot(losses.T[1], label='Generator', alpha=0.5)
plt.title("Training Losses")
plt.legend()

[Image: plot of discriminator and generator training losses]

Generator samples from training

View samples of images from the generator.
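
view_samples is a helper defined earlier in the notebook (not shown in this excerpt); a minimal sketch consistent with how it is called below, assuming each entry of samples holds a batch of 16 tanh-scaled images:

def view_samples(epoch, samples):
    """Show a 2x8 grid of generator samples from the given epoch (-1 = final)."""
    fig, axes = plt.subplots(figsize=(16, 4), nrows=2, ncols=8, sharex=True, sharey=True)
    for ax, img in zip(axes.flatten(), samples[epoch]):
        img = img.detach().cpu().numpy()
        img = np.transpose(img, (1, 2, 0))            # CHW -> HWC for matplotlib
        img = ((img + 1) * 255 / 2).astype(np.uint8)  # rescale (-1, 1) -> (0, 255)
        ax.imshow(img)
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)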

_ = view_samples(-1, samples)

[Image: grid of generated face samples from the final epoch]

The generated faces appear to be made up of mostly white celebrity faces. To improve on that, we could add more images to the dataset so that it contains a roughly equal number of white and non-white celebrity faces. We could also make the background the same color across all of these images.

The model size is appropriate since the output images are small (32x32). Unfortunately, most of the generated faces are missing the chin, so it is hard to see how the chin affects the overall face.

Adam is the preferred optimizer for DCGAN models, but we could also experiment with other optimizers for comparison.


© 2020. Zakaria Alsahfi. All rights reserved.