
Face Generation from Textual Description using Generative Adversarial Networks 📝 2️⃣ 👧👱

Open in Streamlit · Visualize in WandB · PyTorch


Table of Contents

🔹 Abstract

🔹 Process Flow

🔹 Description Categories

🔹 Notebooks

🔹 Results

🔹 Future Scope

🔹 Research Paper and Citation

🔹 Acknowledgement


Abstract

Real and Generated

The majority of current text-to-image generation work is limited to generating images of flowers (Oxford 102 Flower), birds (CUB-200-2011), and common objects (COCO) from captions. Existing face datasets such as Labeled Faces in the Wild and MegaFace lack descriptions, while datasets like CelebA provide attribute annotations but no textual feature descriptions. In this paper, we therefore build upon an existing algorithm to create captions from the attributes provided in the CelebA dataset; the method generates one caption per image and can be extended to generate N captions per image. We use Sentence BERT to encode these descriptions into sentence embeddings. We then perform a comparative study of three models (DCGAN, SAGAN, and DFGAN) by using these sentence embeddings together with a latent noise vector as input to each architecture. Finally, we calculate Inception Scores and FID values to compare the generated images across the different architectures.
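
As a quick illustration of this pipeline, the sketch below shows how an attribute-derived caption could be encoded with Sentence BERT and concatenated with a latent noise vector to form the conditioned generator input. This is not the repository's exact code; the SBERT checkpoint name, embedding size, and noise dimension are placeholder assumptions.

```python
# Minimal sketch: caption -> sentence embedding -> concatenate with noise.
# The checkpoint and dimensions below are illustrative assumptions, not the
# values used in the paper.
import torch
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")  # assumed SBERT checkpoint
caption = "The woman has wavy brown hair and is smiling."

# Sentence embedding (shape [1, 384] for this checkpoint)
text_emb = encoder.encode([caption], convert_to_tensor=True)

# Latent noise vector (dimension chosen for illustration)
noise = torch.randn(1, 100)

# Conditioned generator input: text embedding concatenated with noise
generator_input = torch.cat([text_emb, noise], dim=1)
print(generator_input.shape)  # torch.Size([1, 484])
```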


Process Flow

Process Flow


Description Categories

Description Categories


Notebooks

MNIST

| Model | Colab Link |
| --- | --- |
| Vanilla GAN | Open In Colab |
| DCGAN | Open In Colab |
| CGAN | Open In Colab |
| ACGAN | Open In Colab |

Face

| Model | Single Caption | N Captions |
| --- | --- | --- |
| DCGAN | Open In Colab | Open In Colab |
| SAGAN | Open In Colab | Open In Colab |
| DFGAN | Open In Colab | Open In Colab |

Results

Generated Faces

Generated Faces

Scores from the Different Models

Scores from the Different Models
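
For reference, the FID metric reported above can be computed between batches of real and generated faces roughly as in the sketch below. This is only an illustration using torchmetrics; the repository may use a different FID implementation, and the tensors here are stand-ins.

```python
# Rough sketch of an FID computation (assumes torchmetrics is installed;
# stand-in random tensors are used in place of real and generated faces).
import torch
from torchmetrics.image.fid import FrechetInceptionDistance

fid = FrechetInceptionDistance(feature=2048)

# uint8 images in [0, 255], shape (N, 3, H, W)
real_images = torch.randint(0, 256, (16, 3, 64, 64), dtype=torch.uint8)
fake_images = torch.randint(0, 256, (16, 3, 64, 64), dtype=torch.uint8)

fid.update(real_images, real=True)   # accumulate statistics for real faces
fid.update(fake_images, real=False)  # accumulate statistics for generated faces
print(fid.compute())                 # lower FID indicates closer distributions
```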

Loss vs Epoch

Loss vs Epoch


Future Scope

We believe that this work can be further improved by:

  1. Introducing a better dataset balancing strategy that considers very short and extremely long descriptions.
  2. Increasing the training steps for these models.
  3. Increasing the resolution of the generated images to 256x256, 512x512, or higher.
  4. Using a transformer-based model such as DALL-E.

Research Paper and Citation

Citation

@inproceedings{Deorukhkar_Kadamala_Menezes_2022, 
  place={Singapore}, 
  series={Lecture Notes in Networks and Systems}, 
  title={FGTD: Face Generation from Textual Description}, 
  ISBN={9789811655296}, 
  DOI={10.1007/978-981-16-5529-6_43}, 
  publisher={Springer}, 
  author={Deorukhkar, Kalpana and Kadamala, Kevlyn and Menezes, Elita}, 
  editor={Ranganathan, G. and Fernando, Xavier and Shi, Fuqian}, 
  year={2022}, 
  pages={547–562}, 
  collection={Lecture Notes in Networks and Systems} 
}

Acknowledgement

We would like to thank our mentor Prof. Kalpana Deorukhkar for her constant support and guidance throughout the project.