lzyang2000/cs194FinalProject

Words to Birds: Modifying AttnGAN to use Image Captioning and BERT

PyTorch implementation of AttnGAN, from the paper "AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks" by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He, modified to use pre-trained image captioning networks and BERT. (The original AttnGAN work was performed when Tao was an intern with Microsoft Research.)

Dependencies

Python 3.6+

PyTorch 1.0+

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image
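
As a quick sanity check after installing, the sketch below reports which of the listed packages are still missing from the current environment. The mapping from pip names to module names reflects how these packages are normally imported; `REQUIRED` and `missing_packages` are hypothetical names used for illustration and are not part of this repository.

```python
import importlib.util

# Maps each pip package name from the list above to the module it installs.
REQUIRED = {
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
    "nltk": "nltk",
    "scikit-image": "skimage",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages; try: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only confirms that each module can be located; it does not verify versions or that the project folder is on PYTHONPATH.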

Data

  1. Download preprocessed metadata for birds
  2. Download the birds image data and extract it to data/birds/
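
Before starting a long training run, it may help to confirm that the extraction step above produced something. The sketch below only checks that data/birds/ exists and is non-empty; it makes no assumption about the files inside it. `birds_data_ready` is a hypothetical helper, not part of this repository.

```python
from pathlib import Path


def birds_data_ready(root="."):
    """Return True if data/birds/ exists under root and has at least one entry."""
    birds_dir = Path(root) / "data" / "birds"
    return birds_dir.is_dir() and any(birds_dir.iterdir())


if __name__ == "__main__":
    if birds_data_ready():
        print("data/birds/ found.")
    else:
        print("data/birds/ is missing or empty; see the Data steps above.")
```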

Training

  • (Reproduction only) Pre-train DAMSM models:

    • For the bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
  • Train AttnGAN models:

    • For reproduction: python main.py --cfg cfg/bird_attn2.yml --gpu 0
    • For the modified model: python main.py --cfg cfg/bird_attn2_bert.yml --gpu 0
  • The *.yml files are example configuration files for training and evaluating our models.
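
The commands above can also be launched from a small driver script. The following is a minimal sketch, assuming that for reproduction the DAMSM pre-training step runs before AttnGAN training (the order in which the steps are listed) and that the modified model needs only the single `main.py` command. `PIPELINES`, `build_commands`, and `run_pipeline` are hypothetical names introduced for illustration; the script and config paths are the ones given above.

```python
import subprocess
import sys

# Commands copied from the Training list above, grouped by goal.
# Assumption: for reproduction, DAMSM pre-training runs before AttnGAN training.
PIPELINES = {
    "reproduction": [
        ("pretrain_DAMSM.py", "cfg/DAMSM/bird.yml"),
        ("main.py", "cfg/bird_attn2.yml"),
    ],
    "modified": [
        ("main.py", "cfg/bird_attn2_bert.yml"),
    ],
}


def build_commands(pipeline, gpu=0):
    """Return one argument list per training step of the named pipeline."""
    return [
        [sys.executable, script, "--cfg", cfg, "--gpu", str(gpu)]
        for script, cfg in PIPELINES[pipeline]
    ]


def run_pipeline(pipeline, gpu=0, dry_run=True):
    """Print each command and, unless dry_run is set, run it and stop on failure."""
    for cmd in build_commands(pipeline, gpu):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: shows the commands without starting training.
    run_pipeline("modified", gpu=0, dry_run=True)
```

Run it from the project root so the relative script and config paths resolve; pass `dry_run=False` to actually start training.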

Reference
