Words to Birds: Modifying AttnGAN to use Image Captioning and BERT

PyTorch implementation of AttnGAN, from "AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks" by Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, Xiaodong He (this work was performed when Tao was an intern with Microsoft Research), modified to use pre-trained image captioning networks and BERT.

Python 3.6+

PyTorch 1.0+
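
Before training, it can help to confirm that the interpreter and PyTorch meet the two minimum versions listed above. The following is a minimal sketch, not part of this repository: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and the version parsing is deliberately simple (it keeps only the leading numeric components and ignores build tags such as `+cu117`).

```python
import sys


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple of ints."""
    core = text.split("+")[0]  # drop local build tags like '+cu117'
    parts = []
    for piece in core.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version, minimum):
    """Return True if `version` is at least `minimum` (both are strings)."""
    return parse_version(version) >= parse_version(minimum)


def check_environment():
    """Report whether Python >= 3.6 and PyTorch >= 1.0 are available."""
    report = {"python": sys.version_info[:2] >= (3, 6)}
    try:
        import torch  # imported lazily so the check also runs without PyTorch
        report["torch"] = meets_minimum(torch.__version__, "1.0")
    except ImportError:
        report["torch"] = False
    return report


if __name__ == "__main__":
    print(check_environment())
```

A `False` entry in the printed dictionary points at the requirement that still needs to be installed or upgraded.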

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

Data

Training

(Reproduction only) Pre-train DAMSM models:
- For bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
Train AttnGAN models:
- For reproduction: python main.py --cfg cfg/bird_attn2.yml --gpu 0
- For modified: python main.py --cfg cfg/bird_attn2_bert.yml --gpu 0
*.yml files are example configuration files for training and evaluating our models.
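
The commands above can also be assembled programmatically, for example to run the reproduction steps back to back. The following is a minimal sketch under stated assumptions: `build_command`, `training_plan`, and `run_plan` are hypothetical helpers that are not part of this repository, and `project_dir` and `script_dir` (the project folder to put on PYTHONPATH and the folder containing the scripts) must be supplied by the caller. Only the script names, config paths, and the `--cfg` and `--gpu` flags come from the commands listed above.

```python
import os
import subprocess
import sys


def build_command(script, cfg, gpu=0):
    """Return the argument list for one of the training commands."""
    return [sys.executable, script, "--cfg", cfg, "--gpu", str(gpu)]


def training_plan(reproduction=False, gpu=0):
    """List the commands to run, in order.

    reproduction=True  -> pre-train DAMSM, then train the original AttnGAN.
    reproduction=False -> train the modified (BERT) model only.
    """
    if reproduction:
        return [
            build_command("pretrain_DAMSM.py", "cfg/DAMSM/bird.yml", gpu),
            build_command("main.py", "cfg/bird_attn2.yml", gpu),
        ]
    return [build_command("main.py", "cfg/bird_attn2_bert.yml", gpu)]


def run_plan(steps, project_dir, script_dir):
    """Run each command with the project folder prepended to PYTHONPATH.

    Both directory arguments are assumptions provided by the caller.
    """
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = project_dir + (os.pathsep + existing if existing else "")
    for step in steps:
        # check=True stops the sequence as soon as one command fails
        subprocess.run(step, check=True, cwd=script_dir, env=env)


if __name__ == "__main__":
    # Print the planned commands without executing them.
    for step in training_plan(reproduction=True):
        print(" ".join(step))
```

Building the argument lists separately from running them makes it possible to inspect the plan first; calling `run_plan` then executes it, stopping at the first failing step.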

Reference
