We trained conditional DCGANs to generatively cook recipes for our final project for Stanford's Deep Generative Models class (CS 236)!
Our code is written in Python 3, and we used the Google Cloud Deep Learning VM (PyTorch 0.4.1) with an NVIDIA Tesla K80 GPU, 8 vCPUs, and 30 GB of RAM.
For the purposes of setup, we'll work in a directory called `project` (though this can really be named anything). After setup is complete, your directory structure should look something like this:

```
project
┣━ data
┣━ gan-stronomy
┃  ┣━ code
┃  ┣━ runs
┃  ┣━ scripts
┃  ┗━ temp
┗━ im2recipe-Pytorch
```
- Clone the `gan-stronomy` repo:

  ```
  >> cd [project]
  >> git clone https://github.com/ShinyCode/gan-stronomy
  ```
- Clone the `im2recipe` repo, copy our `gen_embeddings.py` script over, and install the requirements:

  ```
  >> git clone https://github.com/torralba-lab/im2recipe-Pytorch
  >> cp gan-stronomy/code/gen_embeddings.py im2recipe-Pytorch/
  >> pip install -r im2recipe-Pytorch/requirements.txt
  ```

  Since we're using Python 3 and `im2recipe` uses Python 2, we call their code via a subprocess.

- Create the `data` directory. Since it'll need a lot of space, you can optionally use our utility to mount a disk at that location:

  ```
  >> mkdir data
  >> ./gan-stronomy/scripts/mount_disk.sh
  ```
- Create an account at http://im2recipe.csail.mit.edu/dataset/login/ to download the Recipe1M dataset.
- Download the following files to `project/data`. Due to time and memory constraints, we made splits within the original `im2recipe` validation set, but our code could be extended to use all three splits directly as provided.
  - `classes1M.pkl`
  - `det_ingrs.json` ("Ingredient detections")
  - `vocab.bin.gz`. You'll need to extract this to get `vocab.bin`.
  - `val.tar`. You'll need to extract this to get `val_lmdb` and `val_keys.pkl`.
  - `recipe1M_images_val.tar`. You'll need to extract this to a folder named `val_raw`.
  - `model_e500_v-8.950.pth.tar`. This is the `im2recipe` pre-trained model.
- Afterwards, the `data` folder should look like this:

  ```
  data
  ┣━ classes1M.pkl
  ┣━ det_ingrs.json
  ┣━ model_e500_v-8.950.pth.tar
  ┣━ val_lmdb
  ┃  ┣━ data.mdb
  ┃  ┗━ lock.mdb
  ┣━ val_raw
  ┃  ┣━ 0
  ┃  ┣━ ...
  ┃  ┗━ f
  ┣━ val_keys.pkl
  ┗━ vocab.bin
  ```
- Finally, make the directory `project/gan-stronomy/temp`. This is where we'll store the GAN-stronomy dataset.

- To actually generate the GAN-stronomy dataset, run `gen_dataset.py` from the `code` folder:

  ```
  >> cd project/gan-stronomy/code
  >> python3 gen_dataset.py [N] [OUT_PATH]
  ```

  For example, to generate a dataset of size 10,000, we could run:

  ```
  >> python3 gen_dataset.py 10000 ../temp/data10000
  ```

  This will likely give an error the first time around, since the `im2recipe` repo has a few minor bugs you'll need to iron out. If you get an out-of-memory error, you'll likely need to reduce the batch size in their `args.py`.
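Since the `im2recipe` code targets Python 2 while ours is Python 3, `gen_dataset.py` shells out to a separate interpreter rather than importing it directly. A minimal sketch of that pattern (the helper below is a simplified stand-in, not the exact code in our repo):

```python
import subprocess

def run_script(script, *args, interpreter="python2"):
    """Run a script under a separate interpreter and return its stdout.

    This mirrors how our Python 3 code bridges into the Python 2
    im2recipe code; the function itself is illustrative only.
    """
    result = subprocess.run(
        [interpreter, script, *args],
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on nonzero exit
    )
    return result.stdout
```

The `check=True` flag means a crash inside the Python 2 script surfaces immediately as an exception on the Python 3 side, which makes the `im2recipe` bugs mentioned above easier to track down.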
- All the parameters associated with training are located in `project/gan-stronomy/code/opts.py`. The key ones you need to be concerned about are:
  - `LATENT_SIZE`: The dimension of the noise input to the generator.
  - `TVT_SPLIT`: The splits to make within the dataset, specified as either integer counts or fractions. Training is always done on split 0.
  - `DATASET_NAME`: The name of the dataset created by `gen_dataset.py`.
  - `NUM_EPOCHS`: How many epochs to train for.
  - `CONDITIONAL`: Whether to condition on recipe embeddings. If this is set to `False`, a warning will be displayed at the start of training.
  - `RUN_ID`: The ID of the run. Increment this after every run to avoid trampling old runs.
  - `INTV_PRINT_LOSS`: How often (in epochs) to print the loss.
  - `INTV_SAVE_IMG`: How often (in epochs) to save an output image from the training and validation sets.
  - `INTV_SAVE_MODEL`: How often (in epochs) to checkpoint the model.
  - `NUM_UPDATE_D`: How often to update the discriminator with respect to the generator.
- Once `opts.py` is set to your satisfaction, just run the training script directly from the `code` folder:

  ```
  >> cd project/gan-stronomy/code
  >> python3 train.py
  ```

  The results will be saved in `project/gan-stronomy/runs/[RUN_ID]`, and copies of `opts.py`, `model.py`, and `train.py` will be saved alongside them so you remember what the run was for!

- The script will courteously ring the system bell to alert the human when training is complete.
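A note on `TVT_SPLIT`: since both integers and fractions are accepted, a spec has to be resolved into concrete split sizes. The sketch below shows one way to do that; it illustrates the idea rather than reproducing the exact logic in `opts.py` and the data-loading code:

```python
def resolve_split(n, tvt_split):
    """Turn a TVT_SPLIT-style spec into concrete split sizes.

    Entries less than 1 are read as fractions of the dataset size n;
    anything else is read as an absolute count. Illustrative only --
    see opts.py for the real handling.
    """
    return [int(s * n) if s < 1 else int(s) for s in tvt_split]
```

For a dataset of 100 examples, `[0.8, 0.1, 0.1]` and `[80, 10, 10]` both resolve to the same train/val/test sizes, with training always on split 0.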
In `project/gan-stronomy/code`, we've written an assortment of scripts to probe the performance and behavior of a trained model. In all cases, `[SPLIT_INDEX]` refers to the data split to use (train is 0, val is 1, test is 2).
- `test.py`: Runs the generator on a specified split of data, and outputs the generated images, corresponding ground-truth images, and associated ingredient lists.

  ```
  >> python3 test.py [MODEL_PATH] [DATA_PATH] [SPLIT_INDEX] [OUT_PATH]
  ```

- `sample.py`: Runs the generator on a fixed embedding but different samples of noise, and outputs the generated images.

  ```
  >> python3 sample.py [MODEL_PATH] [DATA_PATH] [SPLIT_INDEX] [OUT_PATH] [RECIPE_ID] [NUM_SAMPLES]
  ```

- `interp.py`: Samples two noise vectors and runs the generator on a fixed embedding with noise inputs interpolated between them. Outputs the generated images.

  ```
  >> python3 interp.py [MODEL_PATH] [DATA_PATH] [SPLIT_INDEX] [OUT_PATH] [RECIPE_ID] [NUM_DIV]
  ```

- `score.py`: For a given model, computes and outputs the scores described in the report. Splits 1 and 2 must be nonempty!

  ```
  >> python3 score.py [MODEL_PATH] [DATA_PATH]
  ```
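The interpolation that `interp.py` performs in latent space amounts to evaluating the generator at `[NUM_DIV]` evenly spaced points between two sampled noise vectors. A minimal sketch (with the generator itself left out):

```python
import numpy as np

def interpolate_noise(z0, z1, num_div):
    """Return num_div latent vectors evenly spaced from z0 to z1, inclusive.

    z0 and z1 stand in for two sampled noise inputs to the generator;
    feeding each returned vector through the generator (with a fixed
    recipe embedding) produces the interpolation sequence.
    """
    alphas = np.linspace(0.0, 1.0, num_div)
    return [(1.0 - a) * z0 + a * z1 for a in alphas]
```

With the embedding held fixed, decoding these vectors in order is what produces the smooth blending between dishes shown below.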
## Generative Cooking

Examples of generative cooking after training on 49,800 examples from Recipe1M for 90 epochs.

## Ablative Study of MSE Loss

Using an MSE loss term in the generator had detrimental effects on output image quality, likely due to the one-to-many nature of the embedding-image mapping.

## Latent Space Interpolation

Interpolating between generator noise inputs for a fixed embedding led to smooth blending between dishes.

## Effect of Conditioning

Conditioning on recipe embeddings had a noticeable impact on the consistency of the generator outputs.
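For reference, one common way a conditional DCGAN injects its condition is to concatenate the embedding with the noise vector at the generator's input. The sketch below shows that generic pattern; the dimensions are illustrative, and the exact wiring in our model (toggled by `CONDITIONAL`) lives in `model.py`:

```python
import numpy as np

LATENT_SIZE = 100   # noise dimension, as in the LATENT_SIZE option
EMBED_SIZE = 1024   # illustrative recipe-embedding dimension

def generator_input(z, embedding, conditional=True):
    """Build the generator's input vector.

    With conditioning on, the recipe embedding is concatenated with the
    noise vector; with it off, the generator sees noise alone, so there
    is nothing tying the output to a particular recipe.
    """
    if conditional:
        return np.concatenate([z, embedding])
    return z
```

This also makes the consistency result intuitive: with `conditional=False`, two runs on the same recipe share no input besides independently sampled noise.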
The CodeCogs LaTeX Editor was used to typeset the equations in this document.