
✏️ Sequential Sketch Stroke Generation

KAIST CS492(D): Diffusion Models and Their Applications (Fall 2024)
Course Project

Instructor: Minhyuk Sung (mhsung [at] kaist.ac.kr)
TA: Yuseung Lee (phillip0701 [at] kaist.ac.kr)

A collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. Drawings were captured as timestamped vectors. Source: Quick, Draw! Dataset.

Updates

  • [24.11.06] Updated test data indices to remove overlaps with training indices.

Description

In this project, your goal is to implement a conditional diffusion model that generates sequential strokes to form a sketch. You will utilize user-captured sketch strokes, complete with timestamps, from 345 different object categories provided by the Quick, Draw! dataset. Rather than generating the entire sketch at once, the focus should be on leveraging the sequential (or part-aware) stroke information for training the model, encouraging a stroke-by-stroke generation process that reflects how users naturally draw.

Tasks

Your task is to implement a conditional diffusion model that generates sequential strokes to form a sketch. While the dataset contains 345 categories, we select three (cat, garden, helicopter) and train a separate diffusion model on each, so you will have three diffusion models in total.
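One possible (not prescribed) design is a DDPM-style denoiser that predicts the noise added to the next stroke, conditioned on an encoding of the strokes drawn so far and on the diffusion timestep. Below is a minimal PyTorch sketch of one training step; the fixed stroke length, the GRU context encoder, and all hyperparameters are illustrative assumptions, not part of the project specification.

import torch
import torch.nn as nn

# Assumed toy representation: each stroke is P (x, y) points, flattened to 2P dims.
P, D = 32, 64  # points per stroke, context width (both assumptions)

class StrokeDenoiser(nn.Module):
    """Predicts the noise added to one stroke, conditioned on prior strokes."""
    def __init__(self):
        super().__init__()
        self.context = nn.GRU(2 * P, D, batch_first=True)  # encodes earlier strokes
        self.net = nn.Sequential(
            nn.Linear(2 * P + D + 1, 256), nn.SiLU(),
            nn.Linear(256, 2 * P),
        )

    def forward(self, x_t, t, prev):       # x_t: (B, 2P), t: (B,), prev: (B, S, 2P)
        _, h = self.context(prev)          # final hidden state summarizes prior strokes
        inp = torch.cat([x_t, h[-1], t[:, None]], dim=-1)
        return self.net(inp)

# One DDPM training step on dummy tensors; real batches come from the dataset.
T = 1000
alpha_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.02, T), dim=0)

model = StrokeDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=2e-4)

x0 = torch.randn(8, 2 * P)                 # next-stroke targets
prev = torch.randn(8, 5, 2 * P)            # five already-drawn strokes as context
t = torch.randint(0, T, (8,))
noise = torch.randn_like(x0)
x_t = alpha_bar[t].sqrt()[:, None] * x0 + (1 - alpha_bar[t]).sqrt()[:, None] * noise

loss = ((model(x_t, t.float() / T, prev) - noise) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()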

Along with a detailed report, include the quantitative evaluations (FID, KID) as described in the Evaluation section below.

Data Specification

(1) Download Quick, Draw! Dataset

The dataset consists of sketches from 345 different categories, and each sketch is drawn with a varying number of strokes.

Use the following bash script to download the Quick, Draw! dataset:

sh download_quickdraw.sh

For extracting the .ndjson file for each category and visualizing the source data, including example stroke sequences, refer to our sample code in load_data.ipynb.
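If you prefer to inspect the raw data outside the notebook, each line of a raw .ndjson file is one JSON object whose drawing field holds a list of strokes, each stored as parallel x, y, and timestamp arrays. A minimal sketch (the data/cat.ndjson path is an assumption; adjust it to your layout):

import json

# Read the first sketch from a raw Quick, Draw! .ndjson file.
with open("data/cat.ndjson") as f:  # hypothetical path
    sketch = json.loads(f.readline())

# Each stroke is [x_coords, y_coords, timestamps] in the raw format.
for i, (xs, ys, ts) in enumerate(sketch["drawing"]):
    print(f"stroke {i}: {len(xs)} points, t = {ts[0]}..{ts[-1]} ms")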

(2) Filter Train/Test Subsets for Target Categories

We will use three categories, cat, garden, and helicopter, for training and evaluation. After you've downloaded the Quick, Draw! dataset into the ./data directory, simply run the command below to obtain the training and testing data for the three categories:

sh filter_data_all.sh

The indices of the sketches belonging to the training and test subsets will then be stored in the ./sketch/$CATEGORY/train_test_indices.json file; these indices are already provided. For visualization, images of each sketch will be stored in the images_train and images_test directories, respectively.
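To sanity-check the split programmatically, you can load the index file directly. The "train"/"test" key names below are an assumption about the JSON layout, so check the actual file:

import json

# Inspect the provided train/test split for one category.
with open("sketch/cat/train_test_indices.json") as f:  # key names are assumptions
    splits = json.load(f)
print(len(splits["train"]), "train /", len(splits["test"]), "test sketches")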

Evaluation

While detailed explanations and qualitative results are essential, you must also provide quantitative evaluations of your model. For evaluation, the test images stored in images_test for each category must be used.

Compute the FID (Fréchet Inception Distance) and KID (Kernel Inception Distance) between the test set and your generated sketches. To do this, first install the FID library by running:

pip install clean-fid

Then, compute the FID and KID score using the following command:

python run_eval.py --fdir1 $TEST_DATA_DIR --fdir2 $YOUR_DATA_DIR

Here, $YOUR_DATA_DIR refers to the directory containing the generated sketches from your model, saved as .png or .jpg files. Since you train a diffusion model separately on each of the three categories, you must report three pairs of FID and KID scores.
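If you want to sanity-check the scores in your own code, the clean-fid library also exposes them directly (run_eval.py presumably wraps calls like these; the directory paths below are placeholders):

from cleanfid import fid

# Compare generated sketches against the test images for one category.
fid_score = fid.compute_fid("sketch/cat/images_test", "outputs/cat")  # placeholder dirs
kid_score = fid.compute_kid("sketch/cat/images_test", "outputs/cat")
print(f"FID: {fid_score:.2f}  KID: {kid_score:.4f}")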

Acknowledgement

We thank Google Creative Lab for releasing the Quick, Draw! dataset to the public. The contents adhere to the Creative Commons license.
