This work aims to make generative audio models such as AudioCraft output watermarked audio by poisoning their training datasets, which can help protect the copyright of musicians who publish their works online.
There are three main components:
- The generative AI: MusicGen from AudioCraft
- The watermark generator/detector: audioseal and wavmark
- The dataset: MusicCaps, released with MusicLM
The main idea is to fine-tune the generative model on a watermarked dataset, and then fine-tune the detector on the output of that fine-tuned model. This way, we can tell whether a piece of audio was generated by a model trained on watermarked works. The pipeline therefore has two fine-tuning stages.
The overall workflow is shown in the figure below.
This work is built on AudioCraft (specifically MusicGen), so please install its dependencies first.
To install the watermark generator/detector:

```
pip install audioseal
pip install wavmark
```
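To sanity-check the installation, you can embed and detect a watermark on a dummy clip. This is a minimal sketch following the AudioSeal README (the model card names and the `(batch, channels, samples)` input layout are taken from there); it is not part of this repo's pipeline.

```python
import torch
from audioseal import AudioSeal

# load the watermark generator and detector (16-bit message variants)
generator = AudioSeal.load_generator("audioseal_wm_16bits")
detector = AudioSeal.load_detector("audioseal_detector_16bits")

# dummy mono clip: shape (batch, channels, samples), 1 second at 16 kHz
wav = torch.randn(1, 1, 16000)
sr = 16000

# embed the watermark, then check that the detector finds it
watermark = generator.get_watermark(wav, sr)
watermarked = wav + watermark
prob, message = detector.detect_watermark(watermarked, sr)
print(f"watermark probability: {float(prob):.3f}")  # should be close to 1.0
```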
- Download the dataset
We use MusicCaps, released with MusicLM. However, it only contains metadata for the audio, not the audio itself. To get the actual audio, run:
```
cd your_path_of_the_project
python dataset/downloader.py  # download the dataset into ./dataset/musiccaps
```
Notes:
- The downloader code is from download-musiccaps-dataset, and I have fixed a bug in my code according to this issue.
- Downloading may take a while, and some clips may fail to download due to network issues.
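If you only want to inspect the metadata (YouTube ID, clip boundaries, caption) without downloading audio, it is also published on the Hugging Face Hub. The dataset id and field names below are assumptions based on the public MusicCaps card, not part of this repo:

```python
from datasets import load_dataset

# MusicCaps metadata only (no audio); dataset id assumed to be "google/MusicCaps"
meta = load_dataset("google/MusicCaps", split="train")

example = meta[0]
print(example["ytid"], example["start_s"], example["end_s"])  # YouTube clip boundaries
print(example["caption"])                                     # text description
```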
- Dataset preprocessing
Note: all audio we use is mono.

The dataset preparation follows the guide. In brief, you need to organize your dataset in the following directory structure:
```
- root_of_this_project
  - config/dset/audio
    - your_dataset.yaml        # the config of your dataset
  - dataset
    - your_dataset/            # the real data; each audio has one .json and one .wav
  - egs
    - your_dataset/data.jsonl  # metadata of your dataset
```
You can simply run the following commands to prepare the downloaded musiccaps dataset:
```
# extract mono audio from musiccaps into ./dataset/musiccaps_mono_10s
python dataset/data_processor.py --action get_mono

# build dataset without watermark (named 'musiccaps_mono_10s_nonwm')
python dataset/data_processor.py --action build_mono --model none

# build dataset with watermark using audioseal model (named 'musiccaps_mono_10s_audioseal')
python dataset/data_processor.py --action build_mono --model audioseal

# optional: build dataset with watermark using wavmark model (named 'musiccaps_mono_10s_wavmark')
python dataset/data_processor.py --action build_mono --model wavmark
```
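For reference, each line of `egs/<dataset>/data.jsonl` is one JSON object describing an audio file. The sketch below shows roughly what such a manifest looks like, with field names taken from AudioCraft's `AudioMeta`; `data_processor.py` already generates it for the musiccaps datasets, so you normally do not need this.

```python
import json
from pathlib import Path

import soundfile as sf

# hand-rolled sketch of the egs/<dataset>/data.jsonl manifest expected by AudioCraft
root = Path("dataset/your_dataset")
with open("egs/your_dataset/data.jsonl", "w") as f:
    for wav_path in sorted(root.glob("*.wav")):
        info = sf.info(str(wav_path))
        meta = {
            "path": str(wav_path),
            "duration": info.duration,
            "sample_rate": info.samplerate,
            "amplitude": None,
            "weight": None,
            "info_path": None,
        }
        f.write(json.dumps(meta) + "\n")
```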
Following MusicGen's fine-tuning guide, run:
```
dora run solver=musicgen/musicgen_base_32khz model/lm/model_scale=small continue_from=//pretrained/facebook/musicgen-small conditioner=text2music dset=audio/musiccaps_mono_10s_nonwm
```
where `dset` is the dataset prepared above (here I use the dataset without the watermark).

To change the training configuration, see `config/solver/musicgen/musicgen_base_32khz.yaml`, which corresponds to the `solver` argument in the `dora run ...` command.
- Export checkpoints
You can find the SIG (printed after "instantiating solver for XP") at the beginning of the training log; then export the best checkpoint by running:
```
python export_ft_models.py --sig your_SIG --name output_dir_name
```
The checkpoint will then be exported to `checkpoints/{output_dir_name}_{SIG}`.
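For reference, `export_ft_models.py` likely wraps AudioCraft's export utilities. The sketch below follows the MusicGen training docs and may differ from the actual script; the SIG and output paths are placeholders.

```python
from audiocraft import train
from audiocraft.utils import export

# resolve the experiment folder from its SIG and export the language-model weights
xp = train.main.get_xp_from_sig("your_SIG")
export.export_lm(xp.folder / "checkpoint.th",
                 "checkpoints/output_dir_name_your_SIG/state_dict.bin")

# bundle the pretrained 32 kHz EnCodec used by musicgen-small
export.export_pretrained_compression_model(
    "facebook/encodec_32khz",
    "checkpoints/output_dir_name_your_SIG/compression_state_dict.bin",
)
```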
- Test

Use `export_import_test.ipynb` to display the output audio of your fine-tuned model.
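A minimal standalone equivalent of what the notebook does (loading the exported checkpoint and generating from a text prompt) could look like the following; the checkpoint path and prompt are placeholders.

```python
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# load the exported fine-tuned checkpoint (placeholder path)
model = MusicGen.get_pretrained("checkpoints/output_dir_name_your_SIG")
model.set_generation_params(duration=10)

# generate a clip from a text prompt and write it to ft_sample.wav
wav = model.generate(["a calm piano melody with soft strings"])
audio_write("ft_sample", wav[0].cpu(), model.sample_rate, strategy="loudness")
```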
To prepare the stage 2 dataset, run:

```
python dataset/stage2_data_prepare.py --pos_ckpt checkpoints/your_ckpt_path1 --neg_ckpt checkpoints/your_ckpt_path2 --set all
```
- `--pos_ckpt`: checkpoint fine-tuned on the watermarked dataset
- `--neg_ckpt`: checkpoint fine-tuned on the dataset without the watermark
- `--set`: generate the training set (`train`), the testing set (`test`), or both (`all`)
The `train_prompts` and `test_prompts` variables in `stage2_data_prepare.py` are the prompts used to generate audio; you can customize them.
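For example (the values below are purely illustrative, not the ones shipped in the script):

```python
# in stage2_data_prepare.py -- illustrative prompts only, replace with your own
train_prompts = [
    "an upbeat electronic track with a driving bassline",
    "a calm acoustic guitar melody with soft percussion",
    "a cinematic orchestral piece with sweeping strings",
]
test_prompts = [
    "a jazzy piano tune with brushed drums",
]
```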
The stage 2 dataset will then be generated in the `dataset2/` directory.
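Before fine-tuning the detector, you can check how well the pretrained AudioSeal detector already separates the generated clips. The sketch below assumes the clips end up as `.wav` files somewhere under `dataset2/`; adjust the glob to the actual layout produced by `stage2_data_prepare.py`.

```python
from pathlib import Path

import torchaudio
import torchaudio.functional as F
from audioseal import AudioSeal

# pretrained AudioSeal detector; it expects 16 kHz mono input
detector = AudioSeal.load_detector("audioseal_detector_16bits")

for wav_path in sorted(Path("dataset2").rglob("*.wav")):
    wav, sr = torchaudio.load(str(wav_path))
    wav = wav.mean(dim=0, keepdim=True)      # force mono
    wav = F.resample(wav, sr, 16000)         # resample to 16 kHz
    prob, _ = detector.detect_watermark(wav.unsqueeze(0), 16000)
    print(f"{wav_path}: watermark probability {float(prob):.3f}")
```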
To fine-tune the AudioSeal detector, please see `fine_tune_audioseal_detector`. The official training instructions may also be helpful.