The code is heavily based on the Prompt-to-Prompt and Null-Text Inversion codebase, the MasaCtrl codebase, and the DDPM-Inversion codebase.
Input images can be downloaded through this link. Most images are from the StyleDiffusion, Null-Text Inversion, Imagic, MasaCtrl, and SVDiff papers.
To run Negative-Prompt Inversion (NPI):
from negative_prompt_inversion import main
main(
    image_path="./images/cat_mirror.png",
    prompt_src="a cat sitting next to a mirror",  # prompt for the input image
    prompt_tar="a silver cat sculpture sitting next to a mirror",  # prompt for the edit
    output_dir="./outputs/npi_real",
    suffix="silver",
    guidance_scale=7.5,
    cross_replace_steps=0.8,
    self_replace_steps=0.6,
    blend_word=(("cat",), ("cat",)),
    eq_params={"words": ("silver", "sculpture"), "values": (2, 2)},
    proximal=None,  # no proximal guidance: plain NPI
)
NPI reconstructed (left) | NPI edited (right):
To run ProxNPI:
from negative_prompt_inversion import main
main(
    image_path="./images/cat_mirror.png",
    prompt_src="a cat sitting next to a mirror",  # prompt for the input image
    prompt_tar="a silver cat sculpture sitting next to a mirror",  # prompt for the edit
    output_dir="./outputs/npi_real",
    suffix="silver",
    guidance_scale=7.5,
    cross_replace_steps=0.8,
    self_replace_steps=0.6,
    blend_word=(("cat",), ("cat",)),
    eq_params={"words": ("silver", "sculpture"), "values": (2, 2)},
    # The remaining arguments are what distinguish ProxNPI from plain NPI.
    proximal="l0",
    quantile=0.7,
    use_inversion_guidance=True,
    recon_lr=1,
    recon_t=400,
)
Reconstructed (left) | Edited (right):
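One way to picture the `proximal` and `quantile` arguments in the ProxNPI call above is as hard thresholding of a guidance term: entries whose magnitude falls below a chosen quantile are zeroed, so only the strongest components contribute to the edit. The sketch below is a plain-Python illustration under that assumption, not the repository's implementation; the function names `quantile` and `hard_threshold` and the toy `delta` values are hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def hard_threshold(delta, q):
    """Zero every entry whose magnitude is below the q-quantile of |delta|."""
    threshold = quantile([abs(d) for d in delta], q)
    return [d if abs(d) >= threshold else 0.0 for d in delta]


# Toy stand-in (assumption) for the difference between the text-conditioned
# and unconditioned noise predictions at one denoising step.
delta = [0.05, -0.9, 0.1, 0.4, -0.02, 1.2, -0.3, 0.0, 0.6, -0.15]
sparse = hard_threshold(delta, q=0.7)
print(sparse)
```

With `q=0.7`, roughly the largest 30% of entries by magnitude survive. Under this reading, a higher quantile gives a sparser guidance term and a more conservative edit.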
To run MasaCtrl:
from prox_masactrl import main
main(
    out_dir="./outputs/masactrl_real/",
    source_image_path="./images/cake2.png",
    source_prompt="a round cake",
    target_prompt="a square cake",
    npi=False,
    npi_interp=0,
)
Input image (left) | Reconstructed (middle) | Edited (right):
To run ProxMasaCtrl:
from prox_masactrl import main
main(
    out_dir="./outputs/masactrl_real/",
    source_image_path="./images/cake2.png",
    source_prompt="a round cake",
    target_prompt="a square cake",
    npi=True,
    npi_interp=1,
    prox="l0",
    quantile=0.6,
)
Input image (left) | Reconstructed (middle) | Edited (right):
Please see run_npi.py, run_masa.py, and run_ddpm.py for examples.
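The calls above come in baseline and proximal pairs that share most of their arguments. As a rough aid for comparing them, the sketch below lists only the arguments that change between each pair. The helper `changed_arguments` and the `NOT_PASSED` marker are hypothetical and not part of the repository; the argument values are copied from the calls above, with the arguments common to both calls in a pair omitted for brevity.

```python
NOT_PASSED = "<not passed>"  # marker for an argument absent from one call


def changed_arguments(baseline, variant):
    """Return {name: (baseline_value, variant_value)} for arguments that differ."""
    names = list(baseline) + [k for k in variant if k not in baseline]
    return {
        name: (baseline.get(name, NOT_PASSED), variant.get(name, NOT_PASSED))
        for name in names
        if baseline.get(name, NOT_PASSED) != variant.get(name, NOT_PASSED)
    }


# Keyword arguments taken from the NPI and ProxNPI calls.
npi = {"guidance_scale": 7.5, "cross_replace_steps": 0.8,
       "self_replace_steps": 0.6, "proximal": None}
prox_npi = {**npi, "proximal": "l0", "quantile": 0.7,
            "use_inversion_guidance": True, "recon_lr": 1, "recon_t": 400}

# Keyword arguments taken from the MasaCtrl and ProxMasaCtrl calls.
masactrl = {"npi": False, "npi_interp": 0}
prox_masactrl = {"npi": True, "npi_interp": 1, "prox": "l0", "quantile": 0.6}

for label, base, var in [("NPI -> ProxNPI", npi, prox_npi),
                         ("MasaCtrl -> ProxMasaCtrl", masactrl, prox_masactrl)]:
    print(label)
    for name, (old, new) in changed_arguments(base, var).items():
        print(f"  {name}: {old!r} -> {new!r}")
```

Note that the two entry points name the proximal option differently (`proximal` in `negative_prompt_inversion.main`, `prox` in `prox_masactrl.main`), which is easy to miss when switching between them.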
@inproceedings{han2024proxedit,
title={ProxEdit: Improving Tuning-Free Real Image Editing With Proximal Guidance},
author={Han, Ligong and Wen, Song and Chen, Qi and Zhang, Zhixing and Song, Kunpeng and Ren, Mengwei and Gao, Ruijiang and Stathopoulos, Anastasis and He, Xiaoxiao and Chen, Yuxiao and others},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={4291--4301},
year={2024}
}
@article{miyake2023negative,
title={Negative-prompt Inversion: Fast Image Inversion for Editing with Text-guided Diffusion Models},
author={Miyake, Daiki and Iohara, Akihiro and Saito, Yu and Tanaka, Toshiyuki},
journal={arXiv preprint arXiv:2305.16807},
year={2023}
}
@inproceedings{mokady2023null,
title={Null-text inversion for editing real images using guided diffusion models},
author={Mokady, Ron and Hertz, Amir and Aberman, Kfir and Pritch, Yael and Cohen-Or, Daniel},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={6038--6047},
year={2023}
}
@article{hertz2022prompt,
title={Prompt-to-prompt image editing with cross attention control},
author={Hertz, Amir and Mokady, Ron and Tenenbaum, Jay and Aberman, Kfir and Pritch, Yael and Cohen-Or, Daniel},
journal={arXiv preprint arXiv:2208.01626},
year={2022}
}
@article{cao2023masactrl,
title={MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing},
author={Cao, Mingdeng and Wang, Xintao and Qi, Zhongang and Shan, Ying and Qie, Xiaohu and Zheng, Yinqiang},
journal={arXiv preprint arXiv:2304.08465},
year={2023}
}
@article{song2020denoising,
title={Denoising diffusion implicit models},
author={Song, Jiaming and Meng, Chenlin and Ermon, Stefano},
journal={arXiv preprint arXiv:2010.02502},
year={2020}
}
@article{HubermanSpiegelglas2023,
title={An Edit Friendly DDPM Noise Space: Inversion and Manipulations},
author={Huberman-Spiegelglas, Inbar and Kulikov, Vladimir and Michaeli, Tomer},
journal={arXiv preprint arXiv:2304.06140},
year={2023}
}