IP-IQA

[ICME2024, Official Code] for the paper "Bringing Textual Prompt to AI-Generated Image Quality Assessment".

View the Poster


💡 I also have other projects that may interest you ✨.

TriVQA
CVPRW2024, the 3rd-place winner of the NTIRE 2024 Quality Assessment for AI-Generated Content - Track 2 Video.

MPP-Qwen-Next
My personal project on training an 8B/14B MLLM on a 24GB RTX 3090/4090 with DeepSpeed pipeline parallelism. Supports {image/video/multi-image} input.

Installation

You can set up the virtual environment with conda in just three commands:

conda create -n ipiqa python=3.9
conda activate ipiqa
pip install -e .
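
A quick sanity check after installation (a minimal sketch; it assumes the editable install exposes the package as an `ipiqa` module, which may differ in your setup):

```python
# quick_check.py -- verify the editable install and PyTorch are usable
# (assumes `pip install -e .` makes the package importable as `ipiqa`)
import torch
import ipiqa  # should import without error after installation

print("ipiqa imported from:", ipiqa.__file__)
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```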

Weights & Data

CLIP ResNet50 weights

Download Link: RN50.pt

After downloading, place it at cache/ckpt/clip/openai/resnet/RN50.pt, or modify base_ckpt in the YAML config (e.g., ipiqa.yaml) to point to your own path.
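
If you want to confirm the checkpoint is intact before training, here is a minimal sketch (assuming the file is the standard OpenAI TorchScript archive and sits at the default path above):

```python
# check_clip_ckpt.py -- sanity-check the downloaded CLIP RN50 weights
# (assumes the standard OpenAI TorchScript archive at the default path)
from pathlib import Path
import torch

ckpt = Path("cache/ckpt/clip/openai/resnet/RN50.pt")
assert ckpt.exists(), f"checkpoint not found at {ckpt}"

# OpenAI distributes CLIP weights as a TorchScript archive, so torch.jit.load works
model = torch.jit.load(str(ckpt), map_location="cpu")
print("loaded:", type(model).__name__, "| size:", ckpt.stat().st_size // 2**20, "MB")
```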

AGIQA-1k Database

Please get the data by referring to its Official Repo.

After that, set the dataset path and images root in the YAML file to your local paths.

AGIQA-3k Database

Please get the data by referring to its Official Repo.

After that, set the dataset path and images root in the YAML file to your local paths.

Additionally, you need to prepare a mos_joint.xlsx file and place it in the AGIQA-3k data directory as shown below (see the sketch after the tree for one way to assemble it).

Data Organization for Reference

├── cache
│   ├── data
│   │   ├── aigc_qa_3k                  # AGIQA-3k
│   │   │   ├── AGIQA-3k                # the vis_root
│   │   │   │   ├── xxx.jpg
│   │   │   ├── mos_joint.xlsx
│   │   │   ├── data.csv
│   │   ├── aigc_QA_data1               # AGIQA-1k
│   │   │   ├── AGIQA-1k-Database-main  # git clone their repo
│   │   │   ├── images                  # the vis_root
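
Since the exact column layout of mos_joint.xlsx is not reproduced here, the sketch below only illustrates one plausible way to assemble such a file from AGIQA-3k's data.csv with pandas. The column names (name, mos_quality, mos_align) are assumptions; adapt them to whatever this repo's dataloader actually expects.

```python
# make_mos_joint.py -- hypothetical helper for assembling mos_joint.xlsx
# NOTE: the real column layout is defined by this repo's dataloader; the names
# used here (name, mos_quality, mos_align) are assumptions for illustration only.
import pandas as pd

csv_path = "cache/data/aigc_qa_3k/data.csv"        # AGIQA-3k annotation file
out_path = "cache/data/aigc_qa_3k/mos_joint.xlsx"

df = pd.read_csv(csv_path)
joint = df[["name", "mos_quality", "mos_align"]]   # keep image name + both MOS scores
joint.to_excel(out_path, index=False)              # writing .xlsx requires `openpyxl`
print(f"wrote {len(joint)} rows to {out_path}")
```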

Train & K-folds Evaluation

AGIQA-1k

run:

python train_agiqa1k.py --cfg-path ipiqa/projects/agiqa1k/ipiqa.yaml --num_cv 10

DDP:

python -m torch.distributed.run --nproc_per_node 2 train_agiqa1k.py --cfg-path ipiqa/projects/agiqa1k/ipiqa.yaml --num_cv 10

AGIQA-3k

run:

python train_agiqa3k.py --cfg-path ipiqa/projects/agiqa3k/ipiqa.yaml --num_cv 10

DDP:

python -m torch.distributed.run --nproc_per_node 2 train_agiqa3k.py --cfg-path ipiqa/projects/agiqa3k/ipiqa.yaml --num_cv 10
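
For reference, `--num_cv 10` presumably selects 10-fold cross-validation. The sketch below only illustrates the usual way such splits are generated (here with scikit-learn's KFold), not this repo's exact split logic:

```python
# kfold_sketch.py -- illustrative only; not this repo's actual split implementation
from sklearn.model_selection import KFold
import numpy as np

num_images = 2982  # e.g., roughly the size of AGIQA-3k; adjust for your dataset
kf = KFold(n_splits=10, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(np.arange(num_images))):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
```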

Acknowledgement

  • MPP-Qwen: My personal MLLM project; the trainer and prototype of this repo are based on it.
  • LAVIS: An excellent repo for multimodal learning; we refer to its trainer implementation.
  • AGIQA-1k and AGIQA-3k: Thanks for their databases!
  • OpenAI-CLIP: We use their pretrained weights.

Citation

@misc{qu2024bringingtextualpromptaigenerated,
      title={Bringing Textual Prompt to AI-Generated Image Quality Assessment}, 
      author={Bowen Qu and Haohui Li and Wei Gao},
      year={2024},
      eprint={2403.18714},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2403.18714}, 
}