mIoU of PSPNet101 on PASCAL VOC 2012 #59

Open
dailingjun opened this issue Aug 24, 2020 · 2 comments

Comments

@dailingjun

In this code, it's 0.7907 (ss) / 0.7963 (ss), while it's 0.826 (ss) in your paper. What explains the difference?

@dailingjun

I found some information in FAQ.md: https://github.com/hszhao/semseg/blob/master/FAQ.md

Q: Performance difference with original papers?
A: Lots of details, some are listed as:

1. Pre-trained models: the weights used differ between this PyTorch codebase and the former PSP/ANet Caffe version.
2. Pre-processing of images: this PyTorch codebase follows the official PyTorch image pre-processing style (normalize to 0~1, then subtract the mean [0.485, 0.456, 0.406] and divide by the std [0.229, 0.224, 0.225]), while the former Caffe version normalizes simply by subtracting the image mean [123.68, 116.779, 103.939].
3. Training steps: we measure training in steps in the Caffe version and in epochs in PyTorch. The number of optimization steps after conversion is slightly different (e.g., on ade20k, 150k steps with batch size 16 equals 150k*16/20210=119 epochs).
4. SGD optimization difference: see the note in the SGD implementation; this difference may influence poly-style learning rate decay, especially in the last steps, where learning rates are very small.
5. Weight decay on biases and on the scale and shift of BN in the two training settings; see technical reports 1, 2.
6. Label guidance: the former Caffe version mainly uses 1/8-scale label guidance (the former interp layer in Caffe has only a CPU implementation, so we avoided larger label guidance), while the released segmentation models in this repository mainly use full-scale label guidance (the final logits are interpolated to the original crop size for loss calculation, instead of using the 1/8 feature downsampling size).
7. The performance variance of attention-based models (e.g., PSANet) is relatively high; this can also be observed in CCNet. Besides, some low-frequency classes (e.g., 'bus' in cityscapes) may also affect the performance a lot.
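
To make items 2 and 3 concrete for myself, here is a rough sketch in plain Python. The function names are my own and hypothetical, not taken from the semseg codebase, and I am assuming the Caffe mean is given in the same channel order as the PyTorch one. It only illustrates the arithmetic described in the FAQ, not the actual data pipeline.

```python
# Hypothetical helpers illustrating FAQ items 2 and 3; not code from the repository.

PYTORCH_MEAN = [0.485, 0.456, 0.406]
PYTORCH_STD = [0.229, 0.224, 0.225]
CAFFE_MEAN = [123.68, 116.779, 103.939]  # assumed to share the channel order above


def normalize_pytorch_style(pixel):
    """Scale 0..255 values to 0..1, subtract the mean, then divide by the std."""
    return [(v / 255.0 - m) / s for v, m, s in zip(pixel, PYTORCH_MEAN, PYTORCH_STD)]


def normalize_caffe_style(pixel):
    """Subtract the per-channel mean only; values stay on the 0..255 scale."""
    return [v - m for v, m in zip(pixel, CAFFE_MEAN)]


def steps_to_epochs(steps, batch_size, num_train_images):
    """Convert an iteration count into the equivalent number of epochs."""
    return steps * batch_size / num_train_images


pixel = [128, 128, 128]  # one mid-gray pixel, three channels
print(normalize_pytorch_style(pixel))
print(normalize_caffe_style(pixel))

# FAQ example: 150k steps at batch size 16 on ade20k (20210 training images)
print(round(steps_to_epochs(150_000, 16, 20210)))  # 119
```

The two normalizations put the network input on very different numeric scales, so weights pre-trained under one convention presumably cannot be reused under the other without a mismatch.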

Can you tell us which one is the most significant?

@bea-CC

bea-CC commented Nov 18, 2021

Do you understand now? I have the same question.
