Implemented as in the article? #6

Open
danFromTelAviv opened this issue Feb 5, 2019 · 1 comment

Comments

@danFromTelAviv

danFromTelAviv commented Feb 5, 2019

First of all, thank you for implementing v2 of this paper and for maintaining the repository.
** Warning: I am mainly a Keras/TF user **
If I am reading this correctly, x_offset is the original latent space warped (non-rigidly) by the offsets found by p_conv. So x_offset has shape [batch_size x height x width x features]. The warp happens only in the height and width dimensions (naturally). You then apply a regular convolution on top of that.

From reading the paper, I think the authors intended the offsets to be unique for each filter pixel. That is, the procedure should be:

  1. find the offsets
  2. fetch the feature space per filter pixel (should be [batch_size x height x width x features x filter size])
  3. multiply each feature by the relevant weight
    This way the sampled locations for two nearby pixels in the latent space can overlap if they need to.
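The three steps above can be sketched in NumPy as follows. This is only a minimal illustration, not the repository's code: the function names `sample_per_tap` and `apply_weights` are hypothetical, the layout is channels-last as in the shapes quoted above, and nearest-neighbour sampling with clamping at the border is an assumption made to keep the sketch short (the paper uses bilinear interpolation).

```python
import numpy as np

def sample_per_tap(x, offsets, k=3):
    """Step 2: fetch one feature vector per output pixel AND per kernel tap.

    x:       [b, h, w, c] feature map (channels-last).
    offsets: [b, h, w, k*k, 2] learned (dy, dx) offset for every kernel tap.
    Returns  [b, h, w, c, k*k].
    Assumption: nearest-neighbour sampling, positions clamped to the padded map.
    """
    b, h, w, c = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)))  # zero padding

    # Regular grid of a k x k kernel, e.g. (-1,-1) ... (1,1) for k=3.
    dy, dx = np.meshgrid(np.arange(-pad, pad + 1), np.arange(-pad, pad + 1), indexing="ij")
    taps = np.stack([dy.ravel(), dx.ravel()], axis=-1)        # [k*k, 2]

    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    base = np.stack([ys, xs], axis=-1)                        # [h, w, 2]

    # Sampling position = output pixel + regular tap + learned offset (+ pad shift).
    pos = base[None, :, :, None, :] + taps + offsets + pad    # [b, h, w, k*k, 2]
    pos = np.rint(pos).astype(int)
    py = np.clip(pos[..., 0], 0, h + 2 * pad - 1)
    px = np.clip(pos[..., 1], 0, w + 2 * pad - 1)

    bi = np.arange(b)[:, None, None, None]
    gathered = xp[bi, py, px]                                 # [b, h, w, k*k, c]
    return np.moveaxis(gathered, 3, 4)                        # [b, h, w, c, k*k]

def apply_weights(sampled, weights):
    """Step 3: multiply each sampled feature by the weight of its tap and sum.

    sampled: [b, h, w, c, k*k]; weights: [k*k, c, filters].
    """
    return np.einsum("bhwck,kcf->bhwf", sampled, weights)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 5, 6, 4))
offsets = rng.uniform(-2, 2, size=(2, 5, 6, 9, 2))  # step 1 would predict these
weights = rng.normal(size=(9, 4, 3))

sampled = sample_per_tap(x, offsets)
out = apply_weights(sampled, weights)
print(sampled.shape, out.shape)  # (2, 5, 6, 4, 9) (2, 5, 6, 3)
```

With all offsets set to zero, this reduces to an ordinary zero-padded 3x3 convolution, which is a quick sanity check for the indexing.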

Am I wrong? All of the implementations online seem to do something similar to what you did, so I assume I am.
Thanks,
Dan

@LWJ312

LWJ312 commented Feb 7, 2020

Hi Dan, I totally agree with your reasoning and the three steps above. The implemented code matches your idea: the offsets are unique for each conv filter pixel.
Note that in the code, after the reshape, x_offset has shape [b, c, h*kernel_size, w*kernel_size]. The final conv layer (whose stride equals kernel_size) then brings the output back to the same spatial size as the input x, which is [b, c, h, w].
:)
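To make the reshape argument concrete, here is a small NumPy sketch (not the repository's code) of why tiling the per-tap samples into a [b, c, h*kernel_size, w*kernel_size] map and then convolving with stride equal to kernel_size gives the same result as multiplying each sampled tap by its weight directly. The helper names `tile_taps` and `strided_conv` are hypothetical, and the random `sampled` tensor simply stands in for features that were already fetched at the offset positions.

```python
import numpy as np

def tile_taps(sampled, k):
    """Lay the k*k samples of every output pixel out as a k x k block.

    sampled: [b, c, h, w, k*k]  ->  [b, c, h*k, w*k]
    """
    b, c, h, w, _ = sampled.shape
    s = sampled.reshape(b, c, h, w, k, k)
    # Reorder to [b, c, h, ky, w, kx] so each pixel owns a contiguous k x k block.
    return s.transpose(0, 1, 2, 4, 3, 5).reshape(b, c, h * k, w * k)

def strided_conv(tiled, weight, k):
    """Plain convolution with kernel size k and stride k, no padding.

    tiled: [b, c, h*k, w*k]; weight: [f, c, k, k]  ->  [b, f, h, w]
    """
    b, c, hk, wk = tiled.shape
    h, w = hk // k, wk // k
    out = np.zeros((b, weight.shape[0], h, w))
    for i in range(h):
        for j in range(w):
            patch = tiled[:, :, i * k:(i + 1) * k, j * k:(j + 1) * k]  # [b, c, k, k]
            out[:, :, i, j] = np.einsum("bcyx,fcyx->bf", patch, weight)
    return out

rng = np.random.default_rng(0)
b, c, h, w, k, f = 2, 3, 4, 5, 3, 6
sampled = rng.normal(size=(b, c, h, w, k * k))  # stand-in for offset-sampled features
weight = rng.normal(size=(f, c, k, k))

tiled = tile_taps(sampled, k)
out_tiled = strided_conv(tiled, weight, k)

# Direct per-tap form: each sampled value times the weight of its own tap.
out_direct = np.einsum("bchwyx,fcyx->bfhw", sampled.reshape(b, c, h, w, k, k), weight)

print(tiled.shape, out_tiled.shape)        # (2, 3, 12, 15) (2, 6, 4, 5)
print(np.allclose(out_tiled, out_direct))  # True
```

Because the stride equals the kernel size, the windows never overlap, so each output pixel only sees its own k x k block of samples.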
