Confused about the shape of relative position encoding #21
Does (C, W) mean global position encoding?
Thanks for your helpful reply. Yes, by (C, W) I meant a global position encoding. axial-deeplab/lib/models/axialnet.py Line 44 in fe1d052
My confusion is this: since all the position encodings are initialized randomly, I expected that indexing the relative encoding in any order would give similar results, so perhaps it could be indexed in a simpler way. But clearly you don't think so, given the use of relative_index. What am I missing?
They are randomly initialized, but different positions get different relative positional encodings, while pairs at the same relative distance share the same weights.
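This weight sharing can be sketched as follows. The code below is illustrative only (the names loosely mirror axialnet.py but this is not the repo's exact implementation): a learnable table holds one vector per relative distance, and relative_index maps every query/key pair onto that table, so pairs with the same distance pick the same vector.

```python
import torch

K = 4  # kernel_size (span of the axial attention); illustrative value
C = 6  # channel dim of the encoding; illustrative value

# One learnable vector per relative distance: distances range over
# -(K-1) .. K-1, i.e. 2K-1 distinct values.
relative = torch.randn(C, 2 * K - 1)

# relative_index[q, k] = (q - k) + (K - 1) maps each query/key pair
# to the column holding the encoding for their relative distance.
query_index = torch.arange(K).unsqueeze(0)  # shape (1, K)
key_index = torch.arange(K).unsqueeze(1)    # shape (K, 1)
relative_index = key_index - query_index + K - 1  # shape (K, K)

# Gather: every pair with the same relative distance selects the SAME column,
# which is exactly why the indexing cannot be replaced by an arbitrary order.
embeddings = relative[:, relative_index.view(-1)].view(C, K, K)

# Pairs (0, 1) and (1, 2) have the same relative distance, so they share weights:
assert torch.equal(embeddings[:, 0, 1], embeddings[:, 1, 2])
```

If the indexing were arbitrary instead, each of the K*K pairs would get an unrelated vector, and the translation-equivariance of the relative encoding would be lost.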
If the span-size |
The following code generates a position embedding of shape (C, K, K), where C = self.group_planes * 2 and K = self.kernel_size:
axial-deeplab/lib/models/axialnet.py Line 64 in fe1d052
It seems that each position in the (K, K) window owns its own position encoding. But for axial attention applied along the w-axis, shouldn't the shape be (C, W), meaning that all rows share the same position encoding?
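To make the distinction in the question concrete, here is a minimal sketch (hypothetical names, not the repo's code) contrasting the two shapes: a global encoding is one (C, W) table that every query position reads identically, while the relative scheme expands a (C, 2W-1) table into (C, W, W), giving each query its own row of per-key vectors.

```python
import torch

C, W = 6, 4  # channels and axis length; illustrative values

# Global (absolute) encoding: a single (C, W) table, one vector per
# absolute position, shared by every query along the axis.
global_pos = torch.randn(C, W)

# Relative encoding: a (C, 2W-1) table of per-distance vectors, expanded
# to (C, W, W) so entry (:, q, k) depends only on the offset q - k.
relative = torch.randn(C, 2 * W - 1)
idx = torch.arange(W).unsqueeze(1) - torch.arange(W).unsqueeze(0) + W - 1
relative_pos = relative[:, idx.view(-1)].view(C, W, W)

# The shapes differ: (C, W) vs (C, W, W).
print(global_pos.shape, relative_pos.shape)

# Shift-invariance holds only for the relative table:
assert torch.equal(relative_pos[:, 0, 1], relative_pos[:, 1, 2])
```

So the (C, K, K) shape is not each position owning an independent encoding; it is the (C, 2K-1) relative table broadcast to all query/key pairs, which is what relative_index implements.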