
Only the first vector out of the nn.Embedding is ever used #5

Open
vnmusat opened this issue Jun 21, 2022 · 2 comments

Comments


vnmusat commented Jun 21, 2022

Hi,

Is there a specific reason why only the first vector of the nn.Embedding is ever used? tgt8 ... tgt64 are all zeros at this stage, so the lookup returns the 0-th embedding row for every spatial position; in other words, qe8 ... qe64 are filled with the same vector repeated everywhere. The embedding values will of course change during training, but they will always be identical across all positions of the qe tensors.

tgt8 = torch.zeros_like(feat8[:, 0, :1]).expand(

qe8 = (self.query_embed(tgt8.long())).permute(0, 3, 1, 2)

After permuting qe8 to [batch, spatial, spatial, len_embedding]:

tensor([[[ 0.8230, -1.4024, -0.8630,  ..., -1.0915, -0.6329,  0.0921],
         [ 0.8230, -1.4024, -0.8630,  ..., -1.0915, -0.6329,  0.0921],
         ...,
         [ 0.8230, -1.4024, -0.8630,  ..., -1.0915, -0.6329,  0.0921]],
        ...

(the same row repeats for every spatial position and every batch element)
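As a minimal standalone sketch (not the repo's code; sizes and names here are made up for illustration), indexing an nn.Embedding with an all-zero LongTensor returns row 0 of the weight table at every position, which reproduces the repeated-vector output shown above:

```python
import torch
import torch.nn as nn

# Hypothetical sizes: a 100-row embedding table with dimension 8.
emb = nn.Embedding(100, 8)

# An all-zero index tensor, analogous to tgt8 before training signal exists.
tgt = torch.zeros(2, 4, 4, dtype=torch.long)

# The lookup selects row 0 of emb.weight for every spatial position.
qe = emb(tgt)  # shape [2, 4, 4, 8]

# Every position holds the identical vector emb.weight[0].
assert torch.equal(qe, emb.weight[0].expand(2, 4, 4, 8))
```

The assertion passes because an embedding lookup is just a row gather: identical indices always yield identical rows, regardless of how the weights evolve during training.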
@Chensiyu00

Were you able to reproduce the results reported in the paper with the given code?


zjr-bit commented Apr 14, 2023

Hi, have you found an answer?
