
about the way to calculate attention weight #15

Open
FreyWang opened this issue Dec 7, 2018 · 2 comments

Comments

@FreyWang

FreyWang commented Dec 7, 2018

It seems that the way the attention weights are calculated differs from the original paper, `softmax(v^T * tanh(W * [s, h]))`: here a ReLU is applied after the softmax instead. Can you give a reason or a reference? (See the sketch after the quoted code for comparison.)

```python
def forward(self, hidden, encoder_outputs):
    timestep = encoder_outputs.size(0)
    h = hidden.repeat(timestep, 1, 1).transpose(0, 1)   # [B*T*H]
    encoder_outputs = encoder_outputs.transpose(0, 1)   # [B*T*H]
    attn_energies = self.score(h, encoder_outputs)
    return F.relu(attn_energies).unsqueeze(1)           # ReLU applied after the score

def score(self, hidden, encoder_outputs):
    # [B*T*2H] -> [B*T*H]
    energy = F.softmax(self.attn(torch.cat([hidden, encoder_outputs], 2)), dim=2)
    energy = energy.transpose(1, 2)                              # [B*H*T]
    v = self.v.repeat(encoder_outputs.size(0), 1).unsqueeze(1)   # [B*1*H]
    energy = torch.bmm(v, energy)                                # [B*1*T]
    return energy.squeeze(1)                                     # [B*T]
```
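
For comparison, here is a minimal sketch of the paper's formulation as I understand it. It assumes a Bahdanau-style additive attention module with the same parameters as above (`self.attn = nn.Linear(2 * hidden_size, hidden_size)`, `self.v = nn.Parameter(torch.rand(hidden_size))`); the module name and shapes are my assumptions, not the repo's code. The `tanh` sits inside the score and the `softmax` is taken over the encoder time steps at the end, with no ReLU:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveAttention(nn.Module):
    """Sketch of softmax(v^T * tanh(W * [s, h])) over the encoder time steps."""

    def __init__(self, hidden_size):
        super().__init__()
        self.attn = nn.Linear(hidden_size * 2, hidden_size)
        self.v = nn.Parameter(torch.rand(hidden_size))

    def forward(self, hidden, encoder_outputs):
        # hidden: [B, H] (decoder state s), encoder_outputs: [T, B, H]
        timestep = encoder_outputs.size(0)
        h = hidden.repeat(timestep, 1, 1).transpose(0, 1)      # [B, T, H]
        encoder_outputs = encoder_outputs.transpose(0, 1)      # [B, T, H]
        # tanh inside the score, as in the paper
        energy = torch.tanh(self.attn(torch.cat([h, encoder_outputs], dim=2)))  # [B, T, H]
        v = self.v.repeat(encoder_outputs.size(0), 1).unsqueeze(1)              # [B, 1, H]
        scores = torch.bmm(v, energy.transpose(1, 2)).squeeze(1)                # [B, T]
        # softmax over the time dimension, no ReLU
        return F.softmax(scores, dim=1).unsqueeze(1)                            # [B, 1, T]
```

With this form the returned weights for each batch element sum to 1 over the time dimension, which is what the paper's `softmax(v^T * tanh(W * [s, h]))` produces; in the repo's version the softmax is taken over the hidden dimension inside `score` and the final weights come from a ReLU instead.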
@xiaodaoyoumin

I am also confused about this. If the author comes back, please notify me. Thank you.

@patiencefromzhou1229

I am also confused about this.
