This repository has been archived by the owner on May 28, 2019. It is now read-only.

Why scale the outputs by 30 in Decoder.update_buffer? #40

Open
arvieFrydenlund opened this issue Mar 5, 2018 · 1 comment

Comments

@arvieFrydenlund

The 30 in `o_tm1/30` here seems to be a magic number, unless I missed something in the paper:

    def update_buffer(self, S_tm1, c_t, o_tm1, ident):
        # concat previous output & context
        idt = torch.tanh(self.F_u(ident))
        o_tm1 = o_tm1.squeeze(0)
        z_t = torch.cat([c_t + idt, o_tm1/30], 1)
        z_t = z_t.unsqueeze(2)
        Sp = torch.cat([z_t, S_tm1[:, :, :-1]], 2)

        # update S
        u = self.N_u(Sp.view(Sp.size(0), -1))
        u[:, :idt.size(1)] = u[:, :idt.size(1)] + idt
        u = u.unsqueeze(2)
        S = torch.cat([u, S_tm1[:, :, :-1]], 2)

        return S
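For anyone else reading along, the shift-and-insert mechanics of this buffer update (new entry in at column 0, oldest column dropped) can be sketched like this. This is just a NumPy stand-in for the `torch.cat` / slicing ops above, leaving out the `/30` scaling and the speaker identity term:

```python
import numpy as np

def shift_insert(S, z):
    """Insert z as the newest buffer column and drop the oldest.

    S: (batch, dim, length) buffer; z: (batch, dim) new entry.
    Mirrors torch.cat([z_t, S_tm1[:, :, :-1]], 2) in update_buffer.
    """
    z = z[:, :, np.newaxis]                       # -> (batch, dim, 1)
    return np.concatenate([z, S[:, :, :-1]], axis=2)

# Demo: push three entries into an empty length-3 buffer.
S = np.zeros((1, 4, 3))
for step in range(3):
    z = np.full((1, 4), float(step + 1))
    S = shift_insert(S, z)

# Newest entry ends up in column 0, oldest fell off the end:
# S[0, 0] == [3., 2., 1.]
```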

Thanks.

@macarbonneau

I am also interested in an answer to this question.
Also, while on the subject of the memory buffer update: where is the teacher-forcing idea described in section 3.1 of the paper implemented? I do not see where the predicted output is mixed with the noisy target.
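To be concrete about what I am looking for: as I read section 3.1, the decoder's previous-output input during training would be something like the model's prediction averaged with a noise-corrupted ground-truth frame. A minimal sketch of that idea (all names, the 50/50 mix, and the noise scale here are my assumptions, not code from this repo):

```python
import random

def mixed_previous_output(o_pred, o_target, noise_std=0.1):
    """Hypothetical teacher-forcing mix: average the model's previous
    prediction with a noise-corrupted ground-truth frame.

    o_pred, o_target: lists of floats (previous predicted / target frame).
    noise_std: stddev of the Gaussian corruption (assumed value).
    """
    noisy_target = [t + random.gauss(0.0, noise_std) for t in o_target]
    return [0.5 * (p + n) for p, n in zip(o_pred, noisy_target)]
```

I cannot find anything like this mixing step anywhere in the training loop, which is why I am asking.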
