rnnt_loss() gives different outputs (increasing) executed on same input #72

Open
stefan-falk opened this issue Jul 10, 2020 · 1 comment

@stefan-falk

I am using https://github.com/HawkAaron/warp-transducer indirectly via https://github.com/noahchalifour/rnnt-speech-recognition, and I noticed something odd when calling rnnt_loss() in eager mode on the same input over and over again.

In short: calling rnnt_loss(*args) repeatedly with identical args returns a larger loss on each call.

from warprnnt_tensorflow import rnnt_loss
import numpy as np


def main():
    acts = np.asarray([
        [
            [[0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]],
            [[0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]],
            [[0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]],
        ]
    ])

    labels = np.asarray([[1, 2, 0]])
    label_lengths = [len(t) for t in labels]

    for i in range(10):
        loss = rnnt_loss(
            acts=acts,
            labels=labels,
            input_lengths=label_lengths,
            label_lengths=label_lengths
        )
        print(np.mean(loss))


if __name__ == '__main__':
    main()

Output:

1.0986123
2.1490226
5.274593
6.7222075
9.581686
11.274273
13.95323
15.808798
18.36151
20.329256

Is this expected behavior or am I doing something wrong here?

See also noahchalifour/rnnt-speech-recognition#36

@cynecx

cynecx commented Sep 17, 2020

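I think the issue here is that the input is malformed. acts is laid out as (B, T, U+1, V), and the tensor in your example has shape (1, 3, 2, 3), so it only has room for U = 1 label. label_lengths is [3], so U must be at least 3 and the third dimension must be at least U+1 = 4. You could either reduce label_lengths or enlarge the 4-D acts tensor.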
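The shape constraint above can be checked in Python before the call ever reaches the native code. This is only a sketch: check_rnnt_shapes is a hypothetical helper written for this comment, not part of warp-transducer, and it assumes the (B, T, U+1, V) layout described above.

```python
import numpy as np


def check_rnnt_shapes(acts, labels, input_lengths, label_lengths):
    """Hypothetical pre-flight check; raises ValueError on inconsistent inputs.

    Assumes acts is laid out as (B, T, U+1, V) and labels as (B, U_max).
    """
    acts = np.asarray(acts)
    labels = np.asarray(labels)
    if acts.ndim != 4:
        raise ValueError(f"acts must be 4-D (B, T, U+1, V), got {acts.ndim}-D")
    if labels.ndim != 2:
        raise ValueError(f"labels must be 2-D (B, U), got {labels.ndim}-D")

    batch, max_t, max_u_plus_1, vocab = acts.shape
    if labels.shape[0] != batch:
        raise ValueError("labels and acts disagree on the batch size")
    if len(input_lengths) != batch or len(label_lengths) != batch:
        raise ValueError("need one input length and one label length per item")

    for b in range(batch):
        t = int(input_lengths[b])
        u = int(label_lengths[b])
        if not 1 <= t <= max_t:
            raise ValueError(f"item {b}: input length {t} exceeds T={max_t}")
        if u < 0 or u > labels.shape[1]:
            raise ValueError(f"item {b}: label length {u} exceeds labels width")
        # The lattice needs U+1 rows along the third axis of acts.
        if u + 1 > max_u_plus_1:
            raise ValueError(
                f"item {b}: label length {u} needs acts.shape[2] >= {u + 1}, "
                f"got {max_u_plus_1}"
            )
        if u and (labels[b, :u].min() < 0 or labels[b, :u].max() >= vocab):
            raise ValueError(f"item {b}: label id outside [0, {vocab})")


if __name__ == "__main__":
    labels = np.array([[1, 2, 0]], dtype=np.int32)

    # Shape from the report: only 2 rows along the U+1 axis.
    bad_acts = np.zeros((1, 3, 2, 3), dtype=np.float32)
    try:
        check_rnnt_shapes(bad_acts, labels, [3], [3])
    except ValueError as err:
        print("rejected:", err)

    # Enlarged to U+1 = 4 rows, which matches label_lengths = [3].
    good_acts = np.zeros((1, 3, 4, 3), dtype=np.float32)
    check_rnnt_shapes(good_acts, labels, [3], [3])
    print("accepted")
```

Whether a 0 inside labels is acceptable depends on which id is used as blank; the sketch deliberately does not check that.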

Validation of the input tensors is quite incomplete here, so malformed input can cause undefined behavior and memory corruption, which is what your code reproduces. In my case it segfaults: invalid writes (a classic buffer overflow) corrupt the allocator's internal heap-management structures.
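For comparison, a loss computed on well-formed input should be identical on every call. Below is a small NumPy sketch of the transducer forward recursion for a single utterance. It is not the warp-transducer implementation; rnnt_loss_reference and log_softmax are names made up for this example, and it assumes blank id 0 and that the scores are normalized with a log-softmax over the last axis.

```python
import numpy as np


def log_softmax(x, axis=-1):
    """Numerically stable log-softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))


def rnnt_loss_reference(acts, labels, blank=0):
    """Negative log-likelihood of `labels` for one utterance.

    acts:   (T, U+1, V) unnormalized scores.
    labels: sequence of U label ids.
    """
    acts = np.asarray(acts, dtype=np.float64)
    num_frames, num_rows, _ = acts.shape
    num_labels = len(labels)
    if num_rows != num_labels + 1:
        raise ValueError("acts.shape[1] must equal len(labels) + 1")

    logp = log_softmax(acts)
    # alpha[t, u]: log-probability of having emitted u labels before frame t ends.
    alpha = np.full((num_frames, num_rows), -np.inf)
    alpha[0, 0] = 0.0
    for t in range(num_frames):
        for u in range(num_rows):
            if t == 0 and u == 0:
                continue
            # Arrive by emitting blank at (t-1, u): advance one frame.
            from_blank = alpha[t - 1, u] + logp[t - 1, u, blank] if t > 0 else -np.inf
            # Arrive by emitting labels[u-1] at (t, u-1): advance one label.
            from_label = (
                alpha[t, u - 1] + logp[t, u - 1, labels[u - 1]] if u > 0 else -np.inf
            )
            alpha[t, u] = np.logaddexp(from_blank, from_label)

    # Every alignment ends with a blank emitted from the last lattice node.
    return -(alpha[-1, -1] + logp[-1, -1, blank])


if __name__ == "__main__":
    # T = 3 frames, U = 2 labels, V = 3 symbols, all scores equal.
    acts = np.zeros((3, 3, 3))
    labels = [1, 2]
    losses = [rnnt_loss_reference(acts, labels) for _ in range(5)]
    print(losses)

    # With uniform scores each of the 6 alignments has probability (1/3)**5.
    closed_form = 5 * np.log(3) - np.log(6)
    print(np.isclose(losses[0], closed_form))
```

Because the function only reads its arguments and allocates a fresh alpha table on each call, repeated calls cannot drift the way the overflowing native call does.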
