Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SliceNdLayer added set_tag_on_size_tensor for dynamic size case #661

Merged
merged 3 commits into from
Sep 17, 2021

Conversation

robin-p-schmitt
Copy link
Contributor

@robin-p-schmitt robin-p-schmitt commented Sep 17, 2021

If we do not do the set_tag_on_size_tensor, then the "_is_size_of_dim_tag" attribute of the tag will not be set and this can lead to an error in follow-up layers:

File "/mnt/projects/i6/returnn/returnn/tf/util/data.py", line 342, in DimensionTag.can_compare
    line: assert self.get_tag_from_size_tensor(self.dyn_size).get_same_base() is self
    locals:
      self = <local> DimensionTag{'sliced-time:segments'[B,'time'[B]]}
      self.get_tag_from_size_tensor = <local> <bound method DimensionTag.get_tag_from_size_tensor of <class 'returnn.tf.util.data.DimensionTag'>>
      self.dyn_size = <local> <tf.Tensor 'output/rec/segments/Maximum_1:0' shape=(?, ?) dtype=int32>
      get_same_base = <not found>
AttributeError: 'NoneType' object has no attribute 'get_same_base'

@albertz
Copy link
Member

albertz commented Sep 17, 2021

Yea looks good. Why is this a draft? Is this still incomplete?

You again might want to add a simple test case.

@robin-p-schmitt
Copy link
Contributor Author

Yea looks good. Why is this a draft? Is this still incomplete?

Because of the other error I described in my comment. Or should I open an issue for that instead?

@albertz
Copy link
Member

albertz commented Sep 17, 2021

Yea looks good. Why is this a draft? Is this still incomplete?

Because of the other error I described in my comment. Or should I open an issue for that instead?

Is that a separate issue or not? This sounds very much like a totally independent problem (in MergeDimsLayer). So then it should be separate. PRs should always be as minimal as possible (or as it make sense).

@robin-p-schmitt
Copy link
Contributor Author

Is that a separate issue or not?

It's separate, I'll open an issue for that.

You again might want to add a simple test case.

What kind of test? The behavior did not really change. Or should I create a simple test where I check whether the "_is_size_of_dim_tag" attribute is set?

@albertz
Copy link
Member

albertz commented Sep 17, 2021

Or should I create a simple test where I check whether the "_is_size_of_dim_tag" attribute is set?

No.

What kind of test? The behavior did not really change.

You said it can lead to an error. So you could write a test which leads to this error. And as I understand you, your fix resolves the error then.

@robin-p-schmitt
Copy link
Contributor Author

You said it can lead to an error. So you could write a test which leads to this error. And as I understand you, your fix resolves the error then.

The fact that the "_is_size_of_dim_tag" attribute was not set before led to the error. My fix only sets this attribute. So I could write a test which checks whether the attribute is set to test the behavior.

@albertz
Copy link
Member

albertz commented Sep 17, 2021

The fact that the "_is_size_of_dim_tag" attribute was not set before led to the error. My fix only sets this attribute. So I could write a test which checks whether the attribute is set to test the behavior.

No, the test should not check for that.

I don't understand. You get an error. You can write a test case which reproduces exactly the error.

Then you fix it (via slice_tag.set_tag_on_size_tensor(dyn_size)).
So the test case should now work without error?

@robin-p-schmitt
Copy link
Contributor Author

I don't understand. You get an error. You can write a test case which reproduces exactly the error.

Then you fix it (via slice_tag.set_tag_on_size_tensor(dyn_size)).
So the test case should now work without error?

Oh okay, so the test should fail before my commit and it should succeed after my commit? I can do that. What would be the appropriate test file for that? test_TFNetworkLayer.py?

@albertz
Copy link
Member

albertz commented Sep 17, 2021

Oh okay, so the test should fail before my commit and it should succeed after my commit?

Yes exactly. Like basically most tests. Esp via PRs which fix certain issues.

What would be the appropriate test file for that? test_TFNetworkLayer.py?

Yes. Just next/below the other slice-nd-layer or related tests.

But keep the test really simple, minimal (short) and fast.

@albertz
Copy link
Member

albertz commented Sep 17, 2021

In general, it's maybe not always needed to write a test. Maybe also not here. Although it also doesn't hurt.

For more complex issues where the origin of the issue is not really clear, this is almost always helpful. This is basically my workflow. First have a reduced test case which reproduces the problem. This is usually also much faster and easier to run and to debug. Then debug it and try to understand it better. Maybe reduce it even further. Maybe write other further test cases to test for specific other things. Until I finally understand the issue. Then I try to fix it. Once the test runs through, this is mostly done.

And test cases are also helpful such that this will not break accidentally by some other change in the future. Because we always make sure that all tests are passing.

@robin-p-schmitt
Copy link
Contributor Author

robin-p-schmitt commented Sep 17, 2021

This is the simplest test I could come up with right now in order to reproduce my error. It seems to be caused if CompareLayer gets two inputs of which one is the SliceNdLayer output. It then tries to find the common axes of the inputs and, while doing so, it calls get_tag_from_size_tensor.

@albertz
Copy link
Member

albertz commented Sep 17, 2021

This has to be inside a rec layer to be triggered?

@albertz
Copy link
Member

albertz commented Sep 17, 2021

Btw, again, why is this a draft? Is this still incomplete?

@robin-p-schmitt robin-p-schmitt marked this pull request as ready for review September 17, 2021 11:39
@robin-p-schmitt robin-p-schmitt requested review from albertz and a team as code owners September 17, 2021 11:39
@robin-p-schmitt
Copy link
Contributor Author

robin-p-schmitt commented Sep 17, 2021

This has to be inside a rec layer to be triggered?

Yes, I first tried doing it outside the rec layer and there were no problems.

Actually: let me try again

@albertz
Copy link
Member

albertz commented Sep 17, 2021

This has to be inside a rec layer to be triggered?

Yes, I first tried doing it outside the rec layer and there were no problems.

Actually: let me try again

So the simpler test is also valid now? I wondered why the rec layer was needed. This did not make sense to me.

@robin-p-schmitt
Copy link
Contributor Author

So the simpler test is also valid now? I wondered why the rec layer was needed. This did not make sense to me.

Yes, the simpler test is valid too. When I tried first without the rec layer, I thought the problem was somewhere else that's why I used the rec layer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants