
Batch dim tag special handling can be problematic #920

Open
albertz opened this issue Jan 29, 2022 · 4 comments
Comments


albertz commented Jan 29, 2022

For the contrastive loss implementation (#918), we flatten the masked encoder frames via FlattenBatchLayer and end up with the batch dim B&Packed{'input_masked_frames:masked:time'}. For each of those frames, we want to create a fixed number K=10 of candidate samples. So the natural way would be to use RandIntLayer and specify shape=[packed_batch_dim, samples_dim] with samples_dim = SpatialDim(..., K). However, that does not work, because packed_batch_dim cannot be different from the normal batch dim: by the current definition of dim tag equality (#634), all batch dims compare as equal.
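A rough sketch of what the natural network definition would look like. This is illustrative only: the layer names, `packed_batch_dim` (the batch dim of the flattened layer), and `num_frames` (the upper bound for the sampled indices) are placeholders, not actual config values from #918.

```python
# Sketch (RETURNN-style net dict); names and values are illustrative.
from returnn.tf.util.data import SpatialDim

K = 10  # number of candidate samples per frame
samples_dim = SpatialDim("candidate_samples", K)

network = {
    # flatten the masked encoder frames into the batch dim
    # (result has batch dim B&Packed{'input_masked_frames:masked:time'})
    "frames_flat": {"class": "flatten_batch", "from": "input_masked_frames"},
    # desired: draw K random candidate indices per packed frame.
    # This fails currently, because the packed batch dim compares as equal
    # to the global batch dim under the dim tag equality rules (#634).
    "sample_idx": {
        "class": "rand_int",
        "shape": [packed_batch_dim, samples_dim],  # packed_batch_dim: batch dim of "frames_flat"
        "maxval": num_frames,  # illustrative upper bound
    },
}
```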

The workaround in #918 is to first change the packed batch dim into a spatial dim via ReinterpretDataLayer, after which RandIntLayer works, and then later convert it back into a batch dim via another ReinterpretDataLayer. For this, ReinterpretDataLayer gained a new option batch_base specifying where the batch dim is taken from. This is obviously ugly.
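In net-dict form, the workaround looks roughly like this. The specific option names for the batch-to-spatial reinterpretation (here `set_axes`) and the surrounding names are assumptions for illustration; see #918 for the actual config.

```python
# Sketch of the #918 workaround (option names are assumptions, see #918):
network = {
    # 1) reinterpret the packed batch dim as a spatial dim ...
    "frames_spatial": {
        "class": "reinterpret_data",
        "from": "frames_flat",
        "set_axes": {"B": "T"},  # hypothetical; the real option usage may differ
    },
    # 2) ... so that RandIntLayer accepts the shape ...
    "sample_idx": {
        "class": "rand_int",
        "shape": [frames_dim, samples_dim],  # frames_dim: the reinterpreted spatial dim
        "maxval": num_frames,  # illustrative upper bound
    },
    # 3) ... and later convert back into a batch dim, taking the batch info
    #    from the original flattened layer via the new batch_base option:
    "frames_batch": {
        "class": "reinterpret_data",
        "from": "frames_spatial",
        "batch_base": "frames_flat",
    },
}
```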

So I question whether our equality exception for the batch dim (#634) makes sense, or whether it should be changed such that the batch dim behaves just like any other dim tag.


albertz commented Jan 29, 2022

@Zettelkasten any thoughts on this?


Zettelkasten commented Jan 29, 2022

It seems reasonable to me to treat different batch dims (e.g. with different packing formats) as different.
We need to be a bit careful that each Data instance has at most one batch dim.
As a consequence, this also means that get_common_data on two inputs with different batch dims would simply fail (currently, I think it always works). Actually, maybe we could make it always work, and let copy_compatible_to also support a "non-packed to packed" conversion, plus the many other cases that would be needed.
That sounds messy and would complicate things, though, so I would rather leave it out and have get_common_data throw an error when inputs have different batch dims.
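A minimal sketch of the suggested behavior. `get_common_batch` is a hypothetical simplified helper, not the actual RETURNN get_common_data; the point is just that mismatched batch dims become a hard error instead of being silently assumed equal.

```python
def get_common_batch(batch_infos):
    """Hypothetical sketch: with stricter equality, combining inputs
    with different batch dims is an error rather than a silent merge."""
    base = batch_infos[0]
    for other in batch_infos[1:]:
        if other != base:
            raise ValueError(
                "cannot combine inputs with different batch dims: %r vs %r"
                % (base, other))
    return base


# same batch dim: fine
common = get_common_batch(["B", "B"])

# plain vs packed batch dim (e.g. after FlattenBatchLayer): error
try:
    get_common_batch(["B", "B&Packed{time}"])
except ValueError:
    pass  # mismatched batch dims raise
```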


albertz commented Jan 30, 2022

One reason the batch dim was always treated as equal is that, in the case where it also included the beam, the logic of resolving the beam and making sure all inputs end up with the same beam was handled separately, so the equality assumption was valid.

However, when it contains other things (e.g. a dim merged in from some other dim), or is packed (flattened batch), this is not the case, and we probably also do not want such batch dims to be treated as automatically equal.


albertz commented Jan 31, 2022

Maybe the equality check should cover exactly that: ignore any contained beam, but otherwise check BatchInfo equality.
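A minimal sketch of this proposed rule, using a hypothetical simplified stand-in for BatchInfo (not the actual RETURNN class): two batch dims compare equal iff their structure (packing, merged dims, etc.) matches, with any beam ignored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SimpleBatchInfo:
    # Hypothetical stand-in for RETURNN's BatchInfo:
    # virtual_dims describes how the batch dim is composed (packing, merges, ...).
    virtual_dims: Tuple[str, ...]
    beam_size: Optional[int] = None  # search beam, if any


def batch_dims_equal(a: SimpleBatchInfo, b: SimpleBatchInfo) -> bool:
    """Proposed rule: ignore any contained beam, otherwise compare structure."""
    return a.virtual_dims == b.virtual_dims


# a plain batch dim with and without a search beam compare equal:
plain = SimpleBatchInfo(("B",))
with_beam = SimpleBatchInfo(("B",), beam_size=12)

# a packed batch dim (e.g. after FlattenBatchLayer) compares different:
packed = SimpleBatchInfo(("B", "Packed{time}"))

assert batch_dims_equal(plain, with_beam)
assert not batch_dims_equal(plain, packed)
```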
