Issues about OneHotVector/OneHotMatrix #1445

chengchingwen · 2020-12-28T07:32:26Z

Mentioned in #1431

There are some issues with the one-hot implementation. I'll list them below.

Higher dimension support

Currently Flux only supports 1- and 2-dimensional one-hot arrays. However, we often need more than that. For example, Transformers usually require a one-hot array with shape (num labels, sequence length, batch size) for parallelization. Besides Transformers, image tasks that do pixel-level classification, such as semantic segmentation, also need this.
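
As a rough sketch of what arbitrary-dimension support could look like, the array can wrap an integer index array of any shape and expose one extra leading label dimension. This is not Flux's actual implementation; `OneHotSketch` and its field names are hypothetical and chosen only for illustration.

```julia
# Hypothetical sketch, not Flux's implementation: an N-dimensional one-hot
# array backed by an (N-1)-dimensional array of label indices.
struct OneHotSketch{N, A<:AbstractArray{UInt32}} <: AbstractArray{Bool, N}
    indices::A    # label index per position, e.g. (sequence length, batch size)
    nlabels::Int  # size of the leading label dimension, stored once
end

# N cannot be inferred from the field types, so derive it from the index array.
OneHotSketch(indices::AbstractArray{UInt32, M}, nlabels::Integer) where {M} =
    OneHotSketch{M + 1, typeof(indices)}(indices, nlabels)

Base.size(x::OneHotSketch) = (x.nlabels, size(x.indices)...)

# Entry (label, position...) is true exactly when the stored index equals label.
Base.getindex(x::OneHotSketch{N}, I::Vararg{Int, N}) where {N} =
    x.indices[Base.tail(I)...] == I[1]

# Two sequences (batch size 2) of length 2 over 3 labels.
x = OneHotSketch(UInt32[1 3; 2 1], 3)
size(x)     # (3, 2, 2)
x[3, 1, 2]  # true: position (1, 2) holds label 3
```

Because the wrapper is generic over the index array type, the same definition would in principle accept a GPU array of indices, though that is untested here.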

Array interface

There are some array operations that not only convert a one-hot array to a Boolean array but also copy the data back to the CPU. For example, if you hcat two GPU OneHotMatrix values, the result is an Array{Bool}, whereas the correct result should be OneHotMatrix{CuArray{OneHotVector,1}}.

Memory consumption

We only support label counts up to 2^32 - 1 (the maximum value of UInt32), but each OneHotVector takes 64 bits. The problem shows up in OneHotMatrix, which is really a container for a Vector{OneHotVector}. As a result, we use twice the memory actually needed, because every column redundantly stores the number of labels.
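
To make the factor of two concrete, here is a small sketch using a stand-in struct. `PairedOneHot` is a hypothetical name that mirrors the layout described above (an index plus a repeated label count); it is not copied from Flux.

```julia
# Hypothetical stand-in for the layout described above: every column carries
# its own copy of the label count next to the index.
struct PairedOneHot
    ix::UInt32  # index of the hot entry
    of::UInt32  # number of labels, repeated in every column
end

ncols = 1_000

# One 64-bit element per column.
paired = Vector{PairedOneHot}(undef, ncols)
# Alternative: one 32-bit index per column, with the label count kept once elsewhere.
compact = Vector{UInt32}(undef, ncols)

sizeof(paired)   # 8000 bytes
sizeof(compact)  # 4000 bytes
```

Storing only the indices and keeping the label count in the enclosing container removes the per-column duplicate.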

(4. There used to be some problems when using OneHotVector with custom CUDA kernels, but most of them seem to be gone after the CUDA update. I list this here just in case anyone has encountered similar problems.)


CarloLucibello · 2020-12-28T07:56:15Z

Do you have an alternative implementation in Transformers.jl fixing these issues? Can it be retrofitted here, possibly without breaking the interface?

chengchingwen · 2020-12-28T08:24:56Z

I do have one. I think it won't be too hard to build the same interface above it.

CarloLucibello · 2020-12-28T12:35:50Z

Great, would you file a PR whenever you have time? Actually, most of that Embeddings module and the MultiHeadAttention layer should be cannibalized by Flux (if you agree with this) and made available for general use outside of Transformers.jl, as requested by many in #1431 (comment). OneHot arrays seem a well-isolated piece of functionality to start with.

chengchingwen · 2020-12-28T12:59:28Z

Sure, I'll probably do it around the new year.

> Actually, most of that Embeddings module and the MultiHeadAttention layer should be cannibalized by Flux

I'm ok with it. I can make a general version of MultiHeadAttention for Flux. On the other hand, which Embeddings module are we talking about?

CarloLucibello · 2020-12-28T13:17:51Z

> On the other hand, which Embeddings module are we talking about?

The Embeds.jl in your repo. Actually, maybe that's too much, I don't know, I'm not an expert on these things and I don't know how popular they are.
Maybe just import something similar to what PyTorch has?
https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html#torch.nn.Embedding
https://pytorch.org/docs/stable/generated/torch.nn.EmbeddingBag.html#torch.nn.EmbeddingBag

chengchingwen · 2020-12-28T13:37:55Z

I remember that we used to have an Embed layer in Flux. Not sure why it's gone. I think moving the entire Embedding module in Transformers to Flux would be too much, but a basic Embed layer definition should be fine.

DhairyaLGandhi · 2020-12-31T15:12:05Z

Definitely agree on the embedding layer. Someone mentioned it was non-trivial to have a GPU compliant Embedding layer (@darsnack ?) and this would totally need to be on the list

darsnack · 2020-12-31T15:13:29Z

No I think I had said that w.r.t. upsampling layers.

chengchingwen · 2021-01-01T14:14:57Z

I think the GPU compliant Embedding layer would be easy once we have the gather/scatter support in NNlib

chengchingwen mentioned this issue Dec 31, 2020

new onehot implementation #1447

Closed

4 tasks

darsnack mentioned this issue Jan 1, 2021

Arbitrary dimension one-hot arrays #1448

Merged

4 tasks

bors bot closed this as completed in ebd37d6 Jan 8, 2021
