
Accidentally used Encoder instead of Decoder? #10

Closed
DerEchteFeuerpfeil opened this issue Feb 12, 2024 · 2 comments


@DerEchteFeuerpfeil

Hey there, great project!

There's just one thing I'm not sure whether I misunderstood or you did. You wanted to reproduce the architecture of IconShop, but to my understanding, IconShop is a decoder-only transformer (like a language model).
Excerpt from their paper:
[screenshot: excerpt from the IconShop paper]

At first I thought it was just a minor mistake in your blog post, but there, too, you wrote encoder-only:
[screenshot: excerpt from the blog post]

An encoder-only autoregressive transformer is self-contradictory, isn't it?

Also, here you used xFormerEncoderConfig instead of the (imo) correct xFormerDecoderConfig:

fontogen/model/model.py, lines 201 to 227 in de1e50b:

```python
x_config = xFormerEncoderConfig(
    num_layers=num_layers,
    dim_model=d_model,
    feedforward_config={
        "name": "MLP",
        "dropout": 0.1,
        "activation": "gelu",
        "hidden_layer_multiplier": 4,
    },
    multi_head_config={
        "num_heads": nhead,
        "attention": BlockSparseAttention(
            num_heads=nhead,
            layout=BigBirdSparsityConfig(
                num_heads=nhead,
                block_size=block_size,
                num_global_blocks=1,
                num_random_blocks=2,
                num_sliding_window_blocks=int(3 * max_glyph_tokens / block_size),
            ).make_layout(seq_len=self.seq_len),
            block_size=block_size,
            dropout=0.1,
            causal=True,
        )
    },
)
self.transformer = xFormer(x_config)
```

I haven't fully thought through what the implications are. You correctly used causal attention, so there shouldn't be issues on that front. There might not even be an issue at all, since decoders (without cross-attention) are essentially just encoders with causal attention masking, but I don't know the xformers library and its inner workings. Just wanted to set the mental model straight.
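To illustrate what I mean, here's a rough sketch in plain PyTorch (not your code, and the sizes are placeholders): an "encoder" stack plus a causal mask behaves like a GPT-style decoder-only stack, because the only thing a decoder layer adds on top is cross-attention.

```python
import torch
import torch.nn as nn

d_model, nhead, seq_len = 512, 8, 16
layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(1, seq_len, d_model)
# Upper-triangular -inf mask: position i may only attend to positions <= i.
causal_mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
out = encoder(x, mask=causal_mask)  # autoregressive, i.e. decoder-only in effect
```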

Thanks for releasing your code and project!

@SerCeMan
Owner

SerCeMan commented Feb 12, 2024

Hey, @DerEchteFeuerpfeil! The article should indeed say decoder-only. Thank you for this, I'll fix it up!

Implementation-wise, if I understand correctly, there is no other way to implement a decoder-only transformer in xFormers, at least until facebookresearch/xformers#617 lands.

@DerEchteFeuerpfeil
Author

Alright, makes sense! It doesn't exactly speak well for the library that a decoder-only architecture issue has been open for >1y 😅

When re-implementing parts of this for my project, I used x_transformers by lucidrains instead.
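For reference, something roughly like this (a sketch only; the hyperparameters here are placeholders, not the ones I actually used):

```python
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens=8192,       # vocabulary size (placeholder)
    max_seq_len=1024,      # context length (placeholder)
    attn_layers=Decoder(   # decoder-only: causal self-attention, no cross-attention
        dim=512,
        depth=8,
        heads=8,
    ),
)

tokens = torch.randint(0, 8192, (1, 1024))
logits = model(tokens)     # shape: (1, 1024, 8192)
```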

I'll close this issue for now, as it has served its purpose. Cheers!
