Resizing HF token embeddings with PipelineModule #1010
I don't know much about DeepSpeed's version of pipeline parallelism - I've only worked with PyTorch's native version of it. I do know that I had to add a gather and re-partition step under ZeRO-3 for exactly this situation of resizing embeddings - twice in this code. Does pipeline have a similar feature? I just don't know this side of DeepSpeed.
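For reference, the gather/re-partition pattern under ZeRO-3 looks roughly like the following sketch (a hypothetical helper, not the exact code referenced above; it assumes deepspeed.zero.GatheredParameters and an initialized process group):

```python
import torch
import deepspeed


def copy_embeddings_zero3(old_emb: torch.nn.Embedding, new_emb: torch.nn.Embedding) -> None:
    """Hypothetical sketch of copying rows between embedding matrices under ZeRO-3."""
    # Under ZeRO-3 each parameter is partitioned across ranks, so the weights
    # must be gathered before they can be read or written; leaving the context
    # re-partitions them and broadcasts the changes made by modifier_rank.
    n = min(old_emb.num_embeddings, new_emb.num_embeddings)
    with deepspeed.zero.GatheredParameters([old_emb.weight, new_emb.weight], modifier_rank=0):
        if torch.distributed.get_rank() == 0:
            new_emb.weight.data[:n, :] = old_emb.weight.data[:n, :]
```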
@stas00 have you already migrated HF towards native pipeline parallelism with PyTorch 1.8? If so, can you point me to that? For pipe, I don't think there's an equivalent feature to the one you listed, @stas00, although I have a few questions about that:
Would also like @ShadenSmith's and @tjruwase's thoughts here, because I think (2) above (i.e., ZeRO performance on HF classes) is incredibly poor, as seen with ZeRO-2. I have a separate GitHub issue ongoing with @tjruwase about this. I feel like pipeline or even 3D parallelism would be better than ZeRO-3/Infinity because of the reduced communication volume with pipeline parallelism. Fitting massive models in GPU memory is one thing; being able to train them fast by minimizing communication volume is another. ZeRO-3/Infinity may help with the former, but it looks like pipeline (or more broadly, 3D) parallelism is the better solution because it also allows the latter.
re: Pipeline Parallelism: Most HF models are too complicated. All Pipe approaches that I tried require:
So after spending weeks on this I gave up (or rather parked the idea). I managed to make a pipeline using two PyTorch pipelines, because a single pipeline can't handle conditional modules, which encoder/decoder models are. The performance was terrible; I couldn't get GPU utilization above 50% across 2 GPUs. The PyTorch Pipe API has been becoming more user-friendly w.r.t. (2) and will soon handle any inputs/outputs. In order to convert HF models to a pipeline, the models have to drop complex features like past key values and hidden-state aggregation - this was the most difficult part. I made a workaround using closures, but it doesn't scale well. If you want to see some really crazy code, that experimental PR is full of it. Bottom line - to make pipelines work, the models have to be stripped down to a flat sequence of layers with simple tensor inputs/outputs (see the sketch below).
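To illustrate the shape a model has to take for these pipeline approaches, here is a minimal sketch (hypothetical layer sizes, not an HF model): a flat stack where each layer consumes the previous layer's tensors and returns tensors, with no keyword arguments, no None placeholders, and no per-layer caches such as past key/values.

```python
import torch.nn as nn

# A pipeline-friendly model: a flat sequence of layers connected purely by
# tensor inputs/outputs, which a pipeline engine can split into stages.
# Sizes are placeholders, roughly GPT-2-small-like.
pipeline_friendly = nn.Sequential(
    nn.Embedding(50257, 768),
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True),
    nn.Linear(768, 50257),
)
```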
re: Performance testing: I hope to start doing that in the next few days, now that ZeRO-Infinity has been merged. As usual I will open an issue on HF transformers and start sharing the results. We plan to do extensive benchmarking including SageMaker, JAX, Megatron-LM and DeepSpeed, of course. I'm not sure if FairScale will be included - the last time one of us looked it was not complete, but perhaps they have caught up; I was too busy with the DeepSpeed integration and with bf16-pretrained models getting NaNs under fp16/mixed precision/DeepSpeed to have time to look. One other approach I hope to include is FlexFlow https://github.com/flexflow/flexflow - I hope we will now be able to convert our models to a torch.fx trace, which is a prerequisite for FlexFlow. I highly recommend you check it out - the paper looks very interesting - but I haven't had a chance to see it in action yet and hope this will change soon. @michaelbenayoun has been making awesome progress on proxying the symbolic tracing via huggingface/transformers#11475, which should enable FlexFlow usage with HF transformers.
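For context, a torch.fx trace is produced with symbolic tracing, roughly like this minimal sketch (a toy module, not an HF model):

```python
import torch
from torch import fx


class Block(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.linear(x))


# symbolic_trace records the forward pass as a graph of operations that
# graph-level tools (e.g. a FlexFlow frontend) could consume.
traced = fx.symbolic_trace(Block())
print(traced.graph)
```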
With an HF model class, one can resize the token embeddings to account for any newly added special tokens; there's no upper limit. In the usual scenario it looks something like the snippet below (this isn't necessarily working code; I may have gotten the tokenizer APIs slightly wrong).
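A sketch of the usual flow (GPT-2 and the added tokens here are just placeholders):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Placeholder checkpoint; any model/tokenizer pair works the same way.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Register two new special tokens with the tokenizer.
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<speaker1>", "<speaker2>"]}
)

# Grow the input embedding matrix to cover the enlarged vocabulary.
model.resize_token_embeddings(len(tokenizer))
```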
The last line essentially allocates 2 new indices for the newly added special tokens in the input embeddings matrix, and initializes their embeddings with random weights.
Now in the pipeline regime, one cannot just resize the token embeddings after initialization of the PipelineModule, since the module would have already split the model across pipeline stages. Is it possible to provide a callback/mechanism with PipelineModule that allows resizing and fresh initialization of newly added special-token embeddings for downstream users?

Also, shouldn't this be a problem with the implementation of pipeline (and more generally 3D) parallelism in the DeepSpeedExamples repo too? A user of a model that's been pre-trained with pipeline parallelism would certainly have some basic downstream needs such as the addition of special tokens for fine-tuning.

@ShadenSmith @stas00
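For reference, under the current API the embedding apparently has to be sized for the expanded vocabulary before the layers are handed to PipelineModule. A rough sketch (hypothetical layer classes and sizes, not HF code; assumes a DeepSpeed-initialized process group):

```python
import torch.nn as nn
from deepspeed.pipe import PipelineModule, LayerSpec

base_vocab = 50257
num_new_special_tokens = 2  # e.g. the count returned by tokenizer.add_special_tokens(...)
vocab = base_vocab + num_new_special_tokens
hidden, heads = 768, 12

# The vocabulary size is baked in before partitioning; once PipelineModule
# has split the layers across stages, the embedding weight lives on a single
# stage and cannot simply be resized from user code.
layers = [
    LayerSpec(nn.Embedding, vocab, hidden),
    LayerSpec(nn.TransformerEncoderLayer, hidden, heads),
    LayerSpec(nn.Linear, hidden, vocab),
]
model = PipelineModule(layers=layers, num_stages=2)
```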