transformers rather than pytorch_pretrained_bert #28

cmacdonald · 2020-05-07T18:02:42Z

We'd like to make use of the more generic transformers library. There is some migration information at https://huggingface.co/transformers/migration.html

We're trying to upgrade a BertRanker:

class VanillaBertTransformerRanker(BertRanker):
    def __init__(self):
        super().__init__()

        self.tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
        self.bert = AutoModel.from_pretrained('bert-base-uncased')

        self.dropout = torch.nn.Dropout(0.1)
        self.cls = torch.nn.Linear(self.BERT_SIZE, 1)
<snip>

        for layer in result:
            cls_output = layer[:, 0]
            cls_result = []
            for i in range(cls_output.shape[0] // BATCH):
                cls_result.append(cls_output[i*BATCH:(i+1)*BATCH])

            cls_result = torch.stack(cls_result, dim=2).mean(dim=2)
            cls_results.append(cls_result)

We needed to do some casting in train.py, e.g.:

torch.tensor(record['query_tok']).to(torch.int64)

However, the shapes get a bit out of kilter in encode_bert: before the stack, we end up with size [4] rather than [4,768]

# vanilla_bert
# shape of first tensor: torch.Size([283, 768])
# shape of first result: torch.Size([8, 283, 768])
# shape of first cls_result: torch.Size([4, 768])
# shape of first cls_result before stack: torch.Size([4, 768])
# shape of cls_result before stack: 2
# shape of first cls_result after stack: torch.Size([768])
# shape of cls_result after stack: 4
# shape of first cls_result: torch.Size([4, 768])
# shape of first cls_result before stack: torch.Size([4, 768])
# shape of cls_result before stack: 2
# ...
    
# transformer bert
# shape of first tensor: torch.Size([283, 768])
# shape of first result: torch.Size([8, 283, 768])
# shape of first cls_result before stack: torch.Size([4, 768])
# shape of cls_result before stack: 2
# shape of first cls_result after stack: torch.Size([768])
# shape of cls_result after stack: 4
# shape of first cls_result before stack: torch.Size([4])
# shape of cls_result before stack: 2
# ---> error

+@albertoueda

The text was updated successfully, but these errors were encountered:

petulla · 2020-08-06T02:44:13Z

@cmacdonald Did you migrate to HuggingFace? I'm interested in trying your pretrained model once it's released.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

transformers rather than pytorch_pretrained_bert #28

transformers rather than pytorch_pretrained_bert #28

cmacdonald commented May 7, 2020 •

edited

Loading

petulla commented Aug 6, 2020

transformers rather than pytorch_pretrained_bert #28

transformers rather than pytorch_pretrained_bert #28

Comments

cmacdonald commented May 7, 2020 • edited Loading

petulla commented Aug 6, 2020

cmacdonald commented May 7, 2020 •

edited

Loading