For example:
tokens:
['i', 'am', 'looking', 'for', 'a', 'restaurant', 'in', 'the', '[restaurant_area]', '.', 'postcode', 'type', 'phone', 'food', 'pricerange', 'address', 'area', 'name', 'id', 'reference']
input_ids:
[8, 35, 51, 15, 12, 45, 18, 9, 67, 6, 89, 117, 68, 88, 3, 82, 70, 346, 281, 49, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
tokenizer.convert_id_to_tokens(input_ids):
i am looking for a restaurant in the [restaurant_area] . postcode type phone food [UNK] address area name id reference.
The Tokenizer maps the token ‘pricerange’ to ‘[UNK]’, so the training code might not work as intended.
Is this expected, or is something wrong in the source code?
I tried to examine the issue with:
tokenizer = Tokenizer(vocab, ivocab, False)
print(tokenizer.vocab_len) # 3130
print(tokenizer.get_word_id('pricerange')) # 3
print(tokenizer.get_word(3)) # [UNK]
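To see whether 'pricerange' is the only token affected, one option is to compare the token list against the vocabulary directly. The snippet below is a minimal sketch, not the repository's code: `find_oov_tokens` and `toy_vocab` are hypothetical names, the vocabulary is assumed to be a plain dict from token to id, and the `[PAD]` entry is an assumption (the issue only shows that padding uses id 0). The other ids are copied from the example above.

```python
def find_oov_tokens(tokens, vocab, unk_token="[UNK]"):
    """Return the distinct tokens (in first-seen order) that are not in vocab."""
    seen = set()
    missing = []
    for tok in tokens:
        # A token absent from the vocabulary would be mapped to the [UNK] id.
        if tok not in vocab and tok != unk_token and tok not in seen:
            seen.add(tok)
            missing.append(tok)
    return missing


# Toy vocabulary (assumption): a small subset using the ids shown in the issue.
toy_vocab = {"[PAD]": 0, "[UNK]": 3, "food": 88, "address": 82, "area": 70}

tokens = ["food", "pricerange", "address", "area"]
missing = find_oov_tokens(tokens, toy_vocab)
print(missing)  # ['pricerange']
```

Running the same check with the real vocabulary over the whole training set would show whether other slot names are also missing, which would point to how the vocabulary was built and not to the tokenizer itself.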