Commit

Return tokens in roberta sequence labeling
arxyzan committed Aug 16, 2023
1 parent 330a91a commit 5a08577
Showing 1 changed file with 10 additions and 1 deletion.
@@ -74,7 +74,16 @@ def preprocess(self, inputs: Union[str, List[str]], **kwargs):
 normalizer = self.preprocessor["text_normalizer"]
 inputs = normalizer(inputs)
 tokenizer = self.preprocessor[self.tokenizer_name]
-inputs = tokenizer(inputs, return_tensors="pt", device=self.device)
+inputs = tokenizer(
+    inputs,
+    return_word_ids=True,
+    return_tokens=True,
+    return_offsets_mapping=True,
+    padding=True,
+    truncation=True,
+    return_tensors="pt",
+    device=self.device,
+)
 return inputs

 def post_process(self, inputs, **kwargs):
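
Note on the change above: a sequence labeling model predicts one label per subword token, so post-processing needs the token strings, word ids, and character offsets to map predictions back onto the original text. The following is only a minimal sketch of that idea, not the code in this repository. `toy_tokenize` and `labels_per_word` are hypothetical helpers, and the fixed 3-character split merely imitates subword tokenization.

```python
from typing import Dict, List, Optional, Tuple


def toy_tokenize(text: str, piece_len: int = 3) -> Dict[str, list]:
    """Hypothetical stand-in for a subword tokenizer (not the real one).

    Splits on whitespace, chops each word into fixed-size pieces, and records
    the token strings, the index of the word each token came from, and the
    character span of each token in the original text.
    """
    tokens: List[str] = []
    word_ids: List[int] = []
    offsets: List[Tuple[int, int]] = []
    cursor = 0
    for word_id, word in enumerate(text.split()):
        start = text.index(word, cursor)  # locate the word in the raw text
        for i in range(0, len(word), piece_len):
            piece = word[i:i + piece_len]
            tokens.append(piece)
            word_ids.append(word_id)
            offsets.append((start + i, start + i + len(piece)))
        cursor = start + len(word)
    return {"tokens": tokens, "word_ids": word_ids, "offsets_mapping": offsets}


def labels_per_word(word_ids: List[int], token_labels: List[str]) -> List[str]:
    """Collapse per-token labels to per-word labels (first token of a word wins)."""
    result: List[str] = []
    previous: Optional[int] = None
    for word_id, label in zip(word_ids, token_labels):
        if word_id != previous:
            result.append(label)
            previous = word_id
    return result


text = "Tehran is big"
encoded = toy_tokenize(text)
# Made-up per-token predictions, one label per token in encoded["tokens"].
token_labels = ["B-LOC", "I-LOC", "O", "O"]
word_labels = labels_per_word(encoded["word_ids"], token_labels)
# Offsets recover the exact source characters covered by each token.
spans = [text[start:end] for start, end in encoded["offsets_mapping"]]
```

Without word ids, the two pieces of "Tehran" would be reported as two separate labeled items. Without offsets, the predictions could not be tied back to positions in the input string.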