For my token classification task, the labels predicted by the explainer differ from those returned by the pipeline: some tokens that should receive an entity label are predicted as 'O' by the explainer instead. Even after setting ignored_labels=['O'], those tokens still appear in the visualization (only the tokens that are genuinely 'O' get excluded) and are still displayed as 'O'.
Pipeline prediction:

[{'end': 50,
  'entity_group': 'OrderAndDelivery',
  'score': 0.91371477,
  'start': 4,
  'word': 'colis a été marqué comme livré alors que je ne'},
 {'end': 62,
  'entity_group': 'OrderAndDelivery',
  'score': 0.6080048,
  'start': 56,
  'word': 'jamais'}]
Explainer: (screenshot of the explainer visualization)
Has anyone else experienced this issue or found a solution?
Below is the code:
from pprint import pprint

from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer, pipeline
from transformers_interpret import TokenClassificationExplainer

# Load the fine-tuned CamemBERT NER model and its tokenizer.
config = AutoConfig.from_pretrained('models/ner_model_camembert_v7')
max_length = 120
model = AutoModelForTokenClassification.from_pretrained('models/ner_model_camembert_v7', config=config)
tokenizer = AutoTokenizer.from_pretrained(
    'models/ner_model_camembert_v7', config=config, truncation=True,
    return_offsets_mapping=True, padding="max_length", max_length=max_length
)
model.eval()

ner_explainer = TokenClassificationExplainer(model, tokenizer)

sample_text = "Mon colis a été marqué comme livré alors que je ne l ai jamais reçu"

# Explainer prediction: some tokens come out as 'O' here even though the
# pipeline labels them, and ignored_labels=['O'] does not hide them.
word_attributions = ner_explainer(sample_text, ignored_labels=['O'])

# Pipeline prediction (output shown above).
pipe = pipeline("token-classification", model=model, aggregation_strategy="simple", tokenizer=tokenizer)
output_model = pipe(sample_text)
pprint(output_model)

ner_explainer.visualize()
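One way to narrow down which side diverges (a minimal diagnostic sketch, not part of the original report; it reuses the model, tokenizer, and sample_text defined above) is to print the raw per-token argmax labels straight from the model, bypassing both the explainer and the pipeline:

import torch

# Run the model directly and print each token's argmax label.
inputs = tokenizer(sample_text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)
pred_ids = logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred_id in zip(tokens, pred_ids):
    print(f"{token:>15} -> {model.config.id2label[pred_id.item()]}")

If these raw labels already match the pipeline output, the discrepancy lies in the explainer's own forward pass or its handling of subword tokens; if they disagree, it comes from the pipeline's aggregation_strategy="simple" merging subword predictions into entity groups.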