Fix doc example #16448
Conversation
@@ -269,9 +269,10 @@ def _prepare_output_docstrings(output_type, config_class, min_indent=None):
     ```python
     >>> # To train a model on `num_labels` classes, you can pass `num_labels=num_labels` to `.from_pretrained(...)`
     >>> num_labels = len(model.config.id2label)
-    >>> model = {model_class}.from_pretrained("{checkpoint}", num_labels=num_labels)
+    >>> model = {model_class}.from_pretrained(
I think we should revert this. Why wouldn't it work?
I had a previous conversation with @NielsRogge on Slack. He was using celine98/canine-s-finetuned-sst2, which has `"problem_type": "single_label_classification"` set in the config.

Because of this setting, in the following block,
transformers/src/transformers/models/canine/modeling_canine.py
Lines 1322 to 1329 in d55fcbc
```python
if labels is not None:
    if self.config.problem_type is None:
        if self.num_labels == 1:
            self.config.problem_type = "regression"
        elif self.num_labels > 1 and (labels.dtype == torch.long or labels.dtype == torch.int):
            self.config.problem_type = "single_label_classification"
        else:
            self.config.problem_type = "multi_label_classification"
```
the branch guarded by `self.config.problem_type is None` is not run, so the problem type stays `single_label_classification`, and the output is therefore not compatible with the provided labels (which are meant for the multi-label case here).
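For what it's worth, here is a minimal, self-contained PyTorch sketch of why the two loss paths clash (the sizes and the try/except demonstration are mine, not from the PR):

```python
import torch

# Hypothetical sizes for illustration: batch_size=1, num_labels=2.
num_labels = 2
logits = torch.randn(1, num_labels)  # model output of shape (batch_size, num_labels)

# Labels as produced by the doc example: one-hot floats of shape (batch_size, num_labels).
labels = torch.nn.functional.one_hot(torch.tensor([1]), num_classes=num_labels).to(torch.float)

# single_label_classification path (CrossEntropyLoss over flattened labels):
# labels.view(-1) has shape (batch_size * num_labels,) and float dtype, which
# CrossEntropyLoss rejects for these logits, hence the error in the doc example.
try:
    torch.nn.CrossEntropyLoss()(logits.view(-1, num_labels), labels.view(-1))
except (RuntimeError, ValueError) as err:
    print(f"single-label path fails: {err}")

# multi_label_classification path (BCEWithLogitsLoss) accepts the same labels directly.
print(torch.nn.BCEWithLogitsLoss()(logits, labels))
```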
For RoBERTa, cardiffnlp/twitter-roberta-base-emotion is used, and `problem_type` is not set in the config. Therefore, the model code is able to set it to `multi_label_classification`.
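A quick way to see the difference between the two checkpoints (model IDs are from this thread; requires network access to the Hub):

```python
from transformers import AutoConfig

# problem_type is preset in this checkpoint's config, so the auto-detection
# branch quoted above is skipped:
print(AutoConfig.from_pretrained("celine98/canine-s-finetuned-sst2").problem_type)
# -> single_label_classification

# problem_type is left unset here, so the model can infer it from the labels:
print(AutoConfig.from_pretrained("cardiffnlp/twitter-roberta-base-emotion").problem_type)
# -> None
```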
Happy with "force-passing" `single_label_classification` as a flag here to overwrite default configs.
You mean `multi_label_classification`? Is it OK to keep this change?
I think it is the other way around:

- celine98/canine-s-finetuned-sst2 has `problem_type` set to `single_label_classification` in its config,
- but this block in `PT_SEQUENCE_CLASSIFICATION_SAMPLE`

  transformers/src/transformers/utils/doc.py
  Lines 269 to 282 in d77680e

  ```python
  >>> # To train a model on `num_labels` classes, you can pass `num_labels=num_labels` to `.from_pretrained(...)`
  >>> num_labels = len(model.config.id2label)
  >>> model = {model_class}.from_pretrained(
  ...     "{checkpoint}", num_labels=num_labels, problem_type="multi_label_classification"
  ... )

  >>> labels = torch.nn.functional.one_hot(torch.tensor([predicted_class_id]), num_classes=num_labels).to(
  ...     torch.float
  ... )

  >>> loss = model(**inputs, labels=labels).loss
  >>> loss.backward()  # doctest: +IGNORE_RESULT
  ```

  is meant for `multi_label_classification`. See `labels = torch.nn.functional.one_hot`.
- That's why @NielsRogge needs to add `problem_type="multi_label_classification"` in the call to `from_pretrained`.
I will wait for @NielsRogge to join this discussion, since he knows the reason behind his change better than I do.
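To make the fixed template concrete, here is roughly what it expands to for the RoBERTa checkpoint above (the input text is my own; the lines computing `predicted_class_id` and `inputs` stand in for the earlier part of the same docstring sample):

```python
import torch
from transformers import AutoTokenizer, RobertaForSequenceClassification

checkpoint = "cardiffnlp/twitter-roberta-base-emotion"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = RobertaForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer("I love this movie!", return_tensors="pt")
with torch.no_grad():
    predicted_class_id = model(**inputs).logits.argmax().item()

# The fix: force problem_type so BCEWithLogitsLoss is used regardless of what
# the checkpoint's config may already contain.
num_labels = len(model.config.id2label)
model = RobertaForSequenceClassification.from_pretrained(
    checkpoint, num_labels=num_labels, problem_type="multi_label_classification"
)

labels = torch.nn.functional.one_hot(
    torch.tensor([predicted_class_id]), num_classes=num_labels
).to(torch.float)
loss = model(**inputs, labels=labels).loss
loss.backward()
```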
Well, he was faster than my response ...
Ah I see, sorry, yes you're right - OK to keep the change for me then!
No no, all good, this makes perfect sense. I misunderstood here.
What does this PR do?

This PR fixes the doc example of the `xxxForSequenceClassification` models. I wonder how this test currently passes, because for me it returned an error: the labels are of shape `(batch_size, num_labels)`, but `problem_type` wasn't set to `"multi_label_classification"`.