`metadata_update()` duplicates metrics when updating a single field from `None` #1210

lewtun · 2022-11-20T20:13:42Z

Describe the bug

When updating evaluation metadata with the metadata_update() function, it seems that updating a field value from None to some other value results in the metrics being duplicated.

See this Hub PR for an example, where I've simply changed the value of verify_token from None to 1234. One sees that e.g. accuracy is duplicated to include a new "evaluation" element where the verify_token field is not None.

Intuitively, I would expect that updating an attribute of EvalResult would simply update in-place.

Reproduction

Here's the code snippet the produced the above PR (feel free to run it without fear :)):

from huggingface_hub import model_info, metadata_update
from huggingface_hub.repocard_data import model_index_to_eval_results, eval_results_to_model_index

model_id = "autoevaluate/binary-classification"
metadata = model_info(model_id)
model_index = metadata.cardData["model-index"]
_, eval_results = model_index_to_eval_results(model_index)

# Change verify_token from `None` to dummy value
for eval_result in eval_results:
    if eval_result.verified is True:
        eval_result.verify_token = "1234"
        
new_metadata = {"model-index": eval_results_to_model_index(model_id, eval_results)}        
metadata_update(model_id, metadata=new_metadata, overwrite=True, create_pr=True, commit_message="Test update")

Logs

No response

System info

- huggingface_hub version: 0.11.0
- Platform: Linux-4.19.0-22-cloud-amd64-x86_64-with-glibc2.10
- Python version: 3.8.13
- Running in iPython ?: Yes
- iPython shell: ZMQInteractiveShell
- Running in notebook ?: Yes
- Running in Google Colab ?: No
- Token path ?: /home/lewis_huggingface_co/.huggingface/token
- Has saved token ?: True
- Who am I ?: lewtun
- Configured git credential helpers: store
- FastAI: N/A
- Tensorflow: N/A
- Torch: 1.11.0
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A

The text was updated successfully, but these errors were encountered:

lewtun · 2022-11-20T21:16:47Z

Ah, I think I know what causes this: in this line we check if an eval is equal for every field except metric_value. Since verify_token differs in the above example, the updated eval is treated as distinct and thus appended.

For context: verify_token will change whenever metric_value changes, so being able to update this value as well makes sense IMO. We currently have a lot of "verified" evaluations where verify_token is None (because it didn't exist when we implemented the evaluation service) and we need a way to avoid duplicating evaluations on each update.

Perhaps a better approach would be to implement something more generic like

class EvalResult:
    ...
    def is_equal_except_attribute(self, other, attribute="metric_value"):
        """
        Return True if `self` and `other` describe exactly the same attribute but with a
        different value.
        """
        for key, _ in self.__dict__.items():
            if key == attribute:
                continue
            if getattr(self, key) != getattr(other, key):
                return False
        return True

Then in metadata_update() we can have two if-statements: one to check metric_value, the other to check verify_token

Edit: on second thought, I don't think we need this extension. Since verify_token will be updated whenever metric_value is (it's basically a hash of the metrics), we can achieve this by updating verify_token in the same if-statement we currently use in metadata_update(). I'll open a PR :)

I think the logic is a bit more complex than I anticipated. Here's a few conditions we need to satisfy:

If metric is self-reported and only metric_value changes => update metric_value
If metric is from evaluation service and metric_value unchanged but verify_token is None => update verify_token
If metric is from evaluation service and metric_value changes => update both metric_value and verify_token (since they're coupled)

lewtun added the bug Something isn't working label Nov 20, 2022

lewtun mentioned this issue Nov 21, 2022

Fix metadata_update for verified evaluations #1214

Merged

1 task

lewtun closed this as completed in #1214 Nov 29, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`metadata_update()` duplicates metrics when updating a single field from `None` #1210

`metadata_update()` duplicates metrics when updating a single field from `None` #1210

lewtun commented Nov 20, 2022

lewtun commented Nov 20, 2022 •

edited

Loading

metadata_update() duplicates metrics when updating a single field from None #1210

metadata_update() duplicates metrics when updating a single field from None #1210

Comments

lewtun commented Nov 20, 2022

Describe the bug

Reproduction

Logs

System info

lewtun commented Nov 20, 2022 • edited Loading

`metadata_update()` duplicates metrics when updating a single field from `None` #1210

`metadata_update()` duplicates metrics when updating a single field from `None` #1210

lewtun commented Nov 20, 2022 •

edited

Loading