Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metadata_update() duplicates metrics when updating a single field from None #1210

Closed
lewtun opened this issue Nov 20, 2022 · 1 comment · Fixed by #1214
Closed

metadata_update() duplicates metrics when updating a single field from None #1210

lewtun opened this issue Nov 20, 2022 · 1 comment · Fixed by #1214
Labels
bug Something isn't working

Comments

@lewtun
Copy link
Member

lewtun commented Nov 20, 2022

Describe the bug

When updating evaluation metadata with the metadata_update() function, it seems that updating a field value from None to some other value results in the metrics being duplicated.

See this Hub PR for an example, where I've simply changed the value of verify_token from None to 1234. One sees that e.g. accuracy is duplicated to include a new "evaluation" element where the verify_token field is not None.

Intuitively, I would expect that updating an attribute of EvalResult would simply update in-place.

Reproduction

Here's the code snippet the produced the above PR (feel free to run it without fear :)):

from huggingface_hub import model_info, metadata_update
from huggingface_hub.repocard_data import model_index_to_eval_results, eval_results_to_model_index

model_id = "autoevaluate/binary-classification"
metadata = model_info(model_id)
model_index = metadata.cardData["model-index"]
_, eval_results = model_index_to_eval_results(model_index)

# Change verify_token from `None` to dummy value
for eval_result in eval_results:
    if eval_result.verified is True:
        eval_result.verify_token = "1234"
        
new_metadata = {"model-index": eval_results_to_model_index(model_id, eval_results)}        
metadata_update(model_id, metadata=new_metadata, overwrite=True, create_pr=True, commit_message="Test update")

Logs

No response

System info

- huggingface_hub version: 0.11.0
- Platform: Linux-4.19.0-22-cloud-amd64-x86_64-with-glibc2.10
- Python version: 3.8.13
- Running in iPython ?: Yes
- iPython shell: ZMQInteractiveShell
- Running in notebook ?: Yes
- Running in Google Colab ?: No
- Token path ?: /home/lewis_huggingface_co/.huggingface/token
- Has saved token ?: True
- Who am I ?: lewtun
- Configured git credential helpers: store
- FastAI: N/A
- Tensorflow: N/A
- Torch: 1.11.0
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A
@lewtun lewtun added the bug Something isn't working label Nov 20, 2022
@lewtun
Copy link
Member Author

lewtun commented Nov 20, 2022

Ah, I think I know what causes this: in this line we check if an eval is equal for every field except metric_value. Since verify_token differs in the above example, the updated eval is treated as distinct and thus appended.

For context: verify_token will change whenever metric_value changes, so being able to update this value as well makes sense IMO. We currently have a lot of "verified" evaluations where verify_token is None (because it didn't exist when we implemented the evaluation service) and we need a way to avoid duplicating evaluations on each update.

Perhaps a better approach would be to implement something more generic like

class EvalResult:
    ...
    def is_equal_except_attribute(self, other, attribute="metric_value"):
        """
        Return True if `self` and `other` describe exactly the same attribute but with a
        different value.
        """
        for key, _ in self.__dict__.items():
            if key == attribute:
                continue
            if getattr(self, key) != getattr(other, key):
                return False
        return True

Then in metadata_update() we can have two if-statements: one to check metric_value, the other to check verify_token

Edit: on second thought, I don't think we need this extension. Since verify_token will be updated whenever metric_value is (it's basically a hash of the metrics), we can achieve this by updating verify_token in the same if-statement we currently use in metadata_update(). I'll open a PR :)

I think the logic is a bit more complex than I anticipated. Here's a few conditions we need to satisfy:

  1. If metric is self-reported and only metric_value changes => update metric_value
  2. If metric is from evaluation service and metric_value unchanged but verify_token is None => update verify_token
  3. If metric is from evaluation service and metric_value changes => update both metric_value and verify_token (since they're coupled)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant