Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for 'upload_to_hf_hub()' path mismatch with 'save()' #3977

Merged
merged 16 commits into from
Mar 24, 2024

Conversation

sanjaydasgupta
Copy link
Contributor

@sanjaydasgupta sanjaydasgupta commented Mar 22, 2024

The code in this PR provides a solution for this issue: upload_to_hf_hub model path mismatch with model.save while also remaining backward compatible with the previous behavior of upload_to_hf_hub.

With this change, upload_to_hf_hub can accept any of the following kinds of path as its model_path argument:

  1. a path to the experiment folder (existing implementation).
  2. a path to the model folder (added by this PR).

The names experiment and model refer to places in the folder hierarchy created when a model is trained:

results / experiment / model / model_weights

The model folder is additionally distinguished by containing the model_hyperparameters.json file. The model_weights folder contains one or more of these files: pytorch_model.bin, adapter_model.bin, adapter_model.safetensors.

The changes to upload_to_hf_hub() cause it to test the model_path argument to determine which kind of path was passed. If the path is a model folder, then the parent path (the corresponding experiment path) is obtained and used instead. So the logic of upload_to_hf_hub and its supporting functions do not have to change at all. It uses that knowledge for its own purposes and also passes it down to all supporting functions (as an additional argument model_path_is_experiment_path: bool = True) for their use. The default value of model_path_is_experiment_path has been chosen to ensure the backward compatible behavior.

A unit test has been added (test_upload_to_hf_hub__validate_upload_parameters2()) for the new use-case. And the feature has been integration tested with custom code on a local build.

Copy link

github-actions bot commented Mar 22, 2024

Unit Test Results

  6 files  ±0    6 suites  ±0   13m 40s ⏱️ -45s
12 tests ±0    7 ✔️  -   2    5 💤 +  2  0 ±0 
60 runs  ±0  30 ✔️  - 12  30 💤 +12  0 ±0 

Results for commit aa1750e. ± Comparison against base commit b2cb8b0.

This pull request skips 2 tests.
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[ames_housing.gbm.yaml]
tests.regression_tests.benchmark.test_model_performance ‑ test_performance[mercedes_benz_greener.gbm.yaml]

♻️ This comment has been updated with latest results.

@sanjaydasgupta
Copy link
Contributor Author

Hi @alexsherstinsky, please take a look. Thanks.

ludwig/api.py Outdated
The description of the generated commit

# Returns

:return: (bool) True for success, False for failure.
"""
if os.path.exists(os.path.join(model_path, "model", "model_weights")) and os.path.exists(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanjaydasgupta The idea/direction seems great! I just have a couple of minor suggestions for you to consider (because if they work, they might improve the code quality slightly):

  1. Can we create the constants for the two directory names, "model" and "model_weights" and replace codebase-wise?
  2. Is it possible instead of passing the boolean model_path_is_experiment_path, determine the "experiments" directory and standardize all methods on using it -- so the "experiment" directory would be passed to them. I think that even if a function is passed the "model" directory, and in those cases we have to make the assumption that "experiments" is "model/..", this would still be reasonable and acceptable. What do you think?

Thank you very much!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexsherstinsky, responding to your (2) above - sounds like a great idea! Let me take a closer look at the code.

Copy link
Contributor Author

@sanjaydasgupta sanjaydasgupta Mar 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alexsherstinsky, both your suggestions have been implemented.

There was already a label MODEL_WEIGHTS_FILE_NAME defined (and used in some places) for the literal string "model_weights". So the choice of a label for "model" fell naturally on MODEL_FILE_NAME -- not the best, but there it is for the time being.

ludwig/api.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@alexsherstinsky alexsherstinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanjaydasgupta This is great! I left some comments/suggestions to see if we can make it even better -- thank you!

Copy link
Collaborator

@alexsherstinsky alexsherstinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanjaydasgupta I worry that it may not be safe to change a global constant as it may adversely impact backward compatibility (as in -- a user might rely on it). Is there any way for us in the present PR to constrain the changes in such a way that we do not modify the existing constants but reuse them and define a new one, if needed? Thank you!

@sanjaydasgupta
Copy link
Contributor Author

sanjaydasgupta commented Mar 23, 2024

@sanjaydasgupta I worry that it may not be safe to change a global constant as it may adversely impact backward compatibility (as in -- a user might rely on it). Is there any way for us in the present PR to constrain the changes in such a way that we do not modify the existing constants but reuse them and define a new one, if needed? Thank you!

I think there is a misunderstanding, let me clarify what I meant:

The global constant MODEL_WEIGHTS_FILE_NAME (for the literal string “model_weights”) existed previously, and continues to be used to denote “model_weights”. I have not changed that. In fact I have replaced “model_weights” with the global constant in some more places in the code.

I added the (previously unused) global constant MODEL_FILE_NAME for the literal string “model”. Most occurrences of “model” in the code have been replaced with the global constant. A few instances of “model” remain — their context was different (e.g. repository type etc.)

My comment about their being poor choices was due to the use of “FILE” instead of “DIRECTORY” or “FOLDER”. Both constants refer to folders, not file-names.

Please let me know if I misunderstood your comment. Thanks.

Copy link
Collaborator

@alexsherstinsky alexsherstinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sanjaydasgupta You are correct (I just got concerned about a 38-module change!). Please sync with the latest "master" branch, and I will approve. Thank you very much!

@sanjaydasgupta
Copy link
Contributor Author

@sanjaydasgupta You are correct (I just got concerned about a 38-module change!). Please sync with the latest "master" branch, and I will approve. Thank you very much!

Hi @alexsherstinsky this PR is now ready for merge.

Copy link
Collaborator

@alexsherstinsky alexsherstinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Beautiful enhancement! Thank you very much, @sanjaydasgupta!

@alexsherstinsky alexsherstinsky merged commit 4b07ce4 into ludwig-ai:master Mar 24, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants