Check model files when cached model folder exists #602
One caveat, assuming the fix is correct: I didn't explicitly delete the existing incomplete folder, and I don't know whether the download will overwrite it. Please let me know if such a deletion is needed.
Anyone willing to review? Thanks!
Sorry for the delay, Jin. Thank you for taking a look into this issue. Instead of checking for the presence of specific files, we could just check whether the directory is empty or not:
  if tf_v1.gfile.Exists(module_dir) and tf_v1.gfile.ListDirectory(module_dir):
Also, even with the check adjusted, it looks like Rename() below would fail with "already exists", so we should probably also remove module_dir there if it's empty:
  if not tf_v1.gfile.ListDirectory(module_dir):
Let's also add tests to cover the case of trying to download into an empty directory.
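The two checks suggested in this comment can be sketched with the Python standard library standing in for `tf_v1.gfile`. This is only an illustration under stated assumptions: `is_cached` and `prepare_target` are hypothetical names, and `os.path.isdir`, `os.listdir`, and `os.rmdir` are assumed to play the roles of `Exists`, `ListDirectory`, and the directory removal.

```python
import os


def is_cached(module_dir):
    """Treat a module as cached only if its directory exists and is non-empty."""
    return os.path.isdir(module_dir) and bool(os.listdir(module_dir))


def prepare_target(module_dir):
    """Remove an empty leftover directory so a later rename into place can succeed.

    Hypothetical helper: a non-empty or missing directory is left untouched.
    """
    if os.path.isdir(module_dir) and not os.listdir(module_dir):
        os.rmdir(module_dir)
```

With this shape, an empty cache folder is treated as a cache miss and is cleared before the freshly downloaded directory is moved into place.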
@akhorlin Thanks for the detailed reply! I will modify accordingly. One question: is it possible for the system to delete some files (e.g. the model files, due to their large size) but leave some tiny files (metadata) in a /tmp folder? If this is not an issue, then your solution will work great.
I haven't seen such a case of partial deletion but I haven't seen the case
of empty model_dir either.
If we want to handle partial deletion, it would require a more
fundamental change to the caching protocol. One option is to piggyback on
the descriptor file. It gets generated after the directory has been
downloaded. The current algorithm (as the comment states) doesn't rely on the
presence of this file. We could look into adjusting the protocol to
re-download the directory if the descriptor file is missing.
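A minimal sketch of the descriptor-based idea, using only the standard library. Everything here is an assumption for illustration: the helper names are hypothetical, and the descriptor is assumed to sit next to the module directory with a `.descriptor.txt` suffix and to be written only after the download has finished.

```python
import os


def descriptor_path(module_dir):
    """Assumed layout: the descriptor file lives beside the module directory."""
    return module_dir + ".descriptor.txt"


def needs_download(module_dir):
    """Re-download unless both the directory and its descriptor are present."""
    return not (os.path.isdir(module_dir)
                and os.path.isfile(descriptor_path(module_dir)))


def mark_complete(module_dir, handle):
    """Write the descriptor last, so its presence implies a finished download."""
    with open(descriptor_path(module_dir), "w") as f:
        f.write("Module: %s\n" % handle)
```

Note that this only catches a cleanup that also removed the descriptor file. If the cleaner deletes files inside the directory but keeps the descriptor, the cache would still look valid.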
On Fri, Jun 5, 2020 at 5:44 PM Jin Dong wrote:
@akhorlin <https://github.com/akhorlin> Thanks for the detailed reply! I
will modify accordingly.
One question is that if it's possible for the system to delete some files
(e.g. the model files, due to large size) but leave some tiny files
(metadata) in a /tmp folder. If this is not an issue, then your solution
will work great.
Okay, maybe we can check whether the folder is empty first and add a test accordingly. I think this should solve most of the corner cases that I and others encountered in the issue mentioned above.
If it's easy to reproduce on your end, it would be worthwhile to double
check that the directory is indeed empty. The module directory contains
subdirectories sometimes (e.g. /assets or /variables). So I am not sure
whether these sub-directories stay around or not on your system.
I am not sure how to reproduce the issue on my end since my /tmp gets
cleaned up completely after the restart.
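The subdirectory concern can be made concrete: a module directory that holds only empty `assets` or `variables` folders would presumably still look non-empty to a plain directory listing. A minimal sketch of a stricter check, with the standard library standing in for `tf_v1.gfile` and a hypothetical helper name `contains_any_file`:

```python
import os


def contains_any_file(module_dir):
    """Return True if any non-directory entry exists anywhere under module_dir."""
    for _dirpath, _dirnames, filenames in os.walk(module_dir):
        if filenames:
            return True
    # Missing directory, empty directory, or only empty subdirectories.
    return False
```

This walks the whole tree, so it costs more than a single listing, which may matter on remote file systems.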
On Fri, Jun 5, 2020 at 5:55 PM Jin Dong wrote:
okay, maybe we can check if the folder is empty first and add test
accordingly. I think this should solve most of the corner cases encountered
by me and others in the issue mentioned above.
I did check it when the issue happened. The folder was indeed empty at that time.
@akhorlin I changed the checking logic as mentioned above. Can you help me add a test for this case? I looked into the
For testing, I think we can follow the pattern in resolver_test.py. They are indeed not e2e tests but they can exercise the logic in question. So the test could look something like:
We do have internal e2e tests that use the handle directly. So we could add something there as well once the changes are in. |
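As an illustration only (not the snippet from this comment), a unit test for the empty-directory case might look roughly like the sketch below. `fake_atomic_download` is a hypothetical stand-in for the resolver logic, built on the standard library instead of `tf_v1.gfile`, so that the example runs on its own.

```python
import os
import shutil
import tempfile
import unittest


def fake_atomic_download(module_dir, download_fn):
    """Hypothetical stand-in: reuse a non-empty cache, otherwise download."""
    if os.path.isdir(module_dir) and os.listdir(module_dir):
        return module_dir
    if os.path.isdir(module_dir):
        os.rmdir(module_dir)  # Empty leftover; clear it before populating.
    tmp_dir = module_dir + ".tmp"
    os.makedirs(tmp_dir)
    download_fn(tmp_dir)
    os.rename(tmp_dir, module_dir)
    return module_dir


def write_file(path):
    """Fake download function that writes a single file into `path`."""
    with open(os.path.join(path, "file"), "w") as f:
        f.write("content")


class EmptyDirTest(unittest.TestCase):

    def setUp(self):
        root = tempfile.mkdtemp()
        self.addCleanup(shutil.rmtree, root)
        self.module_dir = os.path.join(root, "module")

    def test_download_into_empty_dir(self):
        os.mkdir(self.module_dir)  # Simulate a cache folder that was emptied.
        fake_atomic_download(self.module_dir, write_file)
        self.assertEqual(os.listdir(self.module_dir), ["file"])

    def test_non_empty_dir_is_reused(self):
        os.mkdir(self.module_dir)
        write_file(self.module_dir)

        def fail(_path):
            raise AssertionError("download should not be called")

        fake_atomic_download(self.module_dir, fail)
        self.assertEqual(os.listdir(self.module_dir), ["file"])
```

The test can be run with `python -m unittest` from the directory that contains the file.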
@akhorlin I added a test. Please check : ) |
tensorflow_hub/resolver.py
Outdated
@@ -380,9 +380,12 @@ def atomic_download(handle,
overwrite=False)
# Must test condition again, since another process could have created
# the module and deleted the old lock file since last test.
- if tf_v1.gfile.Exists(module_dir):
+ if tf_v1.gfile.Exists(module_dir) and \
Line continuations with \ are not allowed in Google's Python style guide; let's do
if (tf_v1.gfile.Exists(module_dir) and
tf_v1.gfile.ListDirectory(module_dir)):
...
tensorflow_hub/resolver_test.py
Outdated
os.path.join(module_dir, "file"), "content", False)

# Delete existing folder and create an empty one.
if tf_v1.gfile.Exists(module_dir):
That shouldn't be the case, since get_temp_dir() should return a unique directory. Maybe just self.assertFalse(tf_v1.gfile.Exists(...))
Overall looks good. Thank you for looking into this issue. I added a few minor comments; once they are resolved, I can integrate the change into our repo.
I fixed the comments. Please check : )
Looks good. I will work on integrating this change early next week.
The change has been integrated in c0610ab. Thank you for your contribution!
When loading a module from a TF-Hub link, the resolver only checks whether the model folder exists, not whether the model files exist inside it. This situation is quite likely, since the module is usually cached in /tmp, where the system may delete the model files but keep the model folder, which causes errors like the one in #575.