add md5 check for downloaded JGI file #9

Merged (1 commit) on Aug 11, 2021

Conversation

orangeSi
Contributor

No description provided.

@orangeSi
Contributor Author

Sometimes curl exits with code 0 after downloading a file even though the file is not intact, possibly because of a network problem. Validating the completeness of the downloaded file against its md5 value may therefore be needed, which is the reason for this pull request.
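
To illustrate the failure mode described above, here is a minimal sketch of a retry loop that trusts the checksum rather than the downloader's exit status. The function name `download_with_retry`, its parameters, and the `download` and `verify` callables are all hypothetical and are not taken from the script; the caller would plug in whatever download and md5 routines the script actually uses.

```python
import os


def download_with_retry(url, filename, expected_md5, download, verify, max_attempts=3):
    """Download `url` to `filename`, retrying until `verify` accepts the file.

    `download(url, filename)` and `verify(expected_md5, filename)` are
    hypothetical callables supplied by the caller (assumptions for this
    sketch, not functions from the script under review).
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        download(url, filename)
        # A clean exit from the downloader is not proof of an intact file,
        # so the checksum decides whether this attempt counts.
        if verify(expected_md5, filename):
            return attempt
        # Discard the bad file so a truncated copy is never left behind.
        if os.path.exists(filename):
            os.remove(filename)
    raise IOError(f"{filename}: md5 mismatch after {max_attempts} attempts")
```

Passing the two steps in as callables keeps the sketch independent of how the download is actually performed (curl subprocess or otherwise) and makes the retry logic easy to exercise with stand-ins.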

@orangeSi changed the title from "add md5 check for download JGI file" to "add md5 check for downloaded JGI file" on Aug 11, 2021
@glarue changed the base branch from master to md5 on August 11, 2021 16:11
@@ -577,32 +582,50 @@ def is_broken(filename, min_size_bytes=20):
    else:
        return False

def check_md5(md5, filename):
Owner

I'm going to replace this with a native Python implementation in the interest of cross-platform compatibility.
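
For reference, a native check along these lines might look like the sketch below. Only the signature `check_md5(md5, filename)` comes from the diff; the function body, the helper `file_md5`, and the `chunk_size` parameter are assumptions for illustration and are not the implementation that was eventually merged. It relies only on the standard-library `hashlib` module, so it does not depend on an external `md5sum` or `md5` binary being present.

```python
import hashlib


def file_md5(filename, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in fixed-size chunks.

    Hypothetical helper: reading in chunks keeps memory use flat even
    for very large downloads.
    """
    digest = hashlib.md5()
    with open(filename, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object
        # that read() returns at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_md5(md5, filename):
    """Return True if the file's MD5 matches the expected hex string."""
    # Normalize case and surrounding whitespace so the comparison does
    # not fail on formatting differences in the expected value.
    return file_md5(filename) == md5.strip().lower()
```

A caller could then treat a `False` result the same way as a failed download and fetch the file again.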

Owner

@glarue left a comment

Will make a few changes on the md5 branch and then merge into master.

@@ -219,7 +219,7 @@ def get_file_list(xml_file, filter_categories=False):
except KeyError:
continue
uid += 1

#print(f"descriptors={descriptors}") ##myth
Owner

Don't leave commented-out code in the PR.

Suggested change (delete this line):
#print(f"descriptors={descriptors}") ##myth

@@ -391,6 +393,9 @@ def print_data(data, org_name, display=True):
print_list.append("{}:".format(sub_cat))
for index, i in sorted(items.items()):
dict_to_get[catID][index] = i["url"]
url_to_md5[i["url"]] = i["md5"]
#print(f"md5={i['md5']}, url={i['url']}, catID={catID}, index={index}\n")#myth
Owner

Remove commented-out code.

Suggested change (delete this line):
#print(f"md5={i['md5']}, url={i['url']}, catID={catID}, index={index}\n")#myth

@glarue merged commit 900f61b into glarue:md5 on Aug 11, 2021
@glarue
Owner

glarue commented Aug 11, 2021

Commit d89dc10 integrates my implementation of the changes initiated by this PR.

@glarue
Owner

glarue commented Aug 11, 2021

> Sometimes curl exits with code 0 after downloading a file even though the file is not intact, possibly because of a network problem. Validating the completeness of the downloaded file against its md5 value may therefore be needed, which is the reason for this pull request.

I didn't say before, but this is a good feature I've been meaning to add for some time. Thanks for your initial work setting it up—I modified it slightly to match the style/approach of the script overall, but the general design was good.
