Plate list inconsistency #2
Thanks for reporting this. tl;dr - I recommend using the plate list reported in the paper and The numbers on the landing page of http://www.cellimagelibrary.org/pages/project_20269 are certainly misleading because they were later updated in the GigaDB paper, so you can ignore them (I will try to ask them to update the page). But all 406 plates should nonetheless be available via the Cell Image Library download link (which is in Regarding
Thanks for the reply. You can find This checksum file is very helpful for checking whether the downloaded metadata files are complete. It would be awesome if GigaDB could upload a similar checksum file for the raw image zip files as well. Regarding
Ah, so those md5's correspond to the (per plate) processed data – is that what you are referring to as metadata files? I can look up why only 349 show up. The md5's for the images would need to be provided by the Cell Image Library. I can request them but can't guarantee anything.
Yes, those md5's correspond to the processed data.
Got it. Do the md5's match up for the 349 that are available (except for Plate_25575, although the issue there seems to be different: the md5 is correct but the file cannot be extracted, right)?
Yes, the downloaded files have the same md5 fingerprint as shown in If a downloaded file has a different fingerprint, then it was likely corrupted during the download process. We just try to download it again.
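For anyone wanting to automate the check described above, here is a minimal sketch in Python. It assumes the checksum file uses the usual `md5sum` layout (one `<md5>  <filename>` pair per line), which may not match the actual file exactly. The helper names (`md5_of_file`, `parse_md5_listing`, `find_bad_downloads`) are made up for this example and are not part of the repository.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 of a file, reading it in chunks to keep memory low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_listing(text):
    """Parse md5sum-style text into {file name: md5}.

    Assumption: each useful line looks like "<md5>  <filename>"; lines with
    fewer than two fields are skipped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        checksum, name = parts[0], parts[-1].lstrip("*")
        expected[Path(name).name] = checksum.lower()
    return expected


def find_bad_downloads(expected, download_dir):
    """Return the sorted names that are missing or whose md5 does not match."""
    bad = []
    for name, checksum in sorted(expected.items()):
        path = Path(download_dir) / name
        if not path.is_file() or md5_of_file(path) != checksum:
            bad.append(name)
    return bad
```

Any file name returned by `find_bad_downloads` would be a candidate for downloading again, which is the manual procedure described above.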
md5_of_tar_gz_files.txt Meanwhile, I'll try to figure out 25575
The Plate_25575.tar.gz md5 should be fc10288f8826d8d15a73edbbc0e6b214, as listed in #2 (comment). The one at http://gigadb.org/dataset/view/id/100351 is wrong; I will fix it once you confirm that the rest are good.
LMK if you are able to download from https://s3.amazonaws.com/imaging-platform-collaborator/CDRP/Plate_25575.tar.gz
Thanks for providing The md5 for the new
Great, thanks for confirming. For our records: Shantanu has emailed gigascience to sort this out.
Thank you so much, Shantanu. I would also appreciate it if you could request a similar md5sum file from the Cell Image Library.
Hi,
Thanks @only1chunts
To close the loop on this, I just verified that #2 (comment) has been addressed. Thanks @only1chunts !
Hello, thanks so much for providing a script to download all images.
It seems the plate lists from different sources are quite different. In the paper and `download_cil_images.sh`, there are 406 plates. However, Cell Image Library writes there are 375 plates but only gives 373 links. It also lists many `20***` plates which are not listed in `download_cil_images.sh`. Giga DB also writes 406 plates, but the provided `md5sum.txt` has only 349 entries. I also notice that `download_cil_images.sh` has been edited, and Giga DB mentions 7 excluded plates.
@shntnu Which plate list do you recommend to use for analysis please?
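To make the discrepancy concrete, the plate lists could be compared with set differences. The sketch below is only an illustration: it assumes plate IDs are standalone five-digit numbers (as in `Plate_25575`), and the function names and source labels are hypothetical.

```python
import re

# Assumption: a plate ID is a run of exactly five digits, e.g. "25575".
PLATE_ID = re.compile(r"(?<!\d)(\d{5})(?!\d)")


def plate_ids(text):
    """Extract the set of five-digit plate IDs mentioned in a block of text."""
    return set(PLATE_ID.findall(text))


def compare_plate_lists(sources):
    """For each named source, list the plate IDs found elsewhere but missing from it.

    `sources` maps a label (for example "script" or "md5sum") to a set of IDs.
    """
    all_ids = set().union(*sources.values())
    return {name: sorted(all_ids - ids) for name, ids in sources.items()}
```

Feeding it the text of `download_cil_images.sh`, the Cell Image Library link list, and `md5sum.txt` would show which plates each source lacks relative to the union of all three.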