-
Notifications
You must be signed in to change notification settings - Fork 93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(listing): pick actually the smallest one to update #726
base: main
Are you sure you want to change the base?
Conversation
Deploying datachain-documentation with Cloudflare Pages
|
50cf27d
to
08e1e75
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #726 +/- ##
=======================================
Coverage 87.43% 87.43%
=======================================
Files 114 114
Lines 10967 10969 +2
Branches 1508 1509 +1
=======================================
+ Hits 9589 9591 +2
Misses 998 998
Partials 380 380
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
08e1e75
to
73b68b9
Compare
73b68b9
to
4f66965
Compare
@@ -840,13 +840,17 @@ def test_listing_stats(cloud_test_catalog): | |||
|
|||
catalog.enlist_source(f"{src_uri}/dogs/", update=True) | |||
stats = listing_stats(src_uri, catalog) | |||
assert stats.num_objects == 7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[C]: not sure why we expected here any change to the src_uri
listing, after listing the subset. It seems it is better to create a second listing for /dogs
in this case (or at least update the whole src_uri
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ilongin you might know this better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right and this was a mistake / bug in the listing. Creating new separate /dogs
seems like correct thing to do here.
4f66965
to
8ffe1d4
Compare
@@ -430,7 +430,11 @@ def parse_uri( | |||
if listings: | |||
if update: | |||
# choosing the smallest possible one to minimize update time | |||
listing = sorted(listings, key=lambda ls: len(ls.name))[0] | |||
listing = sorted(listings, key=lambda ls: len(ls.name), reverse=True)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ilongin why parse_uri
is exposed externally? also it's a bit of a strange name tbh. Could you please give more context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should not be exposed. Probably this can be removed from DataChain
completely in the first place ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kk, I'll move it
Partially addresses and stabilizes #725