fix(listing): pick actually the smallest one to update #726

Draft
wants to merge 1 commit into
base: main
from

Conversation

shcheklein
Member

Partially addresses and stabilizes #725

@shcheklein added the bug label (Something isn't working) on Dec 20, 2024
@shcheklein self-assigned this on Dec 20, 2024

cloudflare-workers-and-pages bot commented Dec 20, 2024

Deploying datachain-documentation with Cloudflare Pages

Latest commit: 8ffe1d4
Status: ✅  Deploy successful!
Preview URL: https://5a06633e.datachain-documentation.pages.dev
Branch Preview URL: https://fix-listing-selection.datachain-documentation.pages.dev



codecov bot commented Dec 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.43%. Comparing base (20c73b2) to head (8ffe1d4).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #726   +/-   ##
=======================================
  Coverage   87.43%   87.43%           
=======================================
  Files         114      114           
  Lines       10967    10969    +2     
  Branches     1508     1509    +1     
=======================================
+ Hits         9589     9591    +2     
  Misses        998      998           
  Partials      380      380           
Flag Coverage Δ
datachain 87.37% <100.00%> (+<0.01%) ⬆️


@@ -840,13 +840,17 @@ def test_listing_stats(cloud_test_catalog):

catalog.enlist_source(f"{src_uri}/dogs/", update=True)
stats = listing_stats(src_uri, catalog)
assert stats.num_objects == 7
Member Author


[C]: Not sure why we expected any change to the src_uri listing here, after listing only a subset. It seems better to create a second listing for /dogs in this case (or at least update the whole src_uri).

Member Author


cc @ilongin, you might know this better.

Contributor


I think you are right, and this was a mistake / bug in the listing. Creating a new, separate /dogs listing seems like the correct thing to do here.

@@ -430,7 +430,11 @@ def parse_uri(
 if listings:
     if update:
         # choosing the smallest possible one to minimize update time
-        listing = sorted(listings, key=lambda ls: len(ls.name))[0]
+        listing = sorted(listings, key=lambda ls: len(ls.name), reverse=True)[0]
Member Author


@ilongin why is parse_uri exposed externally? Also, it's a bit of a strange name tbh. Could you please give more context?

Contributor


Yes, this should not be exposed. It can probably be removed from DataChain completely in the first place ...

Member Author


Kk, I'll move it
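
For reference, a minimal sketch of what the one-line sort change in the parse_uri hunk does. The `Listing` class and the URIs below are hypothetical stand-ins, not DataChain's real types or naming; the sketch also assumes that a longer listing name means a narrower path prefix and therefore a smaller listing, which is what the code comment in the hunk implies.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """Hypothetical stand-in for a listing record; only `name` matters here."""

    name: str


# Hypothetical names: a bucket-wide listing and a narrower one beneath it.
listings = [
    Listing("s3://bucket/"),
    Listing("s3://bucket/dogs/"),
]

# Before the change: ascending sort by name length, so the shortest name
# (the broadest listing) comes first.
old_choice = sorted(listings, key=lambda ls: len(ls.name))[0]

# After the change: descending sort, so the longest name
# (the most specific listing) comes first.
new_choice = sorted(listings, key=lambda ls: len(ls.name), reverse=True)[0]

# Same selection without sorting the whole list.
alt_choice = max(listings, key=lambda ls: len(ls.name))

print(old_choice.name)
print(new_choice.name)
```

Under these assumptions the old expression selects the bucket-wide listing, while the new one selects the /dogs/ listing, matching the intent stated in the comment ("choosing the smallest possible one"). `max` with the same key is a possible simplification if the sorted list is not needed elsewhere.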

Labels
bug Something isn't working

2 participants