Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When metadata_url is provided, use it #63

Merged
merged 3 commits into from
Dec 3, 2024
Merged

Conversation

elray1
Copy link
Collaborator

@elray1 elray1 commented Nov 25, 2024

We are leaving this PR open for now to give Becky a chance to share input (generally, and also about how to test this change).

This is the test that we really want to have, but don't. The problem is
that checking that polars will happily give you a LazyFrame that
contains nothing at all.
@zkamvar
Copy link
Member

zkamvar commented Nov 25, 2024

I added a test that we didn't have and a note about a test that we want (which I would appreciate Becky's expertise on). FWIW, this was caused by new behavior added in #61 (specifically in f57af99) that allowed us to read from xz files.

The correct procedure was followed to ensure that the code was covered, but the bug that was introduced highlighted missing specificity for these tests. It turns out that a LazyFrame is still a LazyFrame whether or not you put anything inside it.

zkamvar added a commit to reichlab/variant-nowcast-hub that referenced this pull request Nov 25, 2024
NOTE: do not merge this until reichlab/cladetime#63 is merged
@bsweger
Copy link
Collaborator

bsweger commented Dec 2, 2024

@zkamvar Thanks for tackling this!

You're correct about the lack of specificity (to date, the mocked S3 test setup was mostly designed to test cladetime's ability to find the correct version of a file in S3, rather than using actual file content)

I'll work on incorporating better representations of data into the test setup.

My best hypothesis after some digging is that presigned urls are problematic w/ polars read/scan_csv methods because polars issues a head request followed by a get, and the presigned urls can only work with a single method.

It's up to you and @elray1...we can merge the patch sans tests so our scripts can work against the main branch while I'm working on this, or we can wait until the test suite is in proper shape.

@elray1
Copy link
Collaborator Author

elray1 commented Dec 2, 2024

I vote for merge now, test later. We all understand what the bug was and it is a clear fix. The annoyance of keeping things around pointing to branches and open PRs lingering seems more severe to me than the risk of introducing an error in this PR.

Copy link
Collaborator

@bsweger bsweger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving this with the understanding that we should fix the test specificity noted by @zkamvar as a fast follow.

Will leave the comments intact as a placeholder.

@bsweger bsweger merged commit 7e5e06f into main Dec 3, 2024
2 checks passed
@bsweger bsweger deleted the elr/use_metadata_url branch December 3, 2024 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants