Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote download: Names with dots #381

Merged
merged 7 commits into from
Jul 24, 2024
Merged

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented Jul 23, 2024

See commit messages.

Resolves #380.

Checklist

  • Checks pass

tsibley added 7 commits July 18, 2024 11:34
…into a separate function so that I can add doctests to it.
Demonstrate and validate our basic expectations.  In the next commit,
I'll add a failing test for a bug.
…destination bug

The root of the problem is the same: dots are not being handled
properly.  They're assumed to always indicate an extension, but we can
and should support non-extension dots.

Fails with:

    Expected:
        ['mpox.clade-IIb.json',
         'mpox.clade-IIb_root-sequence.json',
         'mpox.clade-IIb_tip-frequencies.json',
         'mpox.clade-IIb_measurements.json']
    Got:
        ['mpox.json', 'mpox_root-sequence.json', 'mpox_tip-frequencies.json', 'mpox_measurements.json']

and

    Expected:
        ['2022.04.29-ncov_omicron-BA-two.json',
         '2022.04.29-ncov_omicron-BA-two_root-sequence.json',
         '2022.04.29-ncov_omicron-BA-two_tip-frequencies.json',
         '2022.04.29-ncov_omicron-BA-two_measurements.json']
    Got:
        ['2022.04.json', '2022.json', '2022.json', '2022.json']

The latter failure reproduces a user report of issues downloading.  The
former was a closely-related failure case I noticed while debugging the
latter.

Related-to: <#380>
Avoids careless Path.with_suffix() usage.

Resolves: <#380>
…ar suffix handling

Mixed subresource extensions work due to the previous bug fix in
"remote.nextstrain_dot_org: Fix download dots-in-destination bug"
(5d9533f) but empty sidecar suffixes on non-primary subresources do not.

I noted this edge case when working on other fixes in this code.  We
don't currently run into this edge case, but it's good to consider and
prevent anyway to avoid future surprises if we ever do.  Our data model
makes it possible and it's not an unreasonable use case.

The first test fails with:

    Expected:
        ['baz.md', 'baz.json', 'baz_root-sequence.json']
    Got:
        ['baz.md', 'baz_.json', 'baz_root-sequence.json']

and the subsequent similar tests fail similarly.
…ling

Don't assume that all non-primary subresources will have sidecar suffixes.
@tsibley tsibley merged commit d897399 into master Jul 24, 2024
42 checks passed
@tsibley tsibley deleted the trs/remote-download/names-with-dots branch July 24, 2024 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nextstrain remote download produces bad local filenames when periods are in the dataset name
3 participants