Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nextstrain remote download produces bad local filenames when periods are in the dataset name #380

Closed
tsibley opened this issue Jul 17, 2024 · 2 comments · Fixed by #381
Closed
Assignees
Labels
bug Something isn't working source: discussion forum Issue mentioned on Nextstrain Discussions

Comments

@tsibley
Copy link
Member

tsibley commented Jul 17, 2024

$ nextstrain remote download https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two as 2022.04.json
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two (root-sequence) as 2022.json
Downloading https://nextstrain.org/groups/niph/2022.04.29-ncov/omicron-BA-two (tip-frequencies) as 2022.json
$ nextstrain version
nextstrain.cli 8.4.0

Note the local filenames! 2022.04.json and 2022.json (written twice!)

They should be:

2022.04.29-ncov_omicron-BA-two.json
2022.04.29-ncov_omicron-BA-two_root-sequence.json
2022.04.29-ncov_omicron-BA-two_tip-frequencies.json

Additional context

Reported on our discussion forum.

@tsibley tsibley added the bug Something isn't working label Jul 17, 2024
@tsibley
Copy link
Member Author

tsibley commented Jul 17, 2024

This seems like a bug in our use of Path objects to munge stems/extensions.

@tsibley
Copy link
Member Author

tsibley commented Jul 17, 2024

Specifically this logic:

if not subresource.primary:
destination = destination.with_name(f"{destination.with_suffix('').name}_{sidecar_suffix(subresource.media_type)}")
destination = destination.with_suffix(subresource.file_extension)

@victorlin victorlin added the source: discussion forum Issue mentioned on Nextstrain Discussions label Jul 18, 2024
@tsibley tsibley self-assigned this Jul 18, 2024
tsibley added a commit that referenced this issue Jul 23, 2024
…destination bug

The root of the problem is the same: dots are not being handled
properly.  They're assumed to always indicate an extension, but we can
and should support non-extension dots.

Fails with:

    Expected:
        ['mpox.clade-IIb.json',
         'mpox.clade-IIb_root-sequence.json',
         'mpox.clade-IIb_tip-frequencies.json',
         'mpox.clade-IIb_measurements.json']
    Got:
        ['mpox.json', 'mpox_root-sequence.json', 'mpox_tip-frequencies.json', 'mpox_measurements.json']

and

    Expected:
        ['2022.04.29-ncov_omicron-BA-two.json',
         '2022.04.29-ncov_omicron-BA-two_root-sequence.json',
         '2022.04.29-ncov_omicron-BA-two_tip-frequencies.json',
         '2022.04.29-ncov_omicron-BA-two_measurements.json']
    Got:
        ['2022.04.json', '2022.json', '2022.json', '2022.json']

The latter failure reproduces a user report of issues downloading.  The
former was a closely-related failure case I noticed while debugging the
latter.

Related-to: <#380>
tsibley added a commit that referenced this issue Jul 23, 2024
Avoids careless Path.with_suffix() usage.

Resolves: <#380>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working source: discussion forum Issue mentioned on Nextstrain Discussions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants