Allow data versioning #31

Closed
trvrb opened this issue Sep 9, 2016 · 3 comments
Labels
enhancement New feature or request proposal Proposals that warrant further discussion

Comments

@trvrb
Member

trvrb commented Sep 9, 2016

Version datasets with augur, e.g. /data/43922349852/flu_h2n2_3y_tree.json.

Then keep the version ID 43922349852 in auspice params. This would allow deep-linking back to a previous augur build.

Developing this would work like custom augur builds.
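
As a rough illustration of the path scheme proposed above, here is a minimal sketch that builds a versioned dataset path from a version ID. The function name `versioned_dataset_path` and the `kind` parameter are hypothetical and not part of augur or auspice; only the `/data/<version>/<dataset>_tree.json` layout comes from the example in this comment.

```python
def versioned_dataset_path(version_id: str, dataset: str, kind: str = "tree") -> str:
    """Build a path of the form /data/<version_id>/<dataset>_<kind>.json.

    Hypothetical helper: the layout mirrors the example in the issue,
    but nothing here is an actual augur or auspice API.
    """
    return f"/data/{version_id}/{dataset}_{kind}.json"


if __name__ == "__main__":
    # The version ID would be the value kept in the auspice params.
    print(versioned_dataset_path("43922349852", "flu_h2n2_3y"))
```

Under this sketch, deep-linking to an earlier build amounts to passing the stored version ID back into the same function.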

@trvrb trvrb added the Epic label Sep 9, 2016
@jameshadfield jameshadfield added enhancement New feature or request high priority and removed Epic labels Mar 2, 2018
@jameshadfield jameshadfield added proposal Proposals that warrant further discussion and removed high priority labels Oct 30, 2018
@trvrb
Member Author

trvrb commented Jan 22, 2021

We've effectively settled on a data versioning schema with ncov_global_2020-04-15.json / https://nextstrain.org/ncov/global/2020-04-15. We can continue to promote workflows that mirror this pattern. Is there anything else we'd like to do here? Or should we consider "data versioning" complete for the time being?
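
To make the naming pattern concrete, the following sketch maps a date-stamped dataset filename to its URL. It assumes that underscores in the filename correspond to slashes in the URL path, which is inferred only from the single example in this comment; the function name `dataset_filename_to_url` is hypothetical.

```python
def dataset_filename_to_url(filename: str, base: str = "https://nextstrain.org") -> str:
    """Map e.g. ncov_global_2020-04-15.json to <base>/ncov/global/2020-04-15.

    Assumption: each underscore in the filename separates one URL path
    segment, as in the example quoted in the issue.
    """
    suffix = ".json"
    stem = filename[: -len(suffix)] if filename.endswith(suffix) else filename
    return base + "/" + "/".join(stem.split("_"))


if __name__ == "__main__":
    print(dataset_filename_to_url("ncov_global_2020-04-15.json"))
```

Because the date uses hyphens rather than underscores, it survives as a single trailing path segment.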

@tsibley
Member

tsibley commented Jan 23, 2021

My 2¢ is that it's reasonable to promote explicit versions in dataset names for builds that want them. They're a simple approach for now that requires no central coordination.

That said, I do think there's more sophisticated versioning we could provide automatically and transparently (think S3 object versions or git history) and then make accessible via version selectors in the UI and URL (e.g. similar to how we support branch selectors in community URLs currently). The upshot is that versioning then "just happens" without buy-in from each build or pushing that complexity down into it. However, I don't think this issue needs to remain open to track that idea.
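
One way a URL version selector could be parsed is sketched below. The `@` separator and the function name `split_version_selector` are assumptions made purely for illustration; the thread does not specify any selector syntax, and this is not how nextstrain.org is known to implement it.

```python
from typing import Optional, Tuple


def split_version_selector(path: str) -> Tuple[str, Optional[str]]:
    """Split 'dataset/path@version' into (dataset path, version or None).

    Hypothetical syntax: an '@' followed by a version identifier.
    A missing or empty selector yields None, meaning "latest".
    """
    name, sep, version = path.partition("@")
    return name, (version if sep and version else None)


if __name__ == "__main__":
    print(split_version_selector("ncov/global@2020-04-15"))
    print(split_version_selector("ncov/global"))
```

The point of the sketch is that the version lives outside the dataset name, so individual builds would not need to encode it themselves.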

@jameshadfield
Member

> Is there anything else we'd like to do here? Or should we consider "data versioning" complete for the time being?

I'm happy running with time-stamped dataset names as a solution that everyone understands.

> That said, I do think there's more sophisticated versioning we could provide automatically and transparently (think S3 object versions or git history) and then make accessible via version selectors in the UI and URL (e.g. similar to how we support branch selectors in community URLs currently). The upshot is that versioning then "just happens" without buy in or pushing that complexity down into each build. However, I don't think this issue needs to remain open to track that idea.

Yup. This is nextstrain/nextstrain.org#196 and can be tracked there as it's a nextstrain.org implementation rather than auspice.
