Fix: PolarisFileSystem uses the dataset name instead of the dataset slug, causing a 404 #146

Merged
zhu0619 merged 8 commits into main from feat/zarr_upload
Jul 21, 2024

Conversation

zhu0619
Contributor

@zhu0619 zhu0619 commented Jul 21, 2024

Changelogs

  • Sluggify the dataset name in the PolarisFileSystem.
  • Serialize the cache_dir to a string.
  • Updated Zarr dataset tutorial to reflect recent changes.

Checklist:

- [ ] Was this PR discussed in an issue? It is recommended to first discuss a new feature in a GitHub issue before opening a PR.
- [ ] Add tests to cover the fixed bug(s) or the newly introduced feature(s) (if appropriate).
- [ ] Update the API documentation if a new function is added, or an existing one is deleted.

  • Write concise and explanatory changelogs above.
  • If possible, assign one of the following labels to the PR: feature, fix, chore, documentation or test (or ask a maintainer to do it for you).

When uploading the dataset, the dataset slug has to be used to get the correct path to the R2 bucket. Otherwise, the Hub returns a 404.
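To illustrate why the name and the slug can produce different paths, here is a minimal sketch. The `slugify` and `dataset_prefix` helpers and the `owner/slug` layout are hypothetical and written only for this example. The exact slug rules used by the Hub and by `PolarisFileSystem` are not shown in this PR and may differ.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Hypothetical slug rule: ASCII-only, lowercase, hyphen-separated."""
    # Drop accents and any other non-ASCII characters.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def dataset_prefix(owner: str, dataset_name: str) -> str:
    """Assumed path layout, for illustration only: '<owner>/<dataset-slug>'."""
    return f"{owner}/{slugify(dataset_name)}"


# The raw name contains spaces and punctuation; the slug does not.
raw_name = "My Zarr Dataset (v2)"
print(dataset_prefix("polaris", raw_name))
```

If the store were keyed by the slug, a request built from the raw name would point at a path that does not exist, which matches the 404 described above.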

@zhu0619 zhu0619 requested a review from cwognum as a code owner July 21, 2024 08:05
@zhu0619 zhu0619 requested a review from Andrewq11 July 21, 2024 08:09
@cwognum cwognum added the fix label (Annotates any PR that fixes bugs) Jul 21, 2024
Collaborator

@cwognum cwognum left a comment

Thanks @zhu0619! Just some minor suggestions.

@cwognum cwognum changed the title from "Upload zarr dataset to correct zarr dataset path" to "Fix: PolarisFileSystem uses the dataset name instead of the dataset slug, causing a 404" Jul 21, 2024
@zhu0619 zhu0619 merged commit 4888a9b into main Jul 21, 2024
4 checks passed
@cwognum
Collaborator

cwognum commented Jul 21, 2024

Hey @zhu0619! Seems I'm too late, but one more thought came to mind when I saw you push the latest changes: do we need to sluggify the owner too? I don't think so, because even though we type-hint dataset_owner as a str, I believe it's actually a HubOwner object, which returns its slug as its string representation.

Maybe we should just drop the dataset_name and dataset_owner and use the artifact_id instead. It's only used for the prefix anyways. We can leave things as is for now!
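A rough sketch of the behavior described above, assuming an owner object whose string representation is its slug. `OwnerSketch` and `build_prefix` are hypothetical stand-ins written for this example. They are not the actual `HubOwner` class or the `PolarisFileSystem` code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnerSketch:
    """Hypothetical stand-in for an owner object that knows its own slug."""

    slug: str

    def __str__(self) -> str:
        # The string representation is the slug itself.
        return self.slug


def build_prefix(dataset_owner: str, dataset_slug: str) -> str:
    # dataset_owner is type-hinted as str, but any object with a suitable
    # __str__ is formatted the same way inside an f-string.
    return f"{dataset_owner}/{dataset_slug}"


# Passing the object or its slug string yields the same prefix.
print(build_prefix(OwnerSketch("polaris"), "my-dataset"))
print(build_prefix("polaris", "my-dataset"))
```

Under that assumption the owner would not need a separate sluggify step, since formatting it into the prefix already produces the slug.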

@cwognum cwognum deleted the feat/zarr_upload branch July 21, 2024 19:42
Labels
fix Annotates any PR that fixes bugs
Development

Successfully merging this pull request may close these issues.

polaris.utils.errors.PolarisHubError: Error opening Zarr store at dataset upload
2 participants