Plumb HTTPS backend through CLI and adjust it to use full storage prefixes #1597
@jhiemstrawisc How do I test this?
The config in the PR description should be functional. You can use it as the basis of a test if you're setting things up with yaml-based configuration. Otherwise, you should be able to start the origin based on the new CLI args alone.
So, I should create an https-based origin and see if it works? Just want to make sure that's the goal.
The two things to check are: […]
…fixes

This PR accomplishes two things: first, it plumbs a few of the https backend components through the CLI, allowing users to serve minimal https origins without writing a configuration yaml. Second, it adjusts the https backend to use storage prefixes. Consider the following configuration:
```
Origin:
  StorageType: "https"
  HttpServiceUrl: "https://data.lhncbc.nlm.nih.gov/public"
  Exports:
    - StoragePrefix: "/Tuberculosis-Chest-X-ray-Datasets/Montgomery-County-CXR-Set/MontgomerySet/CXR_png"
      FederationPrefix: "/my-prefix"
```
The adjustments cause a request for object `/my-prefix/MCUCXR_0005_0.png` to be converted to a libCurl request in the Origin for `https://data.lhncbc.nlm.nih.gov/public/Tuberculosis-Chest-X-ray-Datasets/Montgomery-County-CXR-Set/MontgomerySet/CXR_png/MCUCXR_0005_0.png`.

While we don't yet support multiple exports for the https backend, I think this configuration affords us maximum flexibility for a future where we do, by allowing us to set:
- the service URL (every request to the origin uses this as the base URL),
- a storage prefix (each namespace can then carve out some section of the data hosted by the service URL), and
- the typical federation prefix mapping (i.e. strip the federation prefix from the object and tack the remainder onto the end of the service URL + storage prefix).

While this is a slight change in how an http origin would be configured, I'm not worried about backwards compatibility because a) we never documented the limitations of this, even within our own codebase, and b) this makes the https backend conformant with the way we use storage prefixes in every other backend.
These started failing after changes to the https backend. After inspection, I'm surprised they'd been passing at all, and am not convinced they were actually testing what we wanted them to. At the very least, they were grabbing config from my system installation of Pelican (NAUGHTY), and this correctly isolates them.
bdb15f4 to 95cac30
With the latest commit, the three modes of configuration to check are: […]
Each of these configurations should allow you to get the public object […]
One thing to note is that I'm punting on the general cleanup of […]
@jhiemstrawisc When I test this with `export PELICAN_ORIGIN_STORAGEPREFIX=/Tuberculosis-Chest-X-ray-Datasets/Montgomery-County-CXR-Set/MontgomerySet/CXR_png/`, it is still failing due to the trailing `/`.
Closes #1279