Expose path minting as a microservice #639

Closed
dannylamb opened this issue May 23, 2017 · 10 comments

Comments

@dannylamb
Contributor

dannylamb commented May 23, 2017

Blocks: #640

Create a small microservice that mints paths for new resources in Fedora, so they can be created with PUT. This will allow us to avoid using POST to create resources and rely only on idempotent methods in islandora-indexing-fcrepo. It will also let power users control their repository structure by swapping in their own path minting strategy. Default behaviour should be that of the reference implementation's default path minter.
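
A minimal sketch of the swappable-strategy idea, in Python for illustration only. The names `PathMinter`, `pairtree_path`, `default_minter`, and `mint_path` are hypothetical, and the default shown (a random UUID nested under four two-character segments) is an assumption about what the reference implementation does, not a confirmed description of it.

```python
import uuid
from typing import Callable

# A path minter is any callable that returns a repository-relative path.
PathMinter = Callable[[], str]


def pairtree_path(identifier: str, segment_length: int = 2, segment_count: int = 4) -> str:
    """Nest an identifier under short prefix segments, e.g. 'de/ad/be/ef/<identifier>'."""
    segments = [
        identifier[i * segment_length:(i + 1) * segment_length]
        for i in range(segment_count)
    ]
    return "/".join(segments + [identifier])


def default_minter() -> str:
    """Assumed default: a random UUID nested under four two-character segments."""
    return pairtree_path(str(uuid.uuid4()))


def mint_path(minter: PathMinter = default_minter) -> str:
    """Hypothetical service entry point; pass another callable to swap the strategy."""
    return minter()
```

A caller would then PUT the new resource at the base URL plus the returned path. A site with its own layout could call `mint_path(my_minter)` with any zero-argument callable that returns a path.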

@whikloj
Member

whikloj commented May 23, 2017

So are you thinking we mimic what Fedora does right now (i.e. a SHA-1 hash) and allow it to be replaced by whatever someone wants to write?

@dannylamb
Contributor Author

@whikloj Exactly

@whikloj
Member

whikloj commented May 23, 2017

@dannylamb cool 👍

I was looking in the Fedora source code and I'm not certain where it uses SHA-1. I'm actually wondering if it just uses a random UUID (via HierarchicalIdentifierSupplier) to generate the path, with no connection to the resource stored at that location.

This is just informative and interesting, but unrelated, since I think we should figure out a way to generate a consistent ID from a resource.

@DiegoPino
Contributor

@dannylamb @whikloj can we use the Drupal UUID for that?

@dannylamb
Contributor Author

@DiegoPino Use the default implementation of https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Component%21Uuid%21UuidInterface.php/interface/UuidInterface/8.2.x in Silex? Sure. I'm all about re-use.

@dannylamb
Contributor Author

@whikloj I wasn't aware Fedora minted the path based on the resource itself. I figured it was just random.

@whikloj
Member

whikloj commented May 24, 2017

@dannylamb it is random; however, we might be better off trying to make it reproducible. That way the object goes into Fedora at the same path no matter how many times you try to add it.

This could be a stretch goal, to avoid getting held back.

@DiegoPino
Contributor

DiegoPino commented May 24, 2017

I probably did not get this right so will try to explain myself again.
If we want to make the path reproducible, then we need some value in the call to this PID minter that stays constant. It can't be just a bare call. Hashing the RDF itself is complex, because every serialisation can order things differently and even namespacing can get in the way. So in my previous post I was proposing that we pass the Drupal Content Entity UUID (the one that Drupal generates) to the microservice and either use it directly (easier) or use it as the name input for a UUIDv5, which is a predictable, non-random UUID.

Maybe the use case for this is not Drupal-related at all, and in that case I got it wrong.
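
To make the two options above concrete, here is a small Python sketch. The function name `mint_from_drupal_uuid`, the `derive` flag, and the `MINTER_NAMESPACE` value are all hypothetical; UUIDv5 needs a fixed namespace UUID, and the one used here is only a placeholder.

```python
import uuid

# Hypothetical namespace for this sketch; a real deployment would choose
# its own and keep it fixed, since changing it changes every derived ID.
MINTER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/path-minter")


def mint_from_drupal_uuid(drupal_uuid: str, derive: bool = True) -> str:
    """Return a reproducible identifier for a Drupal entity UUID."""
    canonical = str(uuid.UUID(drupal_uuid))  # validates and lower-cases the input
    if not derive:
        # Option 1: reuse the Drupal UUID directly.
        return canonical
    # Option 2: derive a UUIDv5, which is a SHA-1 based hash of namespace + name.
    return str(uuid.uuid5(MINTER_NAMESPACE, canonical))
```

Either way, the same Drupal UUID always yields the same identifier, so repeated requests for the same entity resolve to the same path.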

@dannylamb
Contributor Author

dannylamb commented May 25, 2017

Ok, I gotcha. Yeah, so I'm in agreement all around here. Making it consistently generate a path is going to save a lot of headaches, and we can't hash the RDF to consistently get one. So using something from the resource makes sense here. If we're trying to mimic the reference implementation's default, then using the Drupal UUID is as good as any IMO.

So we'll need to stick that back into the RDF. From conversations at the CLAW 'un-conference' session, people seemed keen on putting a namespaced URN in dc:identifier. I'd prefer that to avoid minting a new predicate if possible. Are we ok with that? If there were more than one value we'd need to check for that namespace, but that should work for us.
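
The namespace check could look something like the following Python sketch. The prefix `urn:uuid:` and the function name `find_namespaced_identifier` are assumptions for illustration; the thread does not settle on a specific URN namespace.

```python
from typing import Iterable, Optional

# Hypothetical URN prefix; substitute whatever namespace is agreed on.
URN_PREFIX = "urn:uuid:"


def find_namespaced_identifier(
    identifiers: Iterable[str], prefix: str = URN_PREFIX
) -> Optional[str]:
    """Return the part after `prefix` from the first matching dc:identifier value."""
    for value in identifiers:
        # URN scheme and namespace identifier are case-insensitive.
        if value.lower().startswith(prefix):
            return value[len(prefix):]
    # No value carries the expected namespace.
    return None
```

Given several dc:identifier values, only the one carrying the expected prefix is used, and other identifiers on the resource are ignored.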

Also, once there's an implementation that doesn't need the nesting for performance reasons, I could also see us stealing the path portion of the resource URI as well (e.g. "media/2" or "sites/default/files/blah.jpeg") and using that. But it's not an option at the moment.
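
For reference, taking the path portion of a resource URI is a one-liner in most languages. A hedged Python sketch, with the hypothetical helper name `path_from_resource_uri` and `example.org` standing in for a real host:

```python
from urllib.parse import urlparse


def path_from_resource_uri(resource_uri: str) -> str:
    """Return the path portion of a URI without its leading slash."""
    return urlparse(resource_uri).path.lstrip("/")
```

For `http://example.org/media/2` this gives `media/2`, matching the first example above.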

@dannylamb
Contributor Author

Resolved via Islandora/Crayfish@3167ad1


3 participants