Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

assess sdr-client refactor to be more similar to our other client gems #261

Open
ndushay opened this issue Feb 15, 2023 · 14 comments
Open
Assignees

Comments

@ndushay
Copy link
Contributor

ndushay commented Feb 15, 2023

In doing research for the Folio Integration workcycle, I was chasing call stacks from the bottom up, and when I hit sdr-client I could't follow a large part of it.

The sdr-client gem, in a perfect world, could be refactored to be more like our other client gems (dor-services-client, preservation-client, folio-client ... whatever).

This ticket is to assess what it would take to do this refactoring and if we could do it in stages, or if it should be done all at once.

On consulting with @mjgiarlo, he said it would take approx 1 hour to do this assessment, and he would be delighted to do so after this week, when he is FR.

If it is just as much work to DO the refactor as to ASSESS the refactor, then ... please do it.

@justinlittman
Copy link
Contributor

For what it is worth, sdr client is nothing like our other clients in that it is not simply a thin layer over the API. It includes much more logic to (1) fill in the request with defaults / construct the request from incomplete information and (2) handle the complexities of the ActiveStorage interaction. This is very confusing and it is quite possible that this code is over-generalized / over-abstracted, but I think the underlying task is not simple.

It would be helpful to know more about the goals / intent of this ticket.

@ndushay
Copy link
Contributor Author

ndushay commented Feb 15, 2023

@justinlittman the intent is to make it understandable; I didn't understand the CLI stuff at all. Maybe that's a documentation problem. Maybe that's a "what are we doing with the sdr-api and why do we use it in some contexts and DSA in others" question, too. I think of myself as a reasonable "middle of the pack" canary with respect to "I can understand what this code is doing" ... and sdr-client made my head hurt.

from the README:

A primary design goal was for this to have as few dependencies as possible so that it can be easily distributed by gem install sdr-client and then it can be used as a CLI.

... where is it used for this? And should we keep supporting this?

@mjgiarlo
Copy link
Member

I wonder if this should be an issue or a dev planning topic or a story time or ... ?

@sul-dlss sul-dlss deleted a comment from ndushay Feb 15, 2023
@mjgiarlo
Copy link
Member

@ndushay 💬

@justinlittman the intent is to make it understandable; I didn't understand the CLI stuff at all. Maybe that's a documentation problem.

There is CLI documentation in the README: https://github.com/sul-dlss/sdr-client/#usage Maybe this misses the mark, though.

A primary design goal was for this to have as few dependencies as possible so that it can be easily distributed by gem install sdr-client and then it can be used as a CLI.

... where is it used for this? And should we keep supporting this?

We developed it so that we'd be prepared to meet the needs of campus users wanting direct SDR access (BLN & OpenNeuro were both suggested as potential partners). But the CLI's usage has been limited. I used it for a couple remediations---Catalhoyuk was the big one---and perhaps Andrew or Arcadia have used it as well?

FWIW, I do find sdr-client hard to reason about but the CLI isn't the part that I find most difficult. It's the fact that the client exposes all of its guts and expects the consumer to stitch together an interaction that makes sense. I do believe it was developed this way intentionally. But I also think it might be worth the time and effort to pause and do some analysis on how sdr-client is being used in hopes of perhaps improving the structure and maintainability of the code. Is that worth work cycle time now? I don't know.

@mjgiarlo
Copy link
Member

mjgiarlo commented Feb 15, 2023

Until we sort this out, I'm unassigning myself and backlogging this.

@mjgiarlo mjgiarlo removed their assignment Feb 15, 2023
@andrewjbtw
Copy link

I use it fairly regularly, but mostly for getting Cocina. The client started out with pretty limited options that made it only marginally useful for testing and not useful for anything I did with production data. Even though the intent is for external users to make use of it, the message I got in workcycles and in planning for workcycles was that it wasn't a priority to develop it out that way. It may cover more of my needs now but I haven't had a chance to really put it to use. I have at least one issue to file related to my recent attempts to use it for test data because it still doesn't appear to fully handle access rights when depositing.

I used the Fedora 3 API a lot when we were on Fedora 3 and likely would use the SDR API for similar tasks. But for various reasons, including system architecture that isn't related to the API or client itself, it hasn't worked out that way yet.

@ndushay ndushay changed the title assess sdr-client refactor to be more similar to our other sdr client gems assess sdr-client refactor to be more similar to our other client gems Mar 8, 2023
@mjgiarlo mjgiarlo moved this to In Progress (Not Ordered) in Infrastructure Portfolio Production Priorities Jan 12, 2024
@mjgiarlo mjgiarlo self-assigned this Jan 12, 2024
mjgiarlo added a commit that referenced this issue Jan 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
@mjgiarlo
Copy link
Member

Analysis for #326

Current usage (by consumer)

google-books

  • SdrClient::BackgroundJobResults.show (with a job ID)
  • SdrClient::Deposit.model_run (with a request DRO and files)
  • SdrClient::Login.run (with a URL and username/password)

infrastructure-integration-tests

  • SdrClient::BackgroundJobResults.show
  • SdrClient::Credentials.write
  • SdrClient::Deposit.run (with apo, collection, type, strategies, files, path)
  • SdrClient::Deposit::ImageFileSetStrategy
  • SdrClient::Deposit::MatchingFileGroupingStrategy
  • SdrClient::Deposit::SingleFileGroupingStrategy

argo

  • SdrClient::BackgroundJobResults.show
  • SdrClient::Connection.new (replace with proxy authN method)
  • SdrClient::Credentials.read (should no longer be needed since new authN method + no passing connections around)
  • SdrClient::Deposit::CreateResource.run (with accession bool, metadata as request DRO)
  • SdrClient::Deposit::FileMetadataBuilderOperations::MD5.for (with filepath)
  • SdrClient::Deposit::FileMetadataBuilderOperations::SHA1.for (with filepath)
  • SdrClient::Deposit::Files::DirectUploadRequest.new (with checksum, bytesize, content type, filename)
  • SdrClient::Deposit::UpdateDroWithFileIdentifiers.update (with request DRO and output from UploadFiles.upload)
  • SdrClient::Deposit::UploadFiles.upload (with file metadata, filepath map)
  • SdrClient::Login.run

happy-heron

  • SdrClient::Connection.new
  • SdrClient::Deposit::CreateResource.run (with same as above + assign_doi bool)
  • SdrClient::Deposit::Files::DirectUploadRequest.new
  • SdrClient::Deposit::UpdateDroWithFileIdentifiers.update
  • SdrClient::Deposit::UpdateResource.run (with request DRO and version description)
  • SdrClient::Deposit::UploadFiles.upload
  • SdrClient::Find.run (with druid)
  • SdrClient::Find::Failed
  • SdrClient::Login.run

sdr-client CLI

TO-DO

Unique list of current uses (based on the above)

  • Ones that act on a request DRO
    • SdrClient::Deposit.model_run (with a request DRO and files)... just wraps UploadFilesMetadataBuilder, UploadFiles, UpdateDroWithFileIdentifiers, and CreateResource
    • SdrClient::Deposit::CreateResource.run (with accession bool, metadata as request DRO, + assign_doi bool optionally)
    • SdrClient::Deposit::UpdateDroWithFileIdentifiers.update (with request DRO and output from UploadFiles.upload)
    • SdrClient::Deposit::UpdateResource.run (with request DRO and version description)
  • Ones that hinge on one or more files
    • SdrClient::Deposit.run (with apo, collection, type, strategies, files, path)
    • SdrClient::Deposit::Files::DirectUploadRequest.new (with checksum, bytesize, content type, filename)
    • SdrClient::Deposit::UploadFiles.upload (with file metadata, filepath map)
  • SdrClient::Find.run (with druid)
  • SdrClient::Login.run (with a URL and username/password)
  • Classes that are otherwise referenced
    • SdrClient::Find::Failed
    • SdrClient::Deposit::ImageFileSetStrategy
    • SdrClient::Deposit::MatchingFileGroupingStrategy
    • SdrClient::Deposit::SingleFileGroupingStrategy

Ideas for new interface

  • Put the new interface behind a new module: SdrClient::RedesignedClient
  • Use singleton architecture
    • This allows us to configure the client once with credentials, timeout values, a logger and have the whole client have access to central configuration, which will improve the current architecture which relies more on passing values along as params over and over
    • This is what we do with most/all other client gems in the infra portfolio
  • Replace SdrClient::Login.run with SdrClient::RedesignedClient.configure that takes a URL and either a username/password pair or a token
    • This is what we do with most/all other client gems in the infra portfolio
  • Allow client to refresh its token automagically if authN has expired
    • This is what we do with most/all other client gems in the infra portfolio and helps cut down on having to solve this at every consumer independently
  • Wire up smarter background job results handling
    • This is what we do with folio_client and dor-services-client already and helps cut down on having to navigate background job result interactions at every consumer independently
  • Replace Connection.new jazz with new method for proxy authN: given a sunetid, return a token
    • Why should consumers need to futz around with a connection object?
  • Replace file metadata builder operation methods with top-level methods that don't leak impl details (md5/sha1 modules 4-5 levels deep)
  • Put SdrClient::Find.run behind a top-level method
  • Stop exposing SdrClient::Deposit::UploadFiles.upload, SdrClient::Deposit::UpdateDroWithFileIdentifiers.update, and SdrClient::CreateResource.run (replace with SdrClient::RedesignedClient.deposit_model)
    • I believe all consumers are using these methods in sequence, which implies we're asking consumers to navigate implementation details that may not be necessary.
  • Replace SdrClient::Deposit::CreateResource.run with SdrClient::RedesignedClient.deposit_model(model:, accession: true|false, assign_doi: true|false, priority: low|default, files: []) that wraps the above sequence of interactions
  • Replace SdrClient::Deposit.model_run and SdrClient::Deposit::CreateResource with SdrClient::RedesignedClient.deposit_model(model:, files:, accession:, assign_doi:, priority:)
  • Replace SdrClient::Deposit.run with SdrClient::RedesignedClient.deposit_metadata(apo:, collection:, type:, strategies:, files:, path:)
  • Replace SdrClient::Deposit::UpdateResource.run with SdrClient::RedesignedClient.update_model(model:, version_description:)
  • Stop expecting clients to supply SdrClient::Deposit::Files::DirectUploadRequest instances, and look into using hashes instead

@ndushay
Copy link
Contributor Author

ndushay commented Jan 16, 2024

Can we shorten the new module name to, maybe SdrClient::Singleton or SdrClient2024?

Or, consider making a new gem and retiring the old. SdrApiClient ?

I like all your other suggestions.

@mjgiarlo
Copy link
Member

mjgiarlo commented Jan 16, 2024

@ndushay Happy to take suggestions on the new module name. I think if the team is on-board with the changes above, we ideally wouldn't be using the new namespace for very long: I was thinking we'd "promote" the new namespace back up in SdrClient wiping out all of the old code once all consumers have been moved to use the new interface. So the name only needs to be as good as something we'll ideally be seeing for a short while.

For the cutover process, I was thinking this would be the least labor-intensive (doesn't require Gemfile changes, or gem publishing shenanigans, unlike putting this work in a new gem).

Thanks for taking a look and dropping your thoughts on the issue!

@ndushay
Copy link
Contributor Author

ndushay commented Jan 17, 2024

I guess I think of touching all the calls in all the consumers twice as being more work than creating a new gem and touching them all once.

@ndushay
Copy link
Contributor Author

ndushay commented Jan 17, 2024

next step: discuss at dev planning?

@mjgiarlo
Copy link
Member

@ndushay 💬

I guess I think of touching all the calls in all the consumers twice as being more work than creating a new gem and touching them all once.

Some additional context to consider:

  1. The first touch is the costliest, which is when we move consumers to the new interface. The second touch is four small PRs (mechanical editor search-and-replace changes), and should require many fewer lines of code to be changed since we're shrinking the public interface of sdr-client.
  2. "All the calls" isn't that many calls, fortunately! And that number is even smaller after the first change.
  3. You might be underestimating the costs around creating a new gem and GitHub repo, most of which is more labor-intensive than an editor search-and-replace.

mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
mjgiarlo added a commit that referenced this issue Mar 12, 2024
Fixes #261

TODO: write a better commit message later when this is more fully baked.
@github-project-automation github-project-automation bot moved this from In Progress (Not Ordered) to Resolved in Infrastructure Portfolio Production Priorities Apr 22, 2024
@mjgiarlo mjgiarlo reopened this Apr 22, 2024
@github-project-automation github-project-automation bot moved this from Resolved to In Progress (Not Ordered) in Infrastructure Portfolio Production Priorities Apr 22, 2024
mjgiarlo added a commit to sul-dlss/infrastructure-integration-test that referenced this issue Apr 25, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Apr 27, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Apr 30, 2024
@lwrubel lwrubel moved this from In Progress (Not Ordered) to Resolved in Infrastructure Portfolio Production Priorities May 8, 2024
@lwrubel lwrubel moved this from Resolved to In Progress (Not Ordered) in Infrastructure Portfolio Production Priorities May 8, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Jun 12, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Jun 12, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 19, 2024
mjgiarlo added a commit to sul-dlss/happy-heron that referenced this issue Aug 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: In Progress (Not Ordered)
Development

No branches or pull requests

4 participants