Component URLs should separate the ead & ref slugs #1318

Closed
seanaery opened this issue Nov 11, 2022 · 5 comments · Fixed by #1478

@seanaery
Contributor

Summary

ArcLight currently creates URLs to individual components like this:
{host}/catalog/{ead_ssi}{ref_ssi}

So we get URLs like:
Collection: archives.institution.edu/catalog/slug <-- this is OK
Component: archives.institution.edu/catalog/slugaspace_123 <-- this is not ideal

Desired Change

This ticket is a request to improve the component URLs by using one of these conventions instead:
archives.institution.edu/catalog/slug/aspace_123
archives.institution.edu/catalog/slug_aspace_123
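
To make the three conventions concrete, here is a minimal sketch in plain Ruby. The helper `component_path` and its `separator` parameter are hypothetical names used only for illustration; they are not part of ArcLight.

```ruby
# Hypothetical helper, not part of ArcLight: builds a component path
# from an EAD slug and a ref value under each convention discussed here.
def component_path(ead_slug, ref, separator: "")
  "/catalog/#{ead_slug}#{separator}#{ref}"
end

# Current behavior: the two parts are concatenated with nothing between them.
current    = component_path("slug", "aspace_123")

# The two proposed alternatives: a slash or an underscore between the parts.
slashed    = component_path("slug", "aspace_123", separator: "/")
underscore = component_path("slug", "aspace_123", separator: "_")

puts current, slashed, underscore
```

With a separator in place, the EAD slug can be recovered from the path by a simple pattern, which is not reliably possible with the concatenated form.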

Reasons for concern over the smooshed together URL

  • Sometimes creates awkward unintended words (in practice, ead slugs often end in letters and ref values often begin with al or aspace)
  • Harms SEO, i.e., makes it harder to Google using the EAD slug
  • Can make web analytics reports on per-collection activity harder to parse
  • Makes it harder to build future pattern-based redirects from these URLs to new ones
  • Not very readable

Current ArcLight Usage

No separation in Component URL

These apps follow default core:

Has separation in Component URLs by wedging an underscore between the ead & ref slugs

These apps wedge an underscore between the ead & ref slugs (overriding core):

Other Considerations

@gwiedeman & @jrochkind astutely note on code4lib#arclight Slack that if a change is made to core, it's important to be able to redirect from the old style to keep existing URLs working. And even if core stays how it is, I recommend making the convention configurable or otherwise straightforward to override globally.
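
As a rough illustration of the redirect concern, the sketch below maps an old concatenated ID to the slash style. It assumes the application can supply the list of EAD slugs it knows about, because the old ID alone does not say where the slug ends. The helper name `new_style_path` is hypothetical and nothing here is ArcLight's actual code.

```ruby
# Hypothetical helper: maps an old concatenated ID to the slash style,
# given the EAD slugs known to the application (an assumption of this sketch).
def new_style_path(old_id, known_slugs)
  # Keep only slugs that prefix the ID and leave a non-empty ref behind,
  # then prefer the longest one so "sl" does not shadow "slug".
  slug = known_slugs
         .select { |s| old_id.start_with?(s) && old_id.length > s.length }
         .max_by(&:length)
  return nil unless slug

  "/catalog/#{slug}/#{old_id.delete_prefix(slug)}"
end

puts new_style_path("slugaspace_123", %w[sl slug])
```

A collection-level ID such as `slug` returns `nil` here, since it has no ref part and needs no redirect.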

@cbeer
Member

cbeer commented Nov 11, 2022

#1317 is an idea for a seam for downstream applications to decide what style they want 🤷‍♂️

@jrochkind
Member

jrochkind commented Nov 12, 2022

Harms SEO, i.e., makes it harder to Google using the EAD slug

It's a bit of an esoteric art and I'm not necessarily an expert, but I believe that /slug/aspace_123 will be a lot more reliable than /slug_aspace_123 with regard to that.

Can make web analytics reports on per-collection activity harder to parse

Depending on the tool you're using, I think same.

If you're going to do the work to make a change (globally or with a "seam" in an individual app), I predict you'll be a lot happier over the long term, and less likely to need to change it again, if you can go to /. Even if it might require messing with Rails routing a bit more.

@seanaery
Contributor Author

@marlo-longley We are definitely still interested in prioritizing this work. In lieu of having a configurable way to do this, here's what we ended up having to modify in our local 1.0.1-based app:

  • Override the to_field 'id' from ead2_component_config.rb to join the two parts with an underscore (commit). This takes care of most of the logic for a component URL, tbh, and wasn't too heavy a lift, but...
  • Fix the breadcrumb links to parent components (commit). This was more of a pain and required several local overrides, esp. stemming from the use of app/models/arclight/parent.rb which includes a method global_id that assumes the component ID is "#{eadid}#{id}"
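
The second point can be illustrated with a simplified stand-in. The `Parent` struct below is not ArcLight's actual class; it only mirrors the `"#{eadid}#{id}"` assumption described above. `UnderscoreParent` is a hypothetical local override of the kind that becomes necessary once the indexed ID uses an underscore.

```ruby
# Simplified stand-in for a parent record; not ArcLight's actual class.
Parent = Struct.new(:id, :eadid, keyword_init: true) do
  # Mirrors the assumption described above: no separator between the parts.
  def global_id
    "#{eadid}#{id}"
  end
end

# Hypothetical local override that joins the parts with an underscore,
# so breadcrumb links match the underscore-style IDs written at index time.
class UnderscoreParent < Parent
  def global_id
    [eadid, id].join("_")
  end
end

default_parent = Parent.new(id: "aspace_123", eadid: "slug")
local_parent   = UnderscoreParent.new(id: "aspace_123", eadid: "slug")

puts default_parent.global_id
puts local_parent.global_id
```

The sketch shows why the ID format ends up defined in two places: once where the ID is indexed and once where parent links are rebuilt.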

@marlo-longley
Contributor

marlo-longley commented Dec 11, 2023

Thank you @seanaery. In earlier comments it seemed like some people had a strong preference for splitting the URL into a new path instead of the underscore separation (so that'd be /slug/aspace_123 rather than /slug_aspace_123) -- that's mainly what I wanted to make sure to consider before moving on a solution.

Your code is really helpful and seems much more straightforward to achieve in terms of routing. I will try to work something out with this approach.

@gwiedeman
Contributor

We would want to continue the current URLs in our local instance just for persistence, but I think the default should probably be the slash (archives.institution.edu/catalog/slug/aspace_123). I think that is okay if it's config that you could just change in catalog_controller or similar and it's clearly stated in the release notes how to maintain older-style URLs.

marlo-longley added a commit that referenced this issue Dec 12, 2023
@marlo-longley marlo-longley moved this from In Progress to In Review in Arclight Community Sprint 3 - 2023 Dec 12, 2023
seanaery pushed a commit that referenced this issue Dec 13, 2023
seanaery added a commit that referenced this issue Dec 13, 2023
…red in one place. Advances #1318

- Capture the concatenated/formatted IDs at indexing time, in parent_ids_ssim array
- Use that data instead of global_id
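
The idea in that commit message can be sketched as follows. This is a rough illustration in plain Ruby, not the code from the commit: the helper name `parent_ids_field`, its arguments, and the underscore default are assumptions; only the field name `parent_ids_ssim` comes from the message above.

```ruby
# Hypothetical indexing-time helper: formats each ancestor's full ID once,
# so display code can read the stored values instead of rebuilding them.
def parent_ids_field(ead_slug, ancestor_refs, separator: "_")
  formatted = ancestor_refs.map { |ref| "#{ead_slug}#{separator}#{ref}" }
  { "parent_ids_ssim" => formatted }
end

doc = parent_ids_field("slug", %w[aspace_1 aspace_12])
puts doc.inspect
```

Storing the formatted IDs keeps the separator choice in one place, which appears to be the aim of the commit.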

5 participants