Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Universal Viewer stops working after 2.8.0 --> 2.9.0 upgrade #4572

Closed
gwiedeman opened this issue Nov 6, 2020 · 9 comments
Closed

Universal Viewer stops working after 2.8.0 --> 2.9.0 upgrade #4572

gwiedeman opened this issue Nov 6, 2020 · 9 comments
Labels
Milestone

Comments

@gwiedeman
Copy link

Descriptive summary

  • Universal Viewer appears to work just fine in Hyrax 2.8.0.

  • Changing the Hyrax version in the Gemfile to 2.9.0 and running bundle update hyrax causes Universal Viewer to
    hyraxIssue

  • In looking at Troubleshooting RIIIF, it seems like iiif_info_url_builder is returning the wrong file ID and Fedora is returning an empty text file instead of the file object.

2.6.5 :015 > Riiif::Engine.routes.url_helpers.info_url(fs.files.first.id, host: "http://localhost:3000")
 => "http://localhost:3000/images/2f75rg51j%2Ffiles%2Fd1420cd8-a605-4ecd-8197-9e0f0862df5f/info.json"
  • This does return a manifest, but iiif_image_url_builder then returns an incorrect URL for the file:
2.6.5 :016 > Riiif::Engine.routes.url_helpers.image_url(fs.files.first.id, host: "http://localhost:3000", size: "100,")
 => "http://localhost:3000/images/2f75rg51j%2Ffiles%2Fd1420cd8-a605-4ecd-8197-9e0f0862df5f/full/100,/0/default.jpg"
  • This URL returns a 500.

  • From navigating the working University Viewer in 2.8.0, it looks like the correct file ID is 831e33b8-88c8-4922-af03-3cfa96c1f2af, not d1420cd8-a605-4ecd-8197-9e0f0862df5f. Changing that ID in the URL returned by iiif_info_url_builder also returns a manifest, and changing it in iiif_image_url_builder correctly returns the file.

  • So it seems that the 2.9.0 upgrade is correctly using @id from the manifest when the wrong @id in 2.8.0 did not matter.

  • It looks like iiif_info_url_builder is relying on fs.files.first.id to return the ID. This returns a Fedora URI with the wrong ID, which serves an empty .txt file. fs.files.last.id Returns the correct ID and Fedora serves the file. So it seems like all of our FileSets have a bad first file and I'm not sure if this is something that is supposed to happen or if this is something that happened in a previous Hyrax version, or if this is totally a local issue and we should just delete all these.

Rationale

2.8.0 to 2.9.0 should be a non-breaking update from what I can tell in the release notes.

Expected behavior

Universal Viewer displays the image.

Actual behavior

Steps to reproduce the behavior

  1. Changing the Hyrax version in the Gemfile to 2.9.0
  2. bundle update hyrax
  3. Restart

Related work

Troubleshooting RIIIF was helpful.

@no-reply no-reply added this to the 2.x series milestone Dec 12, 2020
@no-reply no-reply added the bug label Dec 12, 2020
@no-reply
Copy link
Contributor

no-reply commented Dec 12, 2020

thanks @gwiedeman!

it seems like there are a couple of issues at play here. as i dig into this properly, here's some preliminary information:

Hyrax 2.9.0 includes a complete rewrite of the IIIF Manifest generation process. this rewrite is a substantial performance improvement on manifest generation, especially for Works with many FileSets. all the code that powered the old behavior is still in the release, and it can be reinstated by adding the following to controllers including Hyrax::WorksControllerBehavior:

def iiif_manifest_presenter
  presenter
end

one known change in the generation process is that we no longer use the Hyrax request context to determine the protocol to be used by the IIIF server. in practice what it means is that, for apps using Riiif and HTTPS, the protocol for Riiif URLs (as generated by Riiif::Engine.routes.url_helpers.image_url via Hyrax.config.iiif_image_url_builder) is dependent on Riiif::Engine configuration, not the protocol you're using to get the manifest. in theory, this ought to be an improvement, but it seems to be causing issues for some app deployments (notably https://nurax-dev.curationexperts.com). see, for example:

http://nurax-dev.curationexperts.com/images/p5547r38s%2Ffiles%2F358aab77-5df3-4224-82e3-47e01b8add99%2Ffcr:versions%2Fversion1/full/600,/0/default.jpg
vs
https://nurax-dev.curationexperts.com/images/p5547r38s%2Ffiles%2F358aab77-5df3-4224-82e3-47e01b8add99%2Ffcr:versions%2Fversion1/full/600,/0/default.jpg

if your Riiif is sitting behind a server enforcing SSL, putting this in your initializer may fix your problem Riiif::Engine.routes.default_url_options = { protocol: 'https' }

in retrospect, this change may have been a mistake (or maybe is too aggressive a behavioral change for a minor release). i'm considering ways to roll it back without losing the performance improvements referenced above. feedback would be valued.

@no-reply
Copy link
Contributor

@gwiedeman your issue looks (at my best first guess) like it is due to neither of the above issues.

the file id for a given file_set shouldn't be calling #first as in file_set.files.first.id, instead, i think we should be looking for mismatches between the index and Fedora. for a FileSet that has this problem, can you check FileSet.find(id).original_file vs the indexed original_file_id in your solr instance?

i think in addition to the errant/extra text file (no idea where this might have come from, can you point me to your code base?), these objects must have a weird index state. the new manifest generation relies much more heavily on the index to understand which files to render.

that's the best i've got.

@no-reply
Copy link
Contributor

after #4663, nurax-dev.curationexperts.com has working IIIF via UV again. i'm not closing this out at this point, since @gwiedeman's issue is still undiagnosed, but i think it's no longer a blocker for 3.0.0.

@gwiedeman
Copy link
Author

Thanks for looking at this @no-reply !

Our codebase is here if that helps: https://github.com/UAlbanyArchives/hyrax-UAlbany

Its very possible this is an indexing issue. We also just updated from Solr 7 to 8.6 recently before updating to 2.9.0. fs.original_file returns the correct ID, and this is the same as the original_file_id shown in the console after fs.to_solr. However, when I query the FileSet in the Solr console I do not see an original_file_id key in the document as I would expect. This is the response:

"response":{"numFound":1,"start":0,"maxScore":16.77106,"numFoundExact":true,"docs":[
      {
        "system_create_dtsi":"2019-02-02T00:28:42Z",
        "system_modified_dtsi":"2019-02-02T00:28:54Z",
        "has_model_ssim":["FileSet"],
        "id":"2f75rg51j",
        "accessControl_ssim":["Not sure if this needs to be omitted"],
        "depositor_ssim":["importer@albany.edu"],
        "depositor_tesim":["importer@albany.edu"],
        "title_tesim":["nam_ua_390_pm_026037.jpg"],
        "date_uploaded_dtsi":"2019-02-02T00:28:42Z",
        "date_modified_dtsi":"2019-02-02T00:28:42Z",
        "creator_tesim":["importer@albany.edu"],
        "thumbnail_path_ss":"/downloads/2f75rg51j?file=thumbnail",
        "hasRelatedMediaFragment_ssim":["2f75rg51j"],
        "hasRelatedImage_ssim":["2f75rg51j"],
        "label_tesim":["nam_ua_390_pm_026037.jpg"],
        "label_ssi":"nam_ua_390_pm_026037.jpg",
        "file_format_tesim":["jpeg (JPEG File Interchange Format)"],
        "file_size_lts":19517703,
        "height_is":4840,
        "width_is":6586,
        "visibility_ssi":"open",
        "mime_type_ssi":"image/jpeg",
        "digest_ssim":["urn:sha1:bfe876228776081f1bf7c9c3d0b3c481e71863c5"],
        "original_checksum_tesim":["04400b96f8c5d5fc7c00e4688a7e226d"],
        "read_access_group_ssim":["public"],
        "edit_access_group_ssim":["content-admin"],
        "human_readable_type_tesim":["File"],
        "_version_":1624314830176387072,
        "timestamp":"2019-02-02T00:28:55.297Z",
        "score":16.77106}]
  },

It is weird that fs.files returns two Hydra::PCDM::File objects when there is only one file in the fileset? The other being the empty text file. Is possible that this is bad local data.

I don't suspect its an https issue, as I get the same behavior on our dev server without using ssl, but I'll look into this more.

Also, as for how aggressive to make changes for a minor release, I think its okay. It think its expected that you should test even minor releases and that's how we found the issue. We're really excited about the many FileSets performance upgrades and the barriers should be as low as possible for these types of improvements.

@no-reply
Copy link
Contributor

thanks for the update @gwiedeman.

i agree that it's weird you're ending up with multiple files, and i'd like to dig into that more deeply with you.

focusing briefly on the index issue, i wonder if you could try fs.update_index fixes the issue for you? if to_solr has the original file, then i'd really expect it to be in the solr document, and it being missing feels like it's certainly at least part of the problem here.

on the other hand, we should be falling through to a Fedora lookup on #original_file in the case that the indexed original_file_id isn't present. do you have entries in your log like "original_file_id for [some id] not found, falling back to Fedora."?

@gwiedeman
Copy link
Author

Ah, I forgot that .to_solr doesn't actually update the index. fs.update_index does add a original_file_id_ssi key in the index:

"original_file_id_ssi":"2f75rg51j/files/831e33b8-88c8-4922-af03-3cfa96c1f2af/fcr:versions/version1",

The ID in there is correct, and appending it to the Fedora URL does return a file from Fedora, but Universal Viewer still shows the same error in 2.9.0.

As for the rails logs, I do get that warning when loading an object that I didn't .update_index for, but the warning stops appearing after fs.update_index which seems to be as expected:

W, [2020-12-16T15:41:04.347069 #8597]  WARN -- : [37ca1658-3b7e-4bdd-b61c-d959d5cad771] original_file_id for nc580t73h not found, falling back to Fedora.

Apart from that, I do get this when I load the FileSet page, but that's it:

E, [2020-12-16T15:24:24.995426 #4954] ERROR -- : [3292abc1-487e-49bc-9b17-157ed1ca372e] unable to fetch event: FileSet:2f75rg51j:event

Even after turning up the log level

@gwiedeman
Copy link
Author

I finally sorted this out when I noticed I was getting a CORS error on our dev server because the object on my.dev.server:3000/concern/images/6t053p39d was requesting the manifest at my.dev.server/images/2f75rg51j%2Ffiles%2F831e33b8-88c8-4922-af03-3cfa96c1f2af/info.json (without the port number).

This was odd as I was getting the same result on the production server, but that issue was due to SSL enforcement, so @no-reply 's suggestion of adding Riiif::Engine.routes.default_url_options = { protocol: 'https' } to config/initalizers/riiif.rb resolved it.

I'll leave this open, as I'm not sure if its worth trying to get the manifest to use rails_root or whatever to use the port number. I would probably recommend adding the riiif initalizer option to the 2.9.0 release notes, but I still think its totally fair for a minor release.

Thanks @no-reply for spending some time on this!

@no-reply
Copy link
Contributor

thanks for the follow-up @gwiedeman.

based on this update, i think the patch in #4663 ought to fix this without need for the default_url_options override. i'll propose a 2.9.2 release later today which should prevent other folks from running into issues.

once that's out, we should be able to close this out.

@no-reply
Copy link
Contributor

no-reply commented Jan 4, 2021

closing this as fixed by the release of 2.9.2

@no-reply no-reply closed this as completed Jan 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants