I769 snippets #2329

Merged
5 commits merged into main from i769-snippets
Sep 19, 2024

@kirkkwang (Collaborator) commented Sep 18, 2024

Story

🎁 Implement all_text searching in Valkyrie for PDF

54bf5d7

This commit introduces Hyku::Indexers::FileSetIndexer, which adds
indexing logic for born-digital PDFs when using PDF.js. We also change
the works' indexing field to match the file sets' indexing field
(all_text_tsimv). We also "valkyrized" the logic in the HykuIndexing
module to accomplish this.

Ref:
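
For readers unfamiliar with the pattern, here is a minimal plain-Ruby sketch of an indexer that adds extracted text to a Solr document under all_text_tsimv. It is not the code from this PR: SketchFileSet, SketchBaseIndexer, SketchFileSetIndexer, and the text_layer attribute are hypothetical stand-ins, and the real indexer builds on Hyrax and Valkyrie classes that are not reproduced here. Only the field name all_text_tsimv and the method name extract_full_text come from the PR.

```ruby
# Hypothetical stand-in for a file set; NOT the Hyrax/Valkyrie resource class.
SketchFileSet = Struct.new(:id, :mime_type, :text_layer, keyword_init: true)

# Hypothetical base indexer that only indexes the id.
class SketchBaseIndexer
  attr_reader :resource

  def initialize(resource:)
    @resource = resource
  end

  def to_solr
    { 'id' => resource.id }
  end
end

# Adds the PDF's text to the Solr document under all_text_tsimv,
# the field name mentioned in the PR description.
class SketchFileSetIndexer < SketchBaseIndexer
  def to_solr
    doc = super
    text = extract_full_text
    doc['all_text_tsimv'] = [text] unless text.empty?
    doc
  end

  private

  # Assumption: the text layer is already available as a string.
  # Whitespace is collapsed so the indexed value is a single clean line.
  def extract_full_text
    return '' unless resource.mime_type == 'application/pdf'

    resource.text_layer.to_s.gsub(/\s+/, ' ').strip
  end
end
```

Calling SketchFileSetIndexer.new(resource: file_set).to_solr returns a hash that includes all_text_tsimv only when the file set is a PDF with a non-empty text layer.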

🎁 Add logic for snippets when splitting PDFs

b666d84

This commit adds logic for showing search snippets for PDFs that were
split through IIIF Print.
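
To illustrate what a snippet over split pages can look like, here is a plain-Ruby sketch. It is an assumption-heavy illustration, not the PR's implementation (which likely relies on the search index rather than string matching): snippet_for, first_page_snippet, the radius parameter, and the <em> markup are all hypothetical. Each element of pages stands for the text of one page produced by splitting a PDF.

```ruby
# Hypothetical helper: returns a fragment of text around the first match of
# query, with the match wrapped in <em>. With a blank query it returns the
# start of the text; with no match it returns nil.
def snippet_for(text, query, radius: 40)
  text = text.to_s
  return text[0, radius * 2] if query.to_s.strip.empty?

  index = text.downcase.index(query.downcase)
  return nil if index.nil?

  start = [index - radius, 0].max
  finish = [index + query.length + radius, text.length].min
  fragment = text[start...finish]
  fragment.sub(/#{Regexp.escape(query)}/i) { |match| "<em>#{match}</em>" }
end

# Hypothetical helper: walks the page texts of a split PDF and returns the
# first page (1-based) that yields a snippet, or nil if none does.
def first_page_snippet(pages, query)
  pages.each_with_index do |page_text, i|
    snippet = snippet_for(page_text, query)
    return { page: i + 1, snippet: snippet } if snippet
  end
  nil
end
```

For example, first_page_snippet(['The quick brown fox', 'jumps over the lazy dog'], 'lazy') finds the match on the second page, while a blank query falls back to the opening text of the first page.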

✅ Add test for file set indexer logic

9aa21fc

This commit adds a simple test for the FileSetIndexer logic to check
that text extraction from a born-digital PDF works as expected.

Screenshots / Video

With a query term

image

Without a query term

image

Ref:
- scientist-softserv/adventist_knapsack#769
@kirkkwang added the minor-ver for release notes label Sep 18, 2024
ShanaLMoore previously approved these changes Sep 18, 2024
app/indexers/concerns/hyku_indexing.rb (review thread outdated, resolved)

github-actions bot commented Sep 18, 2024

Test Results

3 files ±0, 3 suites ±0, duration 18m 49s (+42s)

        total       passed ✅    skipped 💤   failed ❌
tests   2 036 +2    1 986 +2    50 ±0       0 ±0
runs    2 063 +2    2 011 +2    52 ±0       0 ±0

Results for commit 617bc24. ± Comparison against base commit 6ad34e1.

This pull request removes 42 and adds 44 tests. Note that renamed tests count towards both.
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to destroy 14c09e36-5359-4e9d-b2c4-6d34a714cbb0
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to edit ba33f5ab-5c7e-4cbb-99bd-d11813d2a0d3
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to read beab8792-9e5d-4a37-bd0f-46f62a677130
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to update 805e209f-c1c8-444c-abee-3048ed49dc9e
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to destroy b9855cfd-9e3c-464b-8762-f12501cac3d5
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to edit 561a1c3e-6aea-43c9-bda0-6d4beec2e5c2
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to read 7942a063-0cf6-47d3-a445-290e7eee4c8a
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to update e21406d3-332c-4ce3-a04a-4385294e9a5b
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to destroy e77e79cb-7a65-435c-8fba-6baa930713dc
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to edit 09f061df-610c-4ad7-bb87-177dc27571d0
…
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to destroy 2064566a-a9ce-41d3-a544-8f8e672458dd
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to edit eb3c0cb2-afa0-4d05-bf96-a12d6ec1797a
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to read c2491c9c-7602-42a7-aec6-ef494d6ea915
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor Etd permissions is expected not to be able to update 291883d1-96c2-4471-94f3-fb8e95457df4
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to destroy 9046aed0-6dcc-46ed-b567-5fa343ae55a7
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to edit 63e74de6-3471-42e0-8727-755d886f687e
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to read 4a825a76-4353-4799-9ea7-8bb8962fe756
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor FileSet permissions is expected not to be able to update 29df9972-8576-4a73-9b46-19d84da4bc18
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to destroy 92dc1161-07d1-4bc6-80c0-0b49d38f6976
spec.abilities.work_ability_spec ‑ Hyrax::Ability::WorkAbility when work depositor GenericWork permissions is expected not to be able to edit 5e3ffd12-fdf3-47d5-907b-e3a8490bbc06
…

♻️ This comment has been updated with latest results.

This commit will rename the #full_text method to #extract_full_text
because it was causing weird issues with super.
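
The commit message does not say what the issue with super was. As one possible illustration only, the sketch below shows a common way a helper name can collide with an inherited method in Ruby: when a mixin defines a method with the same name as one the parent already uses, the parent's own code reached through super starts calling the mixin's version. SketchParentIndexer, CollidingHelper, and RenamedHelper are hypothetical names; only full_text and extract_full_text come from the PR.

```ruby
# Hypothetical parent whose to_solr relies on its own #full_text.
class SketchParentIndexer
  def to_solr
    { 'all_text_tsimv' => full_text }
  end

  def full_text
    'text from parent'
  end
end

# A mixin that reuses the name #full_text. Because the module sits before the
# parent in the ancestor chain, the parent's to_solr (reached via super) now
# calls this method instead of its own.
module CollidingHelper
  def to_solr
    super.merge('extra' => full_text)
  end

  def full_text
    'text from helper'
  end
end

# The same mixin with the helper renamed, so the parent's #full_text is untouched.
module RenamedHelper
  def to_solr
    super.merge('extra' => extract_full_text)
  end

  def extract_full_text
    'text from helper'
  end
end

class CollidingIndexer < SketchParentIndexer
  include CollidingHelper
end

class RenamedIndexer < SketchParentIndexer
  include RenamedHelper
end
```

With the colliding name, both keys end up holding the helper's text; with the renamed helper, all_text_tsimv keeps the parent's value and only the extra key uses the helper.
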
@ShanaLMoore merged commit dd59dbf into main Sep 19, 2024
8 checks passed
@ShanaLMoore deleted the i769-snippets branch September 19, 2024 16:40
3 participants