Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deleted primary bitstream still referenced in the bundle #9099

Closed
paulo-graca opened this issue Oct 2, 2023 · 6 comments · Fixed by #9106
Closed

Deleted primary bitstream still referenced in the bundle #9099

paulo-graca opened this issue Oct 2, 2023 · 6 comments · Fixed by #9106
Assignees
Labels
Milestone

Comments

@paulo-graca
Copy link
Contributor

Describe the bug
Primary Bitstreams are used for Access Status calculations, but when the that bitstream is deleted, the bundle primary bitstream reference isn't changed. So, Access Status calculations will return wrong status.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new item
  2. Upload several files into the item and fill the required data
  3. Deposit the item
  4. As an admin, Edit that Item and promote one bitstream to be Primary Bitstream (this will reference that bitstream at the bundle level)
  5. Please take note of that bitstream's UUID
  6. Delete that bitstream that we promote to primary
  7. Check the database (please consider your UUID here):
SELECT * FROM bundle WHERE primary_bitstream_id='814aefff-4874-4a97-a7aa-b654172e7a40'

Expected behavior
Apparently nothing changes, but the deleted bitstream is still referenced as that bundle primary bitstream. It's hoped that, when a bitstream is deleted, also that bundle's primary bitstream reference is also removed. This issue is also present in older DSpace versions (like 5.X).

@paulo-graca paulo-graca added bug needs triage New issue needs triage and/or scheduling labels Oct 2, 2023
@github-project-automation github-project-automation bot moved this to 🆕 Triage in DSpace Backlog Oct 2, 2023
@paulo-graca paulo-graca self-assigned this Oct 2, 2023
@tdonohue tdonohue added code task Code cleanup task medium priority labels Oct 2, 2023
@tdonohue
Copy link
Member

tdonohue commented Oct 2, 2023

This definitely sounds like a bug. Pulling this over to the 7.6.x maintenance board. Thanks for volunteering to look into it, @paulo-graca !

@abollini
Copy link
Member

abollini commented Oct 5, 2023

This would require a database change imho as we should have a foreign key in the bundle table to force the primary_bitstream_id column to reference an existing row in the bitstream table. Moreover we need a sql script to cleanup inconsistent data before to apply to constraint. This seems to force us to target this fix for DSpace 8.

If we want to fix it in the 7.x maintenance branch we should avoid the database change and perform this extra step in the dspace 8 fix, do you agree @tdonohue ?

@tdonohue
Copy link
Member

tdonohue commented Oct 5, 2023

@abollini : If we made this fix via a database change, then I agree that would need to wait for 8.0. We want to avoid DB changes in bug fix releases unless absolutely required.

However, I see that @paulo-graca has a draft PR that doesn't involve a database change. Instead, he just unsets the primary bitstream if deleted: #9106 That seems like it could qualify for a bug fix release, as it's fixing code behavior (and it seems to be a possible one line fix -- although it's unclear if the proposed fix works yet)

@paulo-graca
Copy link
Contributor Author

Sorry, I'm still working on it. I've had mini vacations last week (it was a National holiday here). I didn't had the time to make more progresses.
I would do that (I was planning to do that). We are also interested in cleaning this issue from current records. I will create a migration script/sql to fix old referenced deleted bitstreams and we will also enforce the relation with that constraint.

Aside: @abollini Initially when I start reading your comment I was thinking you were suggesting a different thing, like to handle primary bitstream in another way. And that make me think that we could have primary bitstreams at bundle2bitstream (boolean field) that could eliminate this issue, but it has some pros and cons and would require a lot more effort and that would definitely be marked as 8.0.

@toniprieto
Copy link
Contributor

I want to mention that this issue is related to this long-standing issue: #7348

The foreign key in the bundle table already exists but when a bitstream is deleted it is actually a logical delete (they are marked as deleted). The table row is actually deleted afterwards, using the cleanup script and this could produce the exception reported in #7348

Fixing this bug will surely fix #7348

@paulo-graca In case it helps, investigating #7348 I have seen that there are two functions to mark the bitstreams as deleted, one in the BundleService and another in the BitstreamService. The primary bitstream relation is cleared if the BundleService implementation is used but calling directly the BitstreamService implementation the primary bitstream relation is not cleared. It could be the problem, I think that the API call uses directly the BistreamService implementation. Could be a good idea to clear this relation also in the BitstreamService implementation.

https://github.com/DSpace/DSpace/blob/main/dspace-api/src/main/java/org/dspace/content/BundleServiceImpl.java#L214-L254

https://github.com/DSpace/DSpace/blob/main/dspace-api/src/main/java/org/dspace/content/BitstreamServiceImpl.java#L261-L287

https://github.com/DSpace/DSpace/blob/main/dspace-server-webapp/src/main/java/org/dspace/app/rest/repository/patch/operation/BitstreamRemoveOperation.java#L58

@paulo-graca
Copy link
Contributor Author

@toniprieto thank you for your feedback!

I can confirm that currently we already have that constrain:

ALTER TABLE "public"."bundle"
	ADD CONSTRAINT "bundle_primary_bitstream_id_fkey" FOREIGN KEY ( "primary_bitstream_id" )
	REFERENCES "public"."bitstream" ( "uuid" ) MATCH SIMPLE
	ON DELETE No Action
	ON UPDATE No Action;

@github-project-automation github-project-automation bot moved this from 🏗 In Progress to ✅ Done in DSpace 8.x and 7.6.x Maintenance Oct 31, 2023
alanorth added a commit that referenced this issue Oct 31, 2023
Clear primary bistream when it's deleted- Fix issue #9099
alanorth added a commit that referenced this issue Oct 31, 2023
[Port dspace-7_x] Clear primary bistream when it's deleted- Fix issue #9099
@tdonohue tdonohue added this to the 7.6.1 milestone Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

Successfully merging a pull request may close this issue.

4 participants