DOIs for files (PIDs for files) #4350
Conversation
… and File UNF on file pg. [ref #2438]
@@ -1605,6 +1606,16 @@ private String init(boolean initFull) {
    JH.addMessage(FacesMessage.SEVERITY_WARN, BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.summary"),
            BundleUtil.getStringFromBundle("file.rsyncUpload.inProgressMessage.details"));
}
// This is a hack to remove dataset locks for File PID registration if
(@sekmiller The comment below is the one I was referring to on Friday, I left it as a review on Thu. night; but it looks like it was sitting as "pending" until today. But I did talk about the stuff below, with you and Gustavo, on Fri.)
I'm a little worried about the fact that this is necessary. I.e., why is the lock not being removed properly in the FinalizeDatasetPublicationCommand. (I'm assuming the finalize command did succeed, and all the global ids for the files have been registered - ?). Are there any exceptions in the server log coming from the command, that could explain what happens there? Interestingly, it looks like inside the command the lock is being removed not once, but twice: by a command, and then by calling the service bean:
ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.pidRegister));
ctxt.datasets().removeDatasetLocks(theDataset.getId(), DatasetLock.Reason.pidRegister);
Could this be causing a conflict of some kind?
I don't think this is super important; but I'm a little worried that it may be a sign of something else in the command not functioning correctly. So maybe it's worth spending a little time investigating.
I'll try to build the branch and run a couple of tests with large numbers of files; and see if I get this condition and/or see anything in the logs.
The issue is that the FinalizeDatasetPublicationCommand does succeed in registering the files, but doesn't successfully remove the lock in the case of a database error that seems to occur when many files are being registered or updated (more than 500). I described it in a Slack message: in the file DOI branch we've seen a database connection error when the number of files on a dataset is well over 500. The logs show errors like:
Severe: RAR5031:System Exception
java.lang.NullPointerException
at com.sun.enterprise.resource.ConnectorXAResource.getResourceHandle(ConnectorXAResource.java:246)
at com.sun.enterprise.resource.ConnectorXAResource.end(ConnectorXAResource.java:159)
at com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.delistResource(JavaEETransactionManagerSimplified.java:527)
If you wait long enough, the dataset will eventually publish and all of the files will be registered; however, the page never appears locked to the user, yet at the end of the process the dataset gets its lock and it doesn't go away.
I put a hack into the DatasetPage init such that if a dataset is locked for PID registration but there's no draft version, we try to remove the lock.
Is there a better way to approach this?
I couldn't find any way to handle the System Exception, so I put in the hacky lock removal to hopefully prevent support issues.
All of this is pretty edge-case stuff, but if you had any thoughts I'd appreciate it.
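For context, here is a rough sketch of what that check in the DatasetPage init could look like. It's only an illustration: apart from removeDatasetLocks() and DatasetLock.Reason.pidRegister, which appear elsewhere in this thread, the names and the exact condition are assumptions, not the code in this PR.
// Hypothetical sketch of the lock-cleanup hack described above; not the exact PR code.
private void cleanUpStalePidRegisterLock(Dataset dataset) {
    boolean lockedForPidRegister = dataset.getLocks().stream()
            .anyMatch(lock -> DatasetLock.Reason.pidRegister.equals(lock.getReason()));
    if (lockedForPidRegister && dataset.getLatestVersion().isReleased()) {
        // No draft version left to finalize, so the pidRegister lock is stale; clear it.
        datasetService.removeDatasetLocks(dataset.getId(), DatasetLock.Reason.pidRegister);
    }
}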
        return new PublishDatasetResult(ctxt.em().merge(theDataset), true);
    }
}

@Asynchronous
Why does this method have the @Asynchronous attribute?
It's not void, and it is already calling the callFinalizePublishCommandAsynchronously() method in the dataset service, which is asynchronous.
I don't assume this attribute is what's causing the weird state condition with 500+ files; but at least it should be very safe to remove, since it doesn't serve any purpose.
I will remove it.
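For reference, a minimal sketch of the EJB pattern being discussed; the only names taken from this thread are @Asynchronous and callFinalizePublishCommandAsynchronously(), everything else (parameters, surrounding class) is an illustrative assumption. Asynchronous EJB business methods are supposed to return void or Future, so annotating a method that returns a result directly adds nothing; the background work belongs on the service bean, not on the command's execute method.
import javax.ejb.Asynchronous;
import javax.ejb.Stateless;

@Stateless
public class DatasetServiceBeanSketch {
    // Fire-and-forget: the container runs this on a separate thread, so the
    // long-running finalize/PID-registration step doesn't block the caller.
    @Asynchronous
    public void callFinalizePublishCommandAsynchronously(Long datasetId) {
        // ... look up the dataset and submit FinalizeDatasetPublicationCommand ...
    }
}
// In the publish command itself no @Asynchronous is needed: the method returns its
// result synchronously and simply delegates the slow part to the bean above, e.g.
//     ctxt.datasets().callFinalizePublishCommandAsynchronously(theDataset.getId());
//     return new PublishDatasetResult(theDataset, false);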
@@ -104,6 +104,8 @@ public Dataset execute(CommandContext ctxt) throws CommandException {

    // Remove locks
    ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.Workflow));
    ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.pidRegister));
Is there a specific need to attempt to remove the pidRegister lock TWICE: once by the RemoveLockCommand, and then again by the direct call to ctxt.datasets().removeDatasetLocks() in the line below?
Could this be causing some kind of a conflict?
I would keep the latter and remove the former, maybe...
Probably not. It was another attempt to remove the lock in the face of the database error. I will remove it.
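For illustration, a sketch of the consolidated cleanup once the duplicate submission is dropped. The two calls themselves are the ones quoted earlier in this thread; how they sit in the command is an assumption.
// Remove locks: keep a single removal path per lock reason, so the same
// DatasetLock row isn't deleted twice within one transaction.
ctxt.engine().submit(new RemoveLockCommand(getRequest(), theDataset, DatasetLock.Reason.Workflow));
// For pidRegister, rely on the direct service-bean call only:
ctxt.datasets().removeDatasetLocks(theDataset.getId(), DatasetLock.Reason.pidRegister);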
…n the dataset once it's locked, before the asynchronous part of the publishing process is initiated. (#2438)
…atasetPublicationCommand update the dataset object, to avoid the condition where the first command is still holding onto the same object as it initiates the execution of the second command asynchronously, which can result in concurrency and optimistic lock issues. Also, removed the redundant/unnecessary merge() calls. (#2438)
# Conflicts:
#	doc/sphinx-guides/source/api/native-api.rst
#	doc/sphinx-guides/source/installation/config.rst
#	src/main/java/Bundle.properties
#	src/main/java/edu/harvard/iq/dataverse/DvObject.java
#	src/main/java/edu/harvard/iq/dataverse/HandlenetServiceBean.java
#	src/main/java/edu/harvard/iq/dataverse/api/AbstractApiBean.java
as of 0de93c0:
Tests run: 69, Failures: 1, Errors: 0, Skipped: 0
Test set: edu.harvard.iq.dataverse.api.SwordIT
Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 37.767 sec <<< FAILURE!