Dataset: Loading dataset page fails with exception if original file size is null, e.g. some legacy Murray #7205
Comments
The reason this is happening on the test box where the above was observed is that it does not have the credentials to access the production S3 bucket where the files are stored, so it can't look up the size of the stored original (which hasn't been cached in the db yet for some files in this dataset). The failure is a bit too fatal/catastrophic, but this would not be an issue on a properly configured system either.
@landreev so the legacy aspect is definitely not an issue then? I can add the credentials to confirm.
Should be easier (definitely safer!) to simulate the "legacy" case with a QA S3 bucket (by deleting the saved original). I would be a little wary about configuring any test boxes with the prod. credentials.
Duplicates #6343
Just making a reminder note to re-test the file page as well. #6343, now closed, seems to suggest that it was the file.xhtml page that was throwing a 500 with no S3 access; or at least it says "file pages". Worth double-checking.
My guess on that: the file page tries to show the 'friendlysize'. If the size is null and gets sent to FileSizeChecker.bytesToHumanReadable(), that may trigger a 500 even though the file.xhtml page is trying to do an empty check (jsf:rendered="#{!(empty FilePage.file.friendlySize)}"). If it checked the raw size for the rendering test instead, it could work (and a similar test could fix the dataset page, where friendly sizes are displayed in the filesFragment).
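To make the suggested fix concrete, here is a minimal, self-contained sketch of a null-safe "friendly size" helper. It is illustrative only, not the actual Dataverse code: the class and method names are made up, and the formatting logic is a simplified stand-in for FileSizeChecker.bytesToHumanReadable(). The point is that the null/unknown check happens on the raw byte count, so a missing size renders as nothing instead of throwing a 500.

```java
public class FriendlySizeExample {

    /** Returns e.g. "1.2 MB", or null when the size is unknown (null or negative). */
    static String friendlySize(Long bytes) {
        if (bytes == null || bytes < 0) {
            // A null result lets a JSF test like rendered="#{!(empty ...)}" simply skip the element.
            return null;
        }
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return String.format("%.1f %s", value, units[unit]);
    }

    public static void main(String[] args) {
        System.out.println(friendlySize(1234567L)); // 1.2 MB
        System.out.println(friendlySize(null));     // null -> nothing to render
    }
}
```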
I am admittedly tired and stupid right now, but I don't think that value, from which the "friendlysize" is calculated, should ever be coming from the S3 metadata... It should always be the value of the "filesize" column from the DataFile db table, no? (There may well be something else on that page that can be bombing with no storage access; not disputing that possibility... some direct compute URL silliness? Anyway, we'll retest and figure it out.)
Yes, there are two parts. The pages all have problems if the datafile table has a null size. One way to get that is (or was) with an S3 direct upload when S3 is slow/flaky, such that the upload succeeds but the subsequent call to S3 to get the metadata for the new object fails. us-east-1 is only eventually consistent, so under load (or a DDOS) you may not get the metadata right away, even though the upload itself succeeded (and you have a successful response from S3 to prove that). I think since we saw this problem (and with Kevin's testing of very small files) we've added some delays and retries to make this less likely. If there are any other current/historic ways to get a null filesize, they would have the same symptoms. (As with S3, temp files and file stores populate the datafile table once at DataFile creation, so if there is a file system problem at that point I guess you could get a null. In any case, if it isn't/wasn't set at creation, a null filesize is then persistent until corrected.)
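A rough sketch of the delay-and-retry idea mentioned above, using the AWS SDK for Java v1 (com.amazonaws). The attempt count, backoff, bucket, and key are illustrative assumptions, not the values Dataverse actually uses; the only real point is that the caller gets a null back (rather than an exception) when the metadata never shows up, and has to tolerate that.

```java
import com.amazonaws.AmazonClientException;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;

public class S3SizeLookupSketch {

    /** Returns the object size in bytes, or null if the metadata never became available. */
    static Long lookupSizeWithRetries(AmazonS3 s3, String bucket, String key) {
        int maxAttempts = 5;     // illustrative
        long delayMillis = 500;  // illustrative
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                ObjectMetadata md = s3.getObjectMetadata(bucket, key);
                return md.getContentLength();
            } catch (AmazonClientException e) {
                // Object not visible yet (eventual consistency) or a transient S3 error.
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return null;
                }
                delayMillis *= 2; // simple exponential backoff
            }
        }
        return null; // caller must treat the size as unknown rather than failing the whole page
    }

    public static void main(String[] args) {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        Long size = lookupSizeWithRetries(s3, "my-qa-bucket", "path/to/stored-original");
        System.out.println(size == null ? "size unknown" : size + " bytes");
    }
}
```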
So we are talking about a situation where a (temporary) loss of access to S3 during the creation/upload of the file left the files in a messed-up state (with null sizes in the datafile table), correct? It's a bit of a different situation from what this issue was opened for.
(I updated the main description of the issue with the info we learned when helping UVA with this same issue recently)
[Leonid]
Updating the issue, since we had some incorrect assumptions back when it was opened (L.A.):
- Missing original sizes can be fixed retroactively with the admin API /api/admin/datafiles/integrity/fixmissingoriginalsizes. We recommended (but did not require!) doing this in the 4.10 release, but some installations may have missed it.
- We could also change the fixmissingoriginalsizes API not to leave any nulls in the db, but to replace them with -1 in cases where the originals are missing. That's what I did in our prod manually. (And then maybe we could remind people with pre-4.10 legacy installations to run that API.)

All of the above should be very straightforward.
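For illustration, an installation that missed the 4.10 step could invoke the API above along these lines. This is a sketch using the JDK 11 java.net.http client; the localhost:8080 base URL and the use of GET are assumptions here, so check the Dataverse Admin API guide for the exact invocation for your version.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FixMissingOriginalSizesCall {
    public static void main(String[] args) throws Exception {
        // Assumption: the admin API is reachable on localhost:8080 and the endpoint accepts GET;
        // verify both against the Admin API guide before relying on this.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/api/admin/datafiles/integrity/fixmissingoriginalsizes"))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```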
[Kevin] Original comment
It was discovered that loading the dataset page fails with an exception when the original file size is null and cannot be calculated. This happened with a Murray dataset on a test instance. It can happen if (1) a legacy file does not have a saved original, or (2) the file system is temporarily inaccessible and so the get-size function fails.
It would seem that this should not be a fatal error in loading the dataset. After consulting with Dev, it seems that this behavior was added when the "download all files in a dataset" feature was added, because it wants to alert you to the size of the download.
From @qqmyers in #6343:
When Amazon had a recent S3 outage (a DDOS attack), I noted that Dataverse can no longer display file pages when S3 is down. Previously, the only issues with S3 being out related to file upload/download operations, and the Dataverse site itself worked. This issue is now fairly trivial: it looks like the file size to be displayed on the page is being pulled from S3 metadata, and hence, if S3 is down, the page breaks/shows a 500 error. I'm not sure it's worth fixing, but avoiding 500 errors could be as simple as doing a null check on the file size. (In any case, this issue might help someone debug: 500 errors related to file sizes could be due to S3 being misconfigured or down.)