File inventory job error #1263
Comments
The Rails log shows that there is an [...] Here is a snippet of the log:
I am not sure why it is so slow. When I ran the export it finished in a few minutes, but Matt's job has been running for 20+ minutes by now.
As a side note, there is a surprisingly large number of SQL SELECT statements related to the flipflop feature in the log while this job is running. This might explain why the original error points to a PostgreSQL timeout:
Almost 200,000 SQL statements:
These requests might be hitting the cache rather than the DB; I am not sure about this. Yet the number is far too large even if they are hitting the cache.
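One way to put a number on this is to tally SELECT statements per table directly from the log. The following is a minimal sketch, not something taken from the project: the helper name `count_selects_by_table`, the sample log lines, and the `flipflop_features` table name are all assumptions made for illustration, and the real log format may differ.

```ruby
# Tally SELECT statements per table from Rails-style log lines.
# Assumption: each SQL statement appears on a single log line.
def count_selects_by_table(lines)
  counts = Hash.new(0)
  lines.each do |line|
    # Capture the first table name that follows FROM in a SELECT statement.
    match = line.match(/SELECT\b.*?\bFROM\s+"?(\w+)"?/i)
    counts[match[1]] += 1 if match
  end
  counts
end

# Hypothetical log lines, written for this example only.
sample_log = [
  'Flipflop::Feature Load (0.4ms)  SELECT "flipflop_features".* FROM "flipflop_features" WHERE "flipflop_features"."key" = $1 LIMIT $2',
  'Flipflop::Feature Load (0.3ms)  SELECT "flipflop_features".* FROM "flipflop_features" WHERE "flipflop_features"."key" = $1 LIMIT $2',
  'User Load (0.5ms)  SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT $2',
  'Flipflop::Feature Load (0.2ms)  SELECT "flipflop_features".* FROM "flipflop_features" WHERE "flipflop_features"."key" = $1 LIMIT $2',
  'FileInventoryRequest Update (1.1ms)  UPDATE "file_inventory_requests" SET "state" = $1 WHERE "id" = $2'
]

count_selects_by_table(sample_log).each do |table, count|
  puts "#{table}: #{count}"
end
```

Against a real log file, the same helper could be fed `File.foreach(path)` so the file is read line by line. Note that a "CACHE" prefix on a log line would still be counted here, which matches the open question above about whether these requests reach the DB.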
It looks like TWO requests are ongoing for the file list. The log output shows one at [...] Of the requests finished at 12:24 PM but none of the records in the [...] Waiting to see what happens when the second job finishes (it has exported 119,000 file names). As of right now there are two files generated on disk, I think the one a
Is this the error you are seeing, @hectorcorrea? https://github.com/pulibrary/pdc_describe/blob/main/app/jobs/dspace_file_copy_job.rb#L36-L39
It looks like the jobs are finishing but then failing when saving the update to the database (even though the file list was saved successfully to disk). And as @carolyncole noted, Sidekiq then considers that a failed job and retries it.
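The failure pattern described here (disk write succeeds, database save fails, the whole job is retried) can be sketched in plain Ruby. This is only an illustration under stated assumptions: `InventoryJobSketch`, `DatabaseSaveError`, and `run_with_retries` are hypothetical names, the retry loop is a simplified stand-in rather than Sidekiq's actual retry mechanism, and the real job's code is not shown in this thread.

```ruby
require "tmpdir"

# Hypothetical error standing in for the failed database update.
class DatabaseSaveError < StandardError; end

# Hypothetical job: step 1 writes the file list to disk,
# step 2 records the result in the database (simulated as always failing).
class InventoryJobSketch
  attr_reader :attempts

  def initialize(output_dir)
    @output_dir = output_dir
    @attempts = 0
  end

  def perform
    @attempts += 1
    path = File.join(@output_dir, "file_list_attempt_#{@attempts}.csv")
    File.write(path, "name,size\nexample.txt,1\n") # step 1 succeeds
    raise DatabaseSaveError, "could not save the update" # step 2 fails
  end
end

# Simplified stand-in for a queue that retries failed jobs.
def run_with_retries(job, max_attempts:)
  max_attempts.times do
    begin
      return job.perform
    rescue DatabaseSaveError
      next # the queue sees a failed job and runs it again from the start
    end
  end
  nil
end

Dir.mktmpdir do |dir|
  job = InventoryJobSketch.new(dir)
  run_with_retries(job, max_attempts: 2)
  puts "attempts: #{job.attempts}, files on disk: #{Dir.children(dir).sort.inspect}"
end
```

In this sketch every retry repeats the expensive export and leaves another file on disk, because the failure happens after the file is written. Whether that is what produced the files seen in QA is not confirmed by the logs quoted in this thread.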
I manually killed the two jobs in Sidekiq QA for this file inventory so that Sidekiq stops retrying them.
Honeybadger reported an error when running the file inventory job in the background:
The error information points to user_id 186 and project_id 44.
https://tigerdata-qa.princeton.edu/projects/44
Matt confirmed that this is indeed the project that he reported during the check-in meeting today and that the job has not finished (it looks like it crashed).
Honeybadger issue: https://app.honeybadger.io/projects/113559/faults/115546694
The original error was dated 1/30/2025 at 9:48 AM -05:00. I think this was when Matt ran the job. There is another error reported at 1/30/2025 at 10:53 AM -05:00.
I manually re-ran the job right before 11:20 AM and I don't see an error on Honeybadger, and it looks like it generated the list. When I click "Download Complete List" it says it was generated 3 minutes ago.
Matt kicked off the process again at 11:30 AM to see if the error was a hiccup or a permissions issue. Almost 20 minutes later, Matt's job has not finished but there is no Honeybadger error about it either. This is what the job looks like in the database: