We are seeing a large number of datasets stuck in db-solr-sync. As of today the count is 10757.
The nightly db-solr-sync job resolves discrepancies in packages' harvest_object property between the DB and Solr. Because of network glitches and other factors, it is normal to see a single-digit number of packages that need to be synced each day after harvesting. After the db-solr-sync job is done, the count should be 0.
A large number of stuck datasets (those not resolved by the db-solr-sync job) means there are other issues with those datasets that need to be identified and resolved by another process, after which the count of stuck datasets should be 0.
Datasets with no harvest object in the DB:
Those with a non-datajson harvest type can be viewed via this API call. All of them can be viewed in the db-solr-sync log.
db-solr-sync will not fix them. Solution: skip them in db-solr-sync, then purge them via an API call or fix them another way.
After the above fix, the number should drop from 10-20K to a minimal level. If not, we can identify further causes and eventually get the number down to double digits.
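To make the comparison and the proposed skip concrete, here is a minimal sketch of how packages could be sorted into the three buckets the job reports. It is not the actual db-solr-sync code: `plan_sync`, `SyncPlan`, and the two input mappings (package id to harvest object id, with `None` meaning no harvest object) are hypothetical names chosen for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class SyncPlan(NamedTuple):
    """Hypothetical result type: one list of package ids per bucket."""
    remove_from_solr: List[str]
    update_in_solr: List[str]
    missing_harvest_object: List[str]


def plan_sync(
    db: Dict[str, Optional[str]],
    solr: Dict[str, Optional[str]],
) -> SyncPlan:
    # Indexed in Solr but no longer present in the DB.
    remove = sorted(pid for pid in solr if pid not in db)
    # In the DB without a harvest object: the sync cannot fix these,
    # so they are skipped and reported for a manual purge or other fix.
    missing = sorted(pid for pid, ho in db.items() if ho is None)
    # In the DB with a harvest object that Solr lacks or disagrees on.
    update = sorted(
        pid for pid, ho in db.items()
        if ho is not None and solr.get(pid) != ho
    )
    return SyncPlan(remove, update, missing)


# Made-up sample data, only to show the three buckets.
db_state = {"a": "h1", "b": "h2", "c": None, "e": "h5"}
solr_state = {"a": "h1", "b": "h-old", "c": None, "d": "h4"}
plan = plan_sync(db_state, solr_state)
```

Keeping packages without a harvest object in their own bucket means they no longer count as stuck on every nightly run, while still being listed so they can be purged separately.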
After multiple db-solr-sync runs and cleanups, the recurring datasets are gone.
total 373590 solr indexed_package
0 packages need to be removed from Solr
0 packages need to be updated/added to Solr
0 packages without harvest_object need to be mannually deleted
We will monitor the daily output as an O&M task; there are now minimal stuck datasets.
Related to GSA/catalog.data.gov#848