Open
Description
Expected behaviour
- No crash in MCPCient
- If prometheus disabled in MCPclient, we don't need to refresh tmp prometheus database (because it doesn't exist)
Current behaviour
Sometimes MCPclient tries to refresh the prometheus database (I suppose when worker pool member is restarted because reaching the limit (new AM1.16 feature: pool of workers))
It is happening at the transfer start. And it is not happening in all transfers.
MCPServer log error:
DEBUG 2024-10-16 13:00:07 archivematica.mcp.server.packages:packages:_copy_from_transfer_sources:243: source: 6ff352d0-f495-4330-920a-dcbe20a817b5.tar.gz, destination: /var/archivematica/sharedDirectory/tmp/tmp1gkokc0b/6ff352d0-f495-4330-920a-dcbe20a817b5.tar.gz
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.packages:packages:_start_package_transfer_with_auto_approval:424: Package cbf9eee3-b327-4bb9-a2a3-915438751536: moving package to processing directory
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.packages:packages:_start_package_transfer_with_auto_approval:429: Package cbf9eee3-b327-4bb9-a2a3-915438751536: starting workflow processing
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.jobs.chain:chain:__init__:89: Creating JobChain 167dc382-4ab1-4051-8e22-e7f1c1bf3e6f for package cbf9eee3-b327-4bb9-a2a3-915438751536 (initial link 3229e01f-adf3-4294-85f7-4acb01b3fbcf)
INFO 2024-10-16 13:00:11 archivematica.mcp.server.jobs.client:client:run:104: Running Extract zipped bag transfer (package cbf9eee3-b327-4bb9-a2a3-915438751536)
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:submit:217: Submitted gearman job d92fafcd-0b61-4ac1-93af-3e714db63a75 (extractzippedtransfer_v0.0)
ERROR 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:update_task_results:238: Gearman task batch d92fafcd-0b61-4ac1-93af-3e714db63a75 failed to execute
INFO 2024-10-16 13:00:11 archivematica.mcp.server.jobs.client:client:run:104: Running Email fail report (package cbf9eee3-b327-4bb9-a2a3-915438751536)
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:submit:217: Submitted gearman job f19c544a-6b26-421c-bb3b-8552ec6c4cd6 (emailfailreport_v0.0)
ERROR 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:update_task_results:238: Gearman task batch f19c544a-6b26-421c-bb3b-8552ec6c4cd6 failed to execute
INFO 2024-10-16 13:00:11 archivematica.mcp.server.jobs.client:client:run:104: Running Move to the failed directory (package cbf9eee3-b327-4bb9-a2a3-915438751536)
DEBUG 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:submit:217: Submitted gearman job 664f634a-9034-4a54-b85e-91438aae55d3 (movetransfer_v0.0)
ERROR 2024-10-16 13:00:11 archivematica.mcp.server.jobs.tasks:gearman_backend:update_task_results:238: Gearman task batch 664f634a-9034-4a54-b85e-91438aae55d3 failed to execute
MCPCLient log in syslog:
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: DEBUG 2024-10-16 13:00:11,421 archivematica.mcp.client.gearman:handle_job:95: Gearman job request
d92fafcd-0b61-4ac1-93af-3e714db63a75 received for extractzippedtransfer_v0.0
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: ERROR 2024-10-16 13:00:11,424 archivematica.mcp.client.gearman:on_job_exception:111: An unhandled
exception occurred processing a Gearman job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: Traceback (most recent call last):
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/gearman/worker.py", line 218, in on_job_ex
ecute
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: job_result = function_callback(self, current_job)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/opt/archivematica/archivematica/src/MCPClient/lib/client/gearman.py", line 101, in handle_job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: with metrics.task_execution_time_histogram.labels(script_name=task_name).time():
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 203, i
n labels
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metrics[labelvalues] = self.__class__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 600, i
n __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: super().__init__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 151, i
n __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metric_init()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 628, i
n _metric_init
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._sum = values.ValueClass(self._type, self._name, self._name + '_sum', self._labelnames, self._labelvalues, self._documen
tation)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 68, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self.__reset()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 82, in __reset
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: files[file_prefix] = MmapedDict(filename)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/mmap_dict.py", line 64, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._f = open(filename, 'rb' if read_mode else 'a+b')
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/prometheus-stats7rrndq4_/histogram_1782.db'
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: DEBUG 2024-10-16 13:00:11,461 archivematica.mcp.client.gearman:handle_job:95: Gearman job request
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: ERROR 2024-10-16 13:00:11,462 archivematica.mcp.client.gearman:on_job_exception:111: An unhandled exception occurred processing a Gearman job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: Traceback (most recent call last):
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/gearman/worker.py", line 218, in on_job_execute
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: job_result = function_callback(self, current_job)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/opt/archivematica/archivematica/src/MCPClient/lib/client/gearman.py", line 101, in handle_job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: with metrics.task_execution_time_histogram.labels(script_name=task_name).time():
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 203, in labels
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metrics[labelvalues] = self.__class__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 600, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: super().__init__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 151, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metric_init()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 628, in _metric_init
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._sum = values.ValueClass(self._type, self._name, self._name + '_sum', self._labelnames, self._labelvalues, self._documentation)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 68, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self.__reset()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 82, in __reset
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: files[file_prefix] = MmapedDict(filename)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/mmap_dict.py", line 64, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._f = open(filename, 'rb' if read_mode else 'a+b')
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/prometheus-stats7rrndq4_/histogram_1782.db'
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: DEBUG 2024-10-16 13:00:11,489 archivematica.mcp.client.gearman:handle_job:95: Gearman job request 664f634a-9034-4a54-b85e-91438aae55d3 received for movetransfer_v0.0
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: archivematicaClient.py: ERROR 2024-10-16 13:00:11,490 archivematica.mcp.client.gearman:on_job_exception:111: An unhandled exception occurred processing a Gearman job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: Traceback (most recent call last):
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/gearman/worker.py", line 218, in on_job_execute
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: job_result = function_callback(self, current_job)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/opt/archivematica/archivematica/src/MCPClient/lib/client/gearman.py", line 101, in handle_job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: with metrics.task_execution_time_histogram.labels(script_name=task_name).time():
6 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/gearman/worker.py", line 218, in on_job_execute
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: job_result = function_callback(self, current_job)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/opt/archivematica/archivematica/src/MCPClient/lib/client/gearman.py", line 101, in handle_job
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: with metrics.task_execution_time_histogram.labels(script_name=task_name).time():
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 203, in labels
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metrics[labelvalues] = self.__class__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 600, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: super().__init__(
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 151, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._metric_init()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/metrics.py", line 628, in _metric_init
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._sum = values.ValueClass(self._type, self._name, self._name + '_sum', self._labelnames, self._labelvalues, self._documentation)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 68, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self.__reset()
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/values.py", line 82, in __reset
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: files[file_prefix] = MmapedDict(filename)
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: File "/usr/share/archivematica/virtualenvs/archivematica/lib/python3.9/site-packages/prometheus_client/mmap_dict.py", line 64, in __init__
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: self._f = open(filename, 'rb' if read_mode else 'a+b')
Oct 16 13:00:11 RACAM-RH-DEVMPL python[1782]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/prometheus-stats7rrndq4_/histogram_1782.db'
Steps to reproduce
It isn't easy to reproduce, it only happened 1 time on each 2 system pipelines. So I suspect it only happens when restarting a worker.
Your environment (version of Archivematica, operating system, other relevant details)
Rocky 9, AM1.16
SS separated VM
2 pipelines
The issue happened in both pipelines (Identical config)
For Artefactual use:
Before you close this issue, you must check off the following:
- All pull requests related to this issue are properly linked
- All pull requests related to this issue have been merged
- A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
- Documentation regarding this issue has been written and merged (if applicable)
- Details about this issue have been added to the release notes (if applicable)