PDOSWorkChain excepts due to daemon restart #724

Closed
t-reents opened this issue Sep 10, 2021 · 5 comments

Comments

@t-reents
Contributor

t-reents commented Sep 10, 2021

Hi,

I encountered a problem in the PDOSWorkChain that I think is related to this aiida-core issue: aiidateam/aiida-core#5124. The error occurs when I restart the daemon, or increase or decrease the number of workers, while the workchain is uploading the DOS and PDOS calculations.

The error message is as follows:

2021-09-10 10:42:09 [16716 | REPORT]: [40300|PdosWorkChain|run_pdos_parallel]: launching ProjwfcCalculation<40328>
2021-09-10 10:43:59 [16717 |  ERROR]: Traceback (most recent call last):
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/engine/persistence.py", line 124, in load_checkpoint
    bundle = serialize.deserialize(checkpoint)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 230, in deserialize
    return yaml.load(serialized, Loader=AiiDALoader)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 43, in get_single_data
    return self.construct_document(node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 47, in construct_document
    data = self.construct_object(node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 156, in bundle_constructor
    yaml_node = loader.construct_mapping(bundle)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 131, in mapping_constructor
    yaml_node = loader.construct_mapping(mapping, deep=True)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 94, in construct_object
    data = constructor(self, tag_suffix, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 624, in construct_python_object_apply
    instance = self.make_python_instance(suffix, node, args, kwds, newobj)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 568, in make_python_instance
    raise ConstructorError("while constructing a Python instance", node.start_mark,
yaml.constructor.ConstructorError: while constructing a Python instance
expected a class, but found <class 'builtin_function_or_method'>
  in "<unicode string>", line 26, column 14:
      nscf_emax: !!python/object/apply:numpy.core ...
                 ^

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/manage/external/rmq.py", line 208, in _continue
    result = await super()._continue(communicator, pid, nowait, tag)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/plumpy/process_comms.py", line 606, in _continue
    saved_state = self._persister.load_checkpoint(pid, tag)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/engine/persistence.py", line 126, in load_checkpoint
    raise PersistenceError(f'Failed to load the checkpoint for process<{pid}>: {traceback.format_exc()}')
plumpy.exceptions.PersistenceError: Failed to load the checkpoint for process<40300>: Traceback (most recent call last):
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/engine/persistence.py", line 124, in load_checkpoint
    bundle = serialize.deserialize(checkpoint)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 230, in deserialize
    return yaml.load(serialized, Loader=AiiDALoader)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 43, in get_single_data
    return self.construct_document(node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 47, in construct_document
    data = self.construct_object(node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 156, in bundle_constructor
    yaml_node = loader.construct_mapping(bundle)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 92, in construct_object
    data = constructor(self, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/aiida/orm/utils/serialize.py", line 131, in mapping_constructor
    yaml_node = loader.construct_mapping(mapping, deep=True)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 210, in construct_mapping
    return super().construct_mapping(node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 135, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 94, in construct_object
    data = constructor(self, tag_suffix, node)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 624, in construct_python_object_apply
    instance = self.make_python_instance(suffix, node, args, kwds, newobj)
  File "/home/treents/.venvs/aiida/lib/python3.8/site-packages/yaml/constructor.py", line 568, in make_python_instance
    raise ConstructorError("while constructing a Python instance", node.start_mark,
yaml.constructor.ConstructorError: while constructing a Python instance
expected a class, but found <class 'builtin_function_or_method'>
  in "<unicode string>", line 26, column 14:
      nscf_emax: !!python/object/apply:numpy.core ...
@sphuber
Contributor

sphuber commented Sep 10, 2021

This is indeed a problem with numpy objects being stored in the context, which cannot be deserialized by default. Deserialization happens whenever the checkpoint is reloaded, that is, when the daemon is restarted or new workers are added. However, we added compatibility for this in aiida-core==1.6.5 (see this PR). Are you on an older version perhaps? If so, try upgrading and running it again. I think that should solve the issue.
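For anyone who cannot upgrade right away, one possible workaround is to convert numpy values to built-in Python types before they are stored in the workchain context, so that the checkpoint contains no numpy-specific YAML tags. The sketch below is only an illustration under that assumption: the helper name `to_native` is hypothetical and is not part of aiida-core or aiida-quantumespresso. It relies on the fact that numpy scalars and arrays both provide a `tolist()` method, so it does not need to import numpy itself.

```python
from collections.abc import Mapping


def to_native(value):
    """Recursively replace numpy-like values with built-in Python types.

    Hypothetical helper for illustration; not part of any AiiDA package.
    """
    if isinstance(value, Mapping):
        return {key: to_native(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples are returned as lists to keep the result plain.
        return [to_native(item) for item in value]
    # numpy scalars and arrays both expose ``tolist()``, which returns
    # built-in Python scalars or (nested) lists of them.
    if hasattr(value, 'tolist'):
        return value.tolist()
    return value


# Assumed usage inside a workchain step (``emax`` being a numpy scalar):
#     self.ctx.nscf_emax = to_native(emax)
```

Upgrading remains the simpler fix; a conversion like this only matters for values that end up in the context on older versions.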

@t-reents
Contributor Author

Thanks for the quick reply @sphuber. I am using version 1.6.4, so I will try to upgrade it and check it again.
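A quick way to confirm which aiida-core version an environment has is sketched below. It assumes the fix is present in 1.6.5 and every later release, as the comment above suggests; the function names `release_tuple` and `has_numpy_checkpoint_fix` are made up for this example, and pre-release suffixes such as `rc1` are ignored rather than handled properly.

```python
from importlib import metadata


def release_tuple(version):
    """Turn a version string such as '1.6.4' into (1, 6, 4).

    Only the leading digits of the first three components are used.
    """
    parts = []
    for piece in version.split('.')[:3]:
        digits = ''
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        parts.append(int(digits or 0))
    return tuple(parts)


def has_numpy_checkpoint_fix(version):
    """Return True if the version is 1.6.5 or later (assumed threshold)."""
    return release_tuple(version) >= (1, 6, 5)


try:
    installed = metadata.version('aiida-core')
except metadata.PackageNotFoundError:
    print('aiida-core is not installed in this environment')
else:
    print(installed, 'includes the fix:', has_numpy_checkpoint_fix(installed))
```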

@sphuber
Contributor

sphuber commented Sep 13, 2021

> Thanks for the quick reply @sphuber. I am using version 1.6.4, so I will try to upgrade it and check it again.

Great. Did you get a chance to try it and did it work?

@t-reents
Contributor Author

Sorry for the late reply @sphuber. I tested it over the last few days and it works!

@sphuber
Contributor

sphuber commented Sep 17, 2021

Great, thanks a lot. Then I will close this issue.

@sphuber closed this as completed Sep 17, 2021