
dictionary KeyError using ASE OTF training #242

Closed
aaronchen0316 opened this issue Sep 15, 2020 · 2 comments

@aaronchen0316 (Contributor) commented Sep 15, 2020

Hi, I encountered an error while using ASE_OTF() to train my system.

flare_calculator = FLARE_Calculator(gp_model,
                                    par=True,
                                    mgp_model=None,
                                    use_mapping=False)
cube.set_calculator(flare_calculator)
otf = ASE_OTF(cube,
              timestep=1 * units.fs,
              number_of_steps=5000,
              dft_calc=dft_calc,
              md_engine=md_engine,
              md_kwargs=md_kwargs,
              **otf_params)

I copied the error message below:

Process Process-415:
Traceback (most recent call last):
  File "/path_to_python/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/path_to_python/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/path_to_code/code/my_flare/flare/gp_algebra.py", line 21, in queue_wrapper
    result_queue.put((wid, func(*args)))
  File "/path_to_code/code/my_flare/flare/gp_algebra.py", line 1179, in get_ky_and_hyp_pack
    training_data = _global_training_data[name]
KeyError: 'default_gp' (or whatever name I specified)

I don't know what causes this error: some training runs finish without any problems, while others hit this KeyError. Also, the error doesn't occur until a few steps into the training.
My best guess is that, because I am running my script in parallel on a shared compute node (with 16-24 CPUs), the subprocesses have trouble communicating?
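
In case it helps, here is a minimal, hypothetical sketch (not FLARE's actual code) of the pattern the traceback points at: worker processes look up the training set in a module-level dict keyed by the GP's name, and a worker that never saw that registration raises exactly this KeyError.

import multiprocessing as mp

# module-level registry; the dict name comes from the traceback, everything
# else here is a made-up illustration of the pattern
_global_training_data = {}

def register(name, data):
    # populate the registry in the parent process under the GP's name
    _global_training_data[name] = data

def worker(name, result_queue):
    # raises KeyError('default_gp') if this process never inherited the
    # registration, e.g. when processes are started with the 'spawn' method
    # or the registry is cleared/re-keyed between steps
    result_queue.put(len(_global_training_data[name]))

if __name__ == "__main__":
    register("default_gp", [1, 2, 3])
    queue = mp.Queue()
    p = mp.Process(target=worker, args=("default_gp", queue))
    p.start()
    # prints 3 under 'fork'; under 'spawn' the worker dies with the KeyError
    # before putting a result, and this get() times out
    print(queue.get(timeout=5))
    p.join()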

Thank you in advance!

@YuuuXie (Collaborator) commented Sep 15, 2020

Could you please tell us which version of flare you are using? I think this should have been addressed in #223.
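
A quick way to check (assuming your install exposes the usual module attributes; older releases may not define __version__, hence the getattr fallback):

import flare
# flare.__file__ shows which checkout is actually being imported
print(getattr(flare, "__version__", "unknown"), flare.__file__)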

@aaronchen0316 (Contributor, Author)

> Could you please tell us which version of flare you are using? I think this should have been addressed in #223.

My bad, my version is #218. I will check with the later version. Thank you!
