Skip to content

Conversation

@dreamer2368
Copy link
Collaborator

@dreamer2368 dreamer2368 commented May 30, 2024

When ROM workflow is collecting restart files, orbital extrapolation is only created and not deleted after use, which causes a memory allocation problem. fixing this now.

@jeanlucf22
Copy link
Collaborator

The fix is to actually not create orbital extrapolation_ since it is not used at all! Remove the code "if (flag_extrapolated_data){ ...}"

@dreamer2368
Copy link
Collaborator Author

The fix is to actually not create orbital extrapolation_ since it is not used at all! Remove the code "if (flag_extrapolated_data){ ...}"

At least with carbyne example, I see flag_extrapolated_data turned on. Also, whether it is used in carbyne example or not, shouldn't we support the case where the extrapolation is used?

When flag_extrapolated_data is true, orbital_extrapol_ is created with OrbitalsExtrapolationFactory<OrbitalsType>::create in line 729 of src/md.cc. This is exactly following the loading procedure taken in the md workflow.

@jeanlucf22
Copy link
Collaborator

The fix is to actually not create orbital extrapolation_ since it is not used at all! Remove the code "if (flag_extrapolated_data){ ...}"

At least with carbyne example, I see flag_extrapolated_data turned on. Also, whether it is used in carbyne example or not, shouldn't we support the case where the extrapolation is used?

When flag_extrapolated_data is true, orbital_extrapol_ is created with OrbitalsExtrapolationFactory<OrbitalsType>::create in line 729 of src/md.cc. This is exactly following the loading procedure taken in the md workflow.

Indeed, but if you delete the object you created right after, why create it in the first place? Creating it is necessary for extrapolation, but then should be deleted later. On top of that, I don't see how we would use extrapolation with ROM...

@dreamer2368
Copy link
Collaborator Author

Indeed, but if you delete the object you created right after, why create it in the first place?

This is following the loading procedure in MGmol<OrbitalsType>::md. If the extrapolation is used, a different dataset is read from the restart file. In terms of ROM basis training, this means the training data is different from what FOM actually returns. The difference may be small, but probably not negligible (down to machine precision). Although, this difference impacts our ROM verification process, where we run a 100% reproductive case and check the solution apple-to-apple. Bottom line is that we want to use the exact snapshots FOM saves/uses.

Creating it is necessary for extrapolation, but then should be deleted later. On top of that, I don't see how we would use extrapolation with ROM...

This routine is added to release branch, so it can be used outside of ROM as well. If we're excluding the extrapolation with ROM in future, then we could just remove this part, and enforce to not use extrapolation throughout the ROM (and FOM snapshot generation).

@jeanlucf22
Copy link
Collaborator

As it is in this PR, the code between if(){...} does the following: creates an object, set it up, then delete it. It is not affecting which dataset is read.

@dreamer2368
Copy link
Collaborator Author

As far as I understand, OrbitalsExtrapolation<OrbitalsType>::setupPreviousOrbitals set up a new orbital by reading a dataset ExtrapolatedFunction, not the usual Function, is this correct?

@jeanlucf22
Copy link
Collaborator

Well, it is correct. But we need to discuss what you want with it.
"Extrapolated orbitals" are actually initial guesses for Phi for the next MD step, so they are not converged solutions for any given atomic configuration. They are needed only to ensure a smooth restart in MD, mimicking what would happen without a restart. They have no physical meaning, and are not solutions of any PDE. It may be that down the road we will use these in some way. But as it is, I feel like they are only making the code more complicated for no good reason.

In addition, if you really wanted to use them for extrapolation, you would need to keep that object orbital_extrapol_ and use it later, not destroy it.

@dreamer2368
Copy link
Collaborator Author

Removed the part that uses the extrapolation and enforced to not use the extrapolation.

@jeanlucf22 jeanlucf22 merged commit 61276db into release Jun 3, 2024
@jeanlucf22 jeanlucf22 deleted the pod-fix branch June 3, 2024 23:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants