Allow for sparse OfflineLB maps that may be used with an LB configuration #2074

Closed
lifflander opened this issue Jan 23, 2023 · 2 comments · Fixed by #2145

@lifflander
Collaborator

What Needs to be Done?

Currently, the LB data input files must be dense and fully specified for all phases.

  • Extend the LBDataRestartReader to allow for sparse maps. (We need a proper specification for a phase that is identical to the previous one and thus left out.)
  • Add consideration for deleted/inserted elements
  • Right now, if vt_lb_name is not set to OfflineLB, the LBDataRestartReader is not created. We should also check the LB configuration file to see whether any line specifies OfflineLB. If it is specified anywhere, we should construct the LBDataRestartReader. As another possible optimization, the reader only needs to read the phases that are specified in the file and store those distributions.
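
To make the last bullet concrete, here is a minimal Python sketch of the check, not the actual vt implementation. It assumes each LB configuration line has the shape `[%]<phase> <LBName> [key=value ...]`, as in the `100 OfflineLB phase=1` and `%100 OfflineLB` examples in this thread. The names `LBSpecLine`, `parse_spec_line`, `uses_offline_lb`, and `OtherLB` are hypothetical and exist only for illustration.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class LBSpecLine(NamedTuple):
    """One parsed line of an LB configuration file (hypothetical structure)."""
    phase: int
    is_modulo: bool      # True when the phase token is written as "%N"
    lb_name: str
    args: Dict[str, str]


def parse_spec_line(line: str) -> Optional[LBSpecLine]:
    """Parse '[%]<phase> <LBName> [key=value ...]'; return None for short or blank lines."""
    tokens = line.split()
    if len(tokens) < 2:
        return None
    phase_token, lb_name, *rest = tokens
    is_modulo = phase_token.startswith("%")
    phase = int(phase_token.lstrip("%"))
    # Keep only well-formed key=value arguments; anything else is ignored in this sketch.
    args = dict(tok.split("=", 1) for tok in rest if "=" in tok)
    return LBSpecLine(phase, is_modulo, lb_name, args)


def uses_offline_lb(lines: Iterable[str]) -> bool:
    """Return True if any configuration line selects OfflineLB."""
    specs = (parse_spec_line(line) for line in lines)
    return any(spec is not None and spec.lb_name == "OfflineLB" for spec in specs)


# Hypothetical configuration; "OtherLB" stands in for any other load balancer.
config = ["0 OtherLB", "%10 OtherLB", "100 OfflineLB phase=1"]
needs_reader = uses_offline_lb(config)
```

Under these assumptions, the reader would be constructed whenever `needs_reader` is true, regardless of what `vt_lb_name` is set to.
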
@nlslatt
Collaborator

nlslatt commented Apr 6, 2023

@lifflander I think the existing state of the code requires data for both the phase before (matching the pre-LB object distribution) and the phase after load balancing happens (for where the objects should be migrated). Are you suggesting a refactor that will allow migrating the objects to match the post-LB distribution without data specifying the starting distribution? I want to make sure the LB data file I give Arek with my LB config file will have the correct content.

Would the specification you mentioned appear in the LB config file? What about something like 100 OfflineLB phase=1, which would run OfflineLB on phase 100, making it match the object distribution from phase 1 of the input file? The user would probably have to avoid patterns like %100 OfflineLB when they have sparse data unless the most recent phase is automatically used.
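
The two options in this comment (an explicit `phase=` argument versus falling back to the most recent phase with data) can be sketched roughly as follows. This is only an illustration of the idea under discussion, not existing vt behavior. `resolve_phase` and `source_phase` are hypothetical names, and the list of available phases stands in for whatever a sparse LB data file would contain.

```python
from bisect import bisect_right
from typing import Iterable, List, Optional


def resolve_phase(sorted_phases: List[int], requested: int) -> Optional[int]:
    """Return the most recent phase <= requested that has data, or None if there is none."""
    idx = bisect_right(sorted_phases, requested)
    return sorted_phases[idx - 1] if idx > 0 else None


def source_phase(run_phase: int,
                 available_phases: Iterable[int],
                 explicit: Optional[int] = None) -> Optional[int]:
    """Pick which phase of the sparse input file OfflineLB should reproduce.

    explicit models a hypothetical 'phase=N' argument on the OfflineLB line;
    without it, the most recent phase at or before run_phase is used.
    """
    phases = sorted(set(available_phases))
    if explicit is not None:
        if explicit not in phases:
            raise KeyError(f"no data for phase {explicit} in the input file")
        return explicit
    return resolve_phase(phases, run_phase)


# A sparse file that only holds distributions for phases 0, 1 and 50.
sparse_phases = [0, 1, 50]
explicit_choice = source_phase(100, sparse_phases, explicit=1)   # "100 OfflineLB phase=1"
fallback_choice = source_phase(100, sparse_phases)               # "100 OfflineLB"
```

With the fallback rule, a pattern like `%100 OfflineLB` would stay usable on sparse data, because every matching phase resolves to the latest stored distribution instead of requiring an exact entry.
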

@lifflander
Collaborator Author

After some discussion with @nlslatt, we are going to take this in a slightly different direction and specify the LB data file more clearly so that it is not ambiguous. I'm opening a new issue to implement first, #2130, which will allow users to specify when phases are skipped or identical to the previous phase.

cwschilly pushed a commit that referenced this issue Sep 20, 2024