Refactor v0.2.0: Read if you want to use the Flux Operator! #211

Closed
vsoch opened this issue Dec 10, 2023 · 0 comments · Fixed by #208

vsoch commented Dec 10, 2023

We have a work-in-progress pull request that makes several improvements to the Flux Operator, and if you are using the operator for the first time, we recommend you use this version. We are not merging it yet because we are waiting on a paper that documents the previous version.

Usage

x86

The refactor branch comes with its own manifest (and automated build) that references a container built for it. To install it, run:

kubectl apply -f https://raw.githubusercontent.com/flux-framework/flux-operator/test-refactor-modular/examples/dist/flux-operator-refactor.yaml

ARM

We also have an ARM image ready for you:

kubectl apply -f https://raw.githubusercontent.com/flux-framework/flux-operator/test-refactor-modular/examples/dist/flux-operator-refactor-arm.yaml

Please ping us here if there are any issues.
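
If one setup script has to serve both x86 and ARM clusters, the choice between the two manifests above can be scripted. This is a minimal sketch, not something shipped with the operator: the function name flux_operator_manifest is hypothetical, and the architecture strings it matches (x86_64/amd64, aarch64/arm64) are assumptions about what you will pass in or what uname -m reports.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Flux Operator): print the refactor
# manifest URL that matches a machine architecture string.
flux_operator_manifest() {
    base="https://raw.githubusercontent.com/flux-framework/flux-operator/test-refactor-modular/examples/dist"
    # Fall back to the local machine's architecture when no argument is given
    arch="${1:-$(uname -m)}"
    case "$arch" in
        x86_64|amd64)  echo "$base/flux-operator-refactor.yaml" ;;
        aarch64|arm64) echo "$base/flux-operator-refactor-arm.yaml" ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
}
```

It could then be used as kubectl apply -f "$(flux_operator_manifest arm64)". Passing the architecture explicitly is safer than relying on the default, since the machine running kubectl may not match the architecture of the cluster nodes.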

Improvements

Improvements in this update include:

  • A new modular design that does not require Flux to be installed in the application container. Instead, we use init containers to add Flux as a spack view on the fly. The view is set up and added to an emptyDir (shared with the application container), and then the init container goes away.
  • Improvements to connecting to a Flux broker. Previously, you had to shell in, figure out the location of the socket, and then issue a long, confusing command to connect to it. Now you source a file and then simply connect using an environment variable that holds the socket (see example below).
  • The code organization is improved, with operator (Kubernetes) logic going under controllers/flux and Flux logic that is unrelated to Kubernetes under pkg/flux. Previously everything was under controllers/flux.
  • All examples are refactored to use the default namespace, omitting the need to create the flux-operator namespace.
  • Having the Flux Operator attempt to create volumes for the user added unneeded complexity. Instead, we now just accept existing volumes (of really any type) and provide examples with pv.yaml / pvc.yaml for all previous examples that warranted volumes.
  • Flux Restful is no longer tangled with the operator, but is provided as an example (where the logic is separate).
  • For functionality like the Flux Operator v1alpha1 (with Flux packaged inside the application container), you can set flux->container->disable to true. This tells the operator not to expect the view, so the Flux in your container is found instead. This is recommended for ARM and similar cases, for which spack views cannot work (see "should arm builds work?", spack/spack#41708).
  • Users can now disable affinity rules.
  • The init container (which has Flux) can also take a resource spec, so that pods that need Guaranteed QoS (e.g., for assigning more than one pod per node) can get it. An example is added for this.
  • Flux now runs as root. Running as the flux user added a lot of complexity (e.g., the munge key, and a lot of customization to go back to root to set up volumes), and I (vsoch) decided that this early in development we weren't getting any benefit from making life harder for ourselves by doing a "proper" HPC cluster that runs as the flux user. The container was starting as root anyway, and giving the running user the power to run commands, so I'm not sure we lost much. The interactions and setup are much simpler now.
  • Support for requesting that a worker completes when it is killed. This is intended for a controlled downsize case.

To connect to the broker:

source /mnt/flux/flux-view.sh
flux proxy $fluxsocket bash
flux resource list
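
In scripts, it can help to fail early when the view is not mounted or the socket variable is missing. The following is a minimal sketch that assumes only what the commands above show, namely that sourcing /mnt/flux/flux-view.sh sets a fluxsocket variable. The function name load_flux_view is hypothetical and is not provided by the operator.

```shell
#!/bin/sh
# Hypothetical helper: source the flux view script in the current shell and
# check that it provided the broker socket, failing early if it did not.
load_flux_view() {
    view_script="${1:-/mnt/flux/flux-view.sh}"
    if [ ! -f "$view_script" ]; then
        echo "flux view script not found: $view_script" >&2
        return 1
    fi
    # Assumption (from the commands above): sourcing the script sets $fluxsocket
    . "$view_script"
    if [ -z "${fluxsocket:-}" ]; then
        echo "fluxsocket is not set after sourcing $view_script" >&2
        return 1
    fi
}
```

A script could then run load_flux_view && flux proxy "$fluxsocket" bash. The function is called directly rather than inside $(...) so that whatever the view script exports stays in the calling shell.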

This issue will be updated with further changes.

@vsoch vsoch pinned this issue Dec 10, 2023
@vsoch vsoch changed the title Flux Operator Refactor v0.2.0: Read if you want to use the Operator! Refactor v0.2.0: Read if you want to use the Operator! Dec 10, 2023
@vsoch vsoch changed the title Refactor v0.2.0: Read if you want to use the Operator! Refactor v0.2.0: Read if you want to use the Flux Operator! Dec 10, 2023
@vsoch vsoch linked a pull request Dec 10, 2023 that will close this issue
@vsoch vsoch unpinned this issue Mar 15, 2024