For installing and configuring SLURM - Simple Linux Utility for Resource Management

fgci-org/ansible-role-slurm

ansible-role-slurm

Creates a SLURM cluster

Tested with these Linux distributions:

  • CentOS 7
  • Ubuntu
    • 18.04 (client only)

The role goes to some lengths to be backwards compatible.

For example, to add default settings to slurm.conf for only some Slurm versions, add slurm_conf_version_specific_params_list to the entry for that version in vars/slurm_version.yml.

Dependencies

How-To

Initial playbook configuration

Playbook variables:

All variables should be defined in defaults/main.yml

All nodes run munge. Nodes in the slurm_compute host group additionally run slurmd. Nodes in the slurm_service host group additionally run slurmctld, and also slurmdbd unless {{ slurm_accounting_storage_host }} differs from {{ slurm_service_node }}. Nodes in neither of these two host groups are assumed to be submit hosts.
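As a rough illustration of the host groups described above, a YAML inventory might look like the sketch below. The host names and the login group name are hypothetical; only slurm_service and slurm_compute come from this role.

```yaml
# Hypothetical inventory sketch; all host names are placeholders.
all:
  children:
    slurm_service:        # runs munge, slurmctld and (by default) slurmdbd
      hosts:
        ctl1.example.org:
    slurm_compute:        # runs munge and slurmd
      hosts:
        node01.example.org:
        node02.example.org:
    login:                # assumed group name; in neither Slurm group, so treated as submit hosts
      hosts:
        login1.example.org:
```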

You also need to define a mysql_slurm_password: "PASSWORD" variable somewhere in your playbook variables. It sets the password for the slurm MySQL user. Because it is a secret, consider keeping it in an encrypted vault file; see http://docs.ansible.com/ansible/playbooks_vault.html
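One possible way to keep the password out of plain text is a small variables file encrypted with ansible-vault. The file path below is an assumption for illustration, not something this role requires.

```yaml
# group_vars/all/vault.yml  (hypothetical path)
# Encrypt it before committing, for example:
#   ansible-vault encrypt group_vars/all/vault.yml
# and run the playbook with --ask-vault-pass.
mysql_slurm_password: "CHANGE_ME"   # placeholder value
```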

To add your own nodes and queues, define the slurm_nodelist and slurm_partitionlist lists.
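A minimal sketch of these two lists follows. It assumes each entry is a raw slurm.conf line, which is not confirmed here; check defaults/main.yml for the format the role actually expects. The node names and hardware values are placeholders.

```yaml
# Assumption: each list entry is written verbatim into slurm.conf.
slurm_nodelist:
  - "NodeName=node[01-02] CPUs=16 RealMemory=64000 State=UNKNOWN"

slurm_partitionlist:
  - "PartitionName=normal Nodes=node[01-02] Default=YES MaxTime=INFINITE State=UP"
```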

You don't have to set up a SQL server with this role. That part is included for convenience, for when you want to run the SQL server on the same node as slurmctld and slurmdbd. slurmdbd can also run on a different host than slurmctld; to do that, change the slurm_accounting_storage_host variable.
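For instance, splitting slurmdbd onto its own host could look roughly like this. Both host names are hypothetical.

```yaml
# Placeholder host names; slurmctld stays on the service node,
# slurmdbd moves to the accounting storage host.
slurm_service_node: "ctl1.example.org"
slurm_accounting_storage_host: "db1.example.org"
```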

It is also possible to set up a backup Slurm controller by defining the slurm_backup_controller variable. Please read the SLURM HA documentation first: for example, you'll need a shared directory (such as an NFS export) available on both the slurm_service_node and the slurm_backup_controller.
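A sketch of the corresponding variables, assuming the value is simply the backup host's name (host names are placeholders, and the shared state directory still has to be provided separately):

```yaml
slurm_service_node: "ctl1.example.org"        # primary controller (placeholder)
slurm_backup_controller: "ctl2.example.org"   # backup controller (placeholder)
```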

Implementation

A playbook that uses this role: https://github.com/fgci-org/fgci-ansible

Or you can check out the tests/test.yml in this repo.

Example Playbook

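    # Compute nodes: the PAM and NHC roles are applied before the Slurm role.
    - hosts: compute
      strategy: free
      roles:
        - { role: ansible-role-pam }
        - { role: ansible-role-nhc }
        - { role: ansible-role-slurm }

    # Install host: only the Slurm role is applied.
    - hosts: install
      roles:
        - { role: ansible-role-slurm }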

Known Issues

Testing and contributions

Testing is done with Travis.

  • Open PRs against master.
  • If possible, make sure that the new feature is also tested.
  • Strive for backwards compatibility.

Authors / Contributors: