Add preflight OS, CPU, RAM, Swap, and Filesystem checks (backport #326) #329

Open: wants to merge 1 commit into base branch squid

mergify bot commented Feb 25, 2025

  • Implemented OS, NIC, and other preflight checks that validate system requirements before Ceph cluster creation.

    • Checks include:
      • OS version (RHEL 9+ required)
      • SELinux enforcing mode
      • Firewalld installation and status
      • Required package availability (rpcbind, podman, firewalld)
      • Podman version (>= 3.3)
      • RHEL software profile validation
      • Tuned profile check
      • CPU, RAM, swap, and filesystem (part of the "other" checks)
      • Whether jumbo frames are enabled
      • Whether each NIC is configured with DHCP or a static IP
      • Whether the NIC bandwidth is sufficient
      • Current NIC options (e.g. bonding, not bridged or virtual), collected and reported
      • Network latency (ping) to all hosts in the inventory file, measured and reported
      • Separate NICs for front-end and back-end networks
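
For reviewers who want to reason about the "Podman version (>= 3.3)" item above, here is a minimal Python sketch of a numeric version comparison. It is not the playbook's implementation (the PR does this in Ansible tasks); the function names `parse_version` and `podman_meets_minimum` are hypothetical, and the input format is assumed to be the `podman version X.Y.Z` line printed by `podman --version`. The point it illustrates is that versions should be compared as integer tuples, since a plain string comparison would rank "3.10" below "3.3".

```python
import re

# Minimum taken from the check list above (>= 3.3).
MIN_PODMAN = (3, 3)


def parse_version(text):
    """Return the first dotted number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def podman_meets_minimum(version_text, minimum=MIN_PODMAN):
    """Hypothetical helper: True if the parsed version is >= minimum."""
    return parse_version(version_text) >= minimum


if __name__ == "__main__":
    for sample in ("podman version 4.9.4", "podman version 3.10.1", "podman version 3.2.9"):
        print(sample, "->", podman_meets_minimum(sample))
```

Tuple comparison is lexicographic over integers, so (3, 10, 1) passes and (3, 2, 9) does not.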

Enhancements:

❯ ansible-playbook -i ~/ansible-inventory/inventory.ini cephadm-preflight.yml

PLAY [insecure_registries] *******************************************************************************************************************************************************************************************************************

TASK [fail if insecure_registry is undefined] ************************************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [preflight] *****************************************************************************************************************************************************************************************************************************

TASK [fail when ceph_origin is custom with no repository defined] ****************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

TASK [fail if baseurl is not defined for ceph_custom_repositories] ***************************************************************************************************************************************************************************
skipping: [rhel-ceph-admin]

PLAY [Preflight Checks for Ceph Deployment] **************************************************************************************************************************************************************************************************

TASK [Initialize preflight results list] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Collect installed package facts] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check if OS is RHEL 9+] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ensure SELinux is set to Enforcing mode] ***********************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Check Result] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine SELinux Failure Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Check Result] *******************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine Package Installation Failure Reason] *****************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Fetch Firewalld status] ****************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Extract Podman version if installed] ***************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Determine if Podman meets version requirement (>=3.3)] *********************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Validate RHEL software profile] ********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Result] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define RHEL Profile Check Reason] ******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Get current tuned profile] *************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Result] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define Tuned Profile Check Reason] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Check CPU x86-64-v2 support] ***********************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Define CPU, RAM, Swap, and Filesystem Check Variables] *********************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Ping all hosts in inventory to measure latency] ****************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin] => (item=rhel-ceph-admin)

TASK [Define networking facts] ***************************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Store all preflight check results] *****************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Generate preflight check report file] **************************************************************************************************************************************************************************************************
changed: [rhel-ceph-admin -> localhost]

TASK [Load the preflight check report] *******************************************************************************************************************************************************************************************************
ok: [rhel-ceph-admin]

TASK [Final Check - Fail if any critical checks failed] **************************************************************************************************************************************************************************************
fatal: [rhel-ceph-admin]: FAILED! => changed=false 
  msg: 'Preflight checks failed for the following: Tuned Profile, RHEL Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames Enabled, NIC Static IP Configuration, NIC Bandwidth. Please resolve these issues before proceeding.'

PLAY RECAP ***********************************************************************************************************************************************************************************************************************************
rhel-ceph-admin            : ok=25   changed=1    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0  

=================================================================================================
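
The final task in the run above fails with a single message that lists every failed check, while INFO-only rows (NIC Configuration, Network Latency) are left out. A rough Python sketch of that aggregation follows. The record layout (`name`, `status`, `critical`) and the function name `failure_message` are assumptions made for illustration; the playbook itself builds this in Ansible and may structure its results differently.

```python
def failure_message(results):
    """Return the failure message for critical failed checks, or None if all passed.

    `results` is a list of dicts with hypothetical keys:
    name (str), status ("passed" | "failed" | "info"), critical (bool).
    """
    failed = [r["name"] for r in results if r["critical"] and r["status"] == "failed"]
    if not failed:
        return None
    return (
        "Preflight checks failed for the following: "
        + ", ".join(failed)
        + ". Please resolve these issues before proceeding."
    )


if __name__ == "__main__":
    sample = [
        {"name": "OS Version", "status": "passed", "critical": True},
        {"name": "Tuned Profile", "status": "failed", "critical": True},
        {"name": "NIC Configuration", "status": "info", "critical": False},
        {"name": "Minimum RAM", "status": "failed", "critical": True},
    ]
    print(failure_message(sample))
```

Collecting all failures before failing once lets an operator fix everything in a single pass instead of rerunning the playbook after each error.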

❯ cat preflight_report.txt
==================================================
               **  Preflight Check Report **
==================================================

 System Checks
--------------------------------------------------
- OS Version: ✅ Passed

- Tuned Profile: ❌ Failed
    - Reason: Incorrect tuned profile. Expected: throughput-performance

- RHEL Profile: ❌ Failed
    - Reason: Incorrect RHEL software profile. Expected: Server with File and Storage Server.

- Firewalld Running: ✅ Passed

- Podman Installed: ✅ Passed

- SELinux: ✅ Passed

- Required Packages Installed: ✅ Passed

- Minimum RAM (8GB): ❌ Failed
    - Reason: System has only 7684 MB RAM, required: 8192MB

- Swap Space (1.5x RAM): ❌ Failed
    - Reason: System has only 5119 MB Swap, required: 11526 MB

- CPU x86-64-v2: ✅ Passed

- CPU Cores >= 4: ✅ Passed

- /var is a separate partition: ❌ Failed
    - Reason: /var is not a separate partition

- Root Filesystem >= 100GB: ❌ Failed
    - Reason: Root FS is only 43GB, required: 100GB

- NIC Configuration: ℹ️ INFO
    - Reason: Available network interfaces: ens3 | Speeds (Mbps): -1

- Jumbo Frames Enabled: ❌ Failed
    - Reason: MTU is 1500, recommended > 1500

- NIC Static IP Configuration: ❌ Failed
    - Reason: NIC is using DHCP, static IP is recommended

- NIC Bandwidth (10GbE Recommended): ❌ Failed
    - Reason: NIC speed is -1 Mbps, recommended is 10GbE

- Network Latency: ℹ️ INFO
    - Reason: Average latency (ms): ['0.111']

==================================================
** Summary **
--------------------------------------------------
❌ Critical Failures Detected:
   - Tuned Profile, RHEL Profile, Minimum RAM, Swap Space, /var Partition, Root Filesystem, Jumbo Frames Enabled, NIC Static IP Configuration, NIC Bandwidth

** Action Required: Please resolve these issues before proceeding.
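
The numeric thresholds in the report above can be checked by hand. The sketch below redoes the arithmetic in Python using only the values printed in the report (7684 MB RAM, 5119 MB swap, 43 GB root filesystem, MTU 1500, NIC speed -1 Mbps). The constant names and the `evaluate` function are hypothetical, and 10GbE is taken as 10000 Mbps. A speed of -1 usually means the driver did not report a link speed, as can happen on virtual NICs; applying that reading to this host is an assumption.

```python
# Thresholds as stated in the report rows.
RAM_MIN_MB = 8192            # "Minimum RAM (8GB)"
SWAP_FACTOR = 1.5            # "Swap Space (1.5x RAM)"
ROOT_MIN_GB = 100            # "Root Filesystem >= 100GB"
JUMBO_MTU_FLOOR = 1500       # report recommends MTU > 1500
NIC_RECOMMENDED_MBPS = 10_000  # assumption: 10GbE expressed in Mbps


def required_swap_mb(ram_mb):
    """Swap requirement derived from installed RAM."""
    return int(ram_mb * SWAP_FACTOR)


def evaluate(ram_mb, swap_mb, root_gb, mtu, nic_mbps):
    """Return {check name: passed?} for the numeric checks."""
    return {
        "Minimum RAM": ram_mb >= RAM_MIN_MB,
        "Swap Space": swap_mb >= required_swap_mb(ram_mb),
        "Root Filesystem": root_gb >= ROOT_MIN_GB,
        "Jumbo Frames Enabled": mtu > JUMBO_MTU_FLOOR,
        # A negative (unknown) speed can never satisfy the recommendation.
        "NIC Bandwidth": nic_mbps >= NIC_RECOMMENDED_MBPS,
    }


if __name__ == "__main__":
    print("required swap (MB):", required_swap_mb(7684))
    for name, passed in evaluate(7684, 5119, 43, 1500, -1).items():
        print(f"{name}: {'passed' if passed else 'failed'}")
```

With the reported values the swap requirement comes out to 11526 MB, matching the report, and all five checks fail.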

❯ pwd
/home/kushaldeb/Github/cephadm-ansible/reports

░▒▓ ~/Github/cephadm-ansible/reports  on implement_os_preflight_checks *1 

❯ ls -l
total 4
-rw-r--r--. 1 kushaldeb kushaldeb 1872 Feb 24 22:04 rhel-ceph-admin_preflight_report.txt



This is an automatic backport of pull request #326 done by [Mergify](https://mergify.com).

mergify bot commented Feb 25, 2025

Cherry-pick of f4833f4 has failed:

On branch mergify/bp/squid/pr-326
Your branch is up to date with 'origin/squid'.

You are currently cherry-picking commit f4833f4.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   ceph_defaults/defaults/main.yml
	new file:   checks.yml
	new file:   rhel-checks.yml
	new file:   templates/preflight_report.j2

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   cephadm-preflight.yml

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot added the conflicts label Feb 25, 2025
guits force-pushed the mergify/bp/squid/pr-326 branch from 94bfcc3 to 331269a, February 25, 2025 16:17
guits removed the conflicts label Feb 25, 2025
- Implemented OS preflight checks to validate system requirements before Ceph cluster creation.
- Checks include:
  - OS version (RHEL 9+ required)
  - SELinux enforcing mode
  - Firewalld installation and status
  - Required package availability (rpcbind, podman, firewalld)
  - Podman version check (>= 3.3)
  - RHEL software profile validation
  - Tuned profile check
  - CPU, RAM, Swap, and Filesystem (part of other checks)
  - Check whether jumbo frames are enabled
  - Is it configured with DHCP or static IP
  - Is the bandwidth sufficient
  - Collect and output current NIC options set (e.g. Bonding, not bridged or virtual)
  - Check and report network latency (ping) with all hosts provided in the inventory file
  - Listing all NICs

Signed-off-by: Kushal Deb <Kushal.Deb@ibm.com>
(cherry picked from commit f4833f4)
guits force-pushed the mergify/bp/squid/pr-326 branch from 331269a to c07eb50, February 25, 2025 16:18