Add preflight OS, CPU, RAM, Swap, and Filesystem checks #326
base: devel
Please avoid using ignore_errors: true as much as possible.
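One possible way to follow this advice is sketched below: register the command result and declare explicitly which return codes count as a real failure, instead of ignoring every error. The task itself is hypothetical and not taken from the PR; only the `firewalld_check` and `firewalld_reason` names appear elsewhere in this conversation. The return codes assume the exit status table documented for `systemctl` (0 active, 3 not active, 4 no such unit).

```yaml
# Illustrative only: a check task that never needs ignore_errors.
- name: Query firewalld state
  ansible.builtin.command: systemctl is-active firewalld
  register: firewalld_status
  changed_when: false          # a read-only query never changes the host
  # "inactive" (3) and "no such unit" (4) are check results, not task errors;
  # any other non-zero code still fails the task loudly.
  failed_when: firewalld_status.rc not in [0, 3, 4]

- name: Record the firewalld check result
  ansible.builtin.set_fact:
    firewalld_check: "{{ 'PASS' if firewalld_status.rc == 0 else 'FAIL' }}"
    firewalld_reason: >-
      {{ 'firewalld is active' if firewalld_status.rc == 0
         else 'firewalld is not active (rc=' ~ firewalld_status.rc ~ ')' }}
```

With this shape, an unexpected failure (for example, `systemctl` missing) still stops the play, while an expected negative result is simply recorded as FAIL.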
jenkins test el9-functional
@Kushal-deb Please consider using a single task to store all check results:

    - name: Store all check results
      set_fact:
        preflight_results: >-
          {{ preflight_results + [
            {'Check': 'OS Version', 'Result': os_check, 'Reason': os_reason},
            {'Check': 'Tuned Profile', 'Result': tuned_profile_check, 'Reason': tuned_profile_reason},
            {'Check': 'RHEL Profile', 'Result': rhel_profile_check, 'Reason': rhel_profile_reason},
            {'Check': 'Firewalld Running', 'Result': firewalld_check, 'Reason': firewalld_reason},
            {'Check': 'Podman Installed', 'Result': podman_check, 'Reason': podman_reason},
            {'Check': 'SELinux', 'Result': selinux_check, 'Reason': selinux_reason},
            {'Check': 'Minimum RAM (8GB)', 'Result': memory_checks['ram']['result'], 'Reason': memory_checks['ram']['reason']},
            {'Check': 'Swap Space (1.5x RAM)', 'Result': memory_checks['swap']['result'], 'Reason': memory_checks['swap']['reason']},
            {'Check': 'CPU x86-64-v2', 'Result': cpu_checks['x86_64_v2']['result'], 'Reason': cpu_checks['x86_64_v2']['reason']},
            {'Check': 'CPU Cores >= 4', 'Result': cpu_checks['cores']['result'], 'Reason': cpu_checks['cores']['reason']},
            {'Check': '/var is a separate partition', 'Result': filesystem_checks['var_partition']['result'], 'Reason': filesystem_checks['var_partition']['reason']},
            {'Check': 'Root Filesystem >= 100GB', 'Result': filesystem_checks['root_fs']['result'], 'Reason': filesystem_checks['root_fs']['reason']},
            {'Check': 'Jumbo Frames Enabled', 'Result': jumbo_frames_check, 'Reason': jumbo_frames_reason},
            {'Check': 'Network Latency', 'Result': 'INFO', 'Reason': 'Latency results: ' ~ ping_results.results | map(attribute='ping') | list},
            {'Check': 'NIC Static IP Configuration', 'Result': nic_config_check, 'Reason': nic_config_reason},
            {'Check': 'NIC Bandwidth (10GbE Recommended)', 'Result': nic_speed_check, 'Reason': nic_speed_reason},
          ] }}
        preflight_failures: >-
          {{ preflight_failures
            + (['OS Version'] if os_check == 'FAIL' else [])
            + (['Tuned Profile'] if tuned_profile_check == 'FAIL' else [])
            + (['RHEL Profile'] if rhel_profile_check == 'FAIL' else [])
            + (['SELinux'] if selinux_check == 'FAIL' else [])
            + (['Firewalld Running'] if firewalld_check == 'FAIL' else [])
            + (['Podman Installed'] if not podman_installed else [])
            + (['Minimum RAM'] if memory_checks['ram']['result'] == 'FAIL' else [])
            + (['Swap Space'] if memory_checks['swap']['result'] == 'FAIL' else [])
            + (['CPU x86-64-v2'] if cpu_checks['x86_64_v2']['result'] == 'FAIL' else [])
            + (['CPU Cores'] if cpu_checks['cores']['result'] == 'FAIL' else [])
            + (['/var Partition'] if filesystem_checks['var_partition']['result'] == 'FAIL' else [])
            + (['Root Filesystem'] if filesystem_checks['root_fs']['result'] == 'FAIL' else []) }}
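As a possible follow-up to this suggestion, the failure list could be derived from the stored results rather than maintained as a second hand-written list. This is only a sketch, assuming `preflight_results` has already been populated with the `Check`/`Result`/`Reason` dictionaries shown above. Note that the failure names would then match the full `Check` labels (for example `Minimum RAM (8GB)` rather than `Minimum RAM`).

```yaml
# Sketch: compute failures from the results list instead of repeating every check.
- name: Derive the failure list from the stored results
  ansible.builtin.set_fact:
    preflight_failures: >-
      {{ preflight_results
         | selectattr('Result', 'equalto', 'FAIL')
         | map(attribute='Check')
         | list }}
```

The trade-off is that every check has to report the literal string `FAIL` in its `Result` field, which the checks in the list above already appear to do.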
@Kushal-deb What happens if you have more than one node in your inventory?
At the end of the playbook, the report seems to be generated only once, and only for the first node.
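For illustration, one way a single summary task could still cover every node is to loop over the hosts in the play and read each host's facts through `hostvars`. This is a hypothetical sketch, not the PR's code; it assumes each host has set a `preflight_failures` fact as in the earlier suggestion.

```yaml
# Sketch: run once, but report on every host in the play.
- name: Summarize preflight failures for all hosts
  ansible.builtin.debug:
    msg: >-
      {{ item }}:
      {{ hostvars[item].preflight_failures | default([]) | join(', ') | default('none', true) }}
  loop: "{{ ansible_play_hosts }}"
  run_once: true
```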
Hi, I have updated the implementation to generate the result files on the controller node, under the reports directory.
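A rough sketch of how per-host result files might be written on the controller follows. The directory location relative to `playbook_dir`, the JSON format, and the `preflight_<hostname>.json` naming are all assumptions made for illustration; the PR's actual tasks and file layout may differ.

```yaml
# Sketch: one report file per inventory host, written on the controller.
- name: Ensure the reports directory exists on the controller
  ansible.builtin.file:
    path: "{{ playbook_dir }}/reports"
    state: directory
    mode: "0755"
  delegate_to: localhost
  run_once: true

- name: Write this host's preflight results to the controller
  ansible.builtin.copy:
    content: "{{ preflight_results | to_nice_json }}\n"
    dest: "{{ playbook_dir }}/reports/preflight_{{ inventory_hostname }}.json"
    mode: "0644"
  delegate_to: localhost       # runs for every host, but writes locally
```

Because the second task is not marked `run_once`, it executes once per host, so a multi-node inventory yields one file per node.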
- Implemented OS preflight checks to validate system requirements before Ceph cluster creation.
- Checks include:
  - OS version (RHEL 9+ required)
  - SELinux enforcing mode
  - Firewalld installation and status
  - Required package availability (rpcbind, podman, firewalld)
  - Podman version check (>= 3.3)
  - RHEL software profile validation
  - Tuned profile check
  - CPU, RAM, Swap, and Filesystem (part of other checks)
  - Check whether jumbo frames are enabled
  - Is it configured with DHCP or static IP
  - Is the bandwidth sufficient
  - Collect and output current NIC options set (e.g. Bonding, not bridged or virtual)
  - Check and report network latency (ping) with all hosts provided in the inventory file
  - Listing all NICs

Signed-off-by: Kushal Deb <Kushal.Deb@ibm.com>
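To make the RAM and swap items concrete, a minimal sketch based on gathered facts might look like the following. The thresholds and the `min_ram_mb` / `swap_factor` variable names are assumptions; only the `memory_checks['ram']` and `memory_checks['swap']` structure is taken from the review discussion. It requires fact gathering to be enabled.

```yaml
# Sketch: evaluate RAM and swap from Ansible facts (thresholds are assumed).
- name: Evaluate RAM and swap checks
  vars:
    ram_mb: "{{ ansible_facts['memtotal_mb'] | int }}"
    swap_mb: "{{ ansible_facts['swaptotal_mb'] | int }}"
    min_ram_mb: 8192     # assumed value for "8GB"; memtotal_mb may report
                         # slightly less than the installed RAM
    swap_factor: 1.5     # swap should be at least 1.5x RAM
  ansible.builtin.set_fact:
    memory_checks:
      ram:
        result: "{{ 'PASS' if ram_mb | int >= min_ram_mb | int else 'FAIL' }}"
        reason: "{{ ram_mb }} MB RAM detected, {{ min_ram_mb }} MB required"
      swap:
        result: >-
          {{ 'PASS' if swap_mb | int >= (ram_mb | int) * swap_factor | float
             else 'FAIL' }}
        reason: "{{ swap_mb }} MB swap detected for {{ ram_mb }} MB RAM"
```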
Implemented OS, NIC, and other preflight checks to validate system requirements before Ceph cluster creation.
Enhancements:
https://tracker.ceph.com/issues/69726
https://tracker.ceph.com/issues/69781
https://tracker.ceph.com/issues/69843
Logs: