From 1af57e0e2c6dd7a7f5d11753a1cdf661750f8f3d Mon Sep 17 00:00:00 2001 From: Meike Chabowski Date: Wed, 29 Nov 2023 18:04:44 +0100 Subject: [PATCH] Implemented enhancements from doc review Fixed wording, typos, punctuation etc. following documentation style guide. --- adoc/SLES4SAP-HANAonKVM-15SP4.adoc | 197 ++++++++++++++--------------- 1 file changed, 98 insertions(+), 99 deletions(-) diff --git a/adoc/SLES4SAP-HANAonKVM-15SP4.adoc b/adoc/SLES4SAP-HANAonKVM-15SP4.adoc index 313339c5..b02d3aa3 100644 --- a/adoc/SLES4SAP-HANAonKVM-15SP4.adoc +++ b/adoc/SLES4SAP-HANAonKVM-15SP4.adoc @@ -74,13 +74,13 @@ Scale-Out:: For an SAP HANA Scale-Out deployment, distributed over multiple VMs == Supported scenarios and prerequisites Follow the *{DocumentName} - {sles4sap} {slesProdVersion}* - document at hand which describes the steps necessary + document at hand. It describes the steps necessary to create a supported SAP HANA on KVM configuration. - The use of {sles4sap} is recommended in this scenario, however its also valid to use {sles} as both hypervisor and VM guest operating system. + The use of {sles4sap} is recommended in this scenario. However, it is also valid to use {sles} as both hypervisor and VM guest operating system. Inquiries about scenarios not listed here should be directed to mailto:saphana@suse.com[saphana@suse.com]. -Please be informed that the XML examples in this guide will deal with a Single-VM scenario. +Keep in mind that the XML examples in this guide will deal with a single-VM scenario. [[_sec_supported_scenarios]] === Supported scenarios @@ -152,7 +152,7 @@ It is however *not* required to dedicate CPUs to the hypervisor. [[_sec_memory_sizing]] ==== Memory sizing -Since SAP HANA runs inside the VM, it is the RAM size of the VM which needs to satisfy the memory requirements from the SAP HANA Memory sizing. +Since SAP HANA runs inside the VM, it is the RAM size of the VM which needs to satisfy the memory requirements of the SAP HANA Memory sizing. The memory used by the VM must be smaller than the physical memory of the machine. It is recommended to reserve at least 8% of the total memory reported by "`/proc/meminfo`" (in the "`MemTotal`" field) for the hypervisor. @@ -163,12 +163,12 @@ See <<_sec_memory_backing>> for more details. [[_sec_cpu_sizing]] ==== CPU sizing -Thorough test of the configuration for the required workload is highly recommended before a "`go live`", as workload tests have shown that certain scenarios can generate CPU overhead of up to 20%. +Workload tests have shown that certain scenarios can generate CPU overhead of up to 20%. Thus, thorough testing of the configuration for the required workload is highly recommended before a "`go-live`". There are two main ways to deal with CPU sizing from a sizing perspective: -1. Follow the fixed memory-to-core ratios for SAP HANA as defined by SAP -2. Follow the SAP HANA TDI "`Phase 5`" rules as defined by SAP +1. Follow the fixed memory-to-core ratios for SAP HANA as defined by SAP. +2. Follow the SAP HANA TDI "`Phase 5`" rules as defined by SAP. Both ways are described in the following sections. @@ -181,9 +181,9 @@ The relevant memory-to-core ratio required to size a VM can be easily calculated * Go to the https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/#/solutions["SAP HANA Certified Hardware Directory"]. * Select the required SAP HANA Appliance and Appliance Type (for example CPU Architecture "Intel Cascade Lake SP" for Appliance Type "Scale-up: BWoH"). 
* Look for the largest certified RAM size for the number of CPU Sockets on the server (for example 6 TiB/6144 GiB on 4-Socket). -* Look up the number of cores per CPU of this CPU Architecture used in SAP HANA Appliances. The CPU model numbers are shown after clicking on the 'Read more' button in the details view of a selected SAP HANA Appliance. To get the amount of cores of a specific CPU model you can query the product databases of the respective manufacturers (for example https://ark.intel.com["Intel"]) -* Using the above values calculate the total number of cores on the certified Appliance by multiplying number of sockets by number of cores (for example 4x28=112). -* Now divide the Appliance RAM by the total number of cores (not hyperthreads) to give you the *memory-to-core* ratio (for example 3072 GiB/112 = approx. 28 GiB per core). +* Look up the number of cores per CPU of this CPU Architecture used in SAP HANA Appliances. The CPU model numbers are shown in the details view of a selected SAP HANA Appliance after clicking the 'Read more' button. To get the amount of cores of a specific CPU model, you can query the product databases of the respective manufacturers (for example https://ark.intel.com["Intel"]) +* Using the above values calculates the total number of cores on the certified appliance by multiplying the number of sockets by the number of cores (for example 4x28=112). +* Now divide the appliance RAM by the total number of cores (not hyperthreads) to get the *memory-to-core* ratio (for example 3072 GiB/112 = approx. 28 GiB per core). <<_sap_hana_core_to_memory_ratio_examples>> below has some current examples of SAP HANA memory-to-core ratios. @@ -205,25 +205,25 @@ The relevant memory-to-core ratio required to size a VM can be easily calculated From your memory requirement, calculate the RAM size the VM needs to be compliant with the appropriate memory-to-core ratio defined by SAP. -* To get the memory per socket, multiply the memory-to-core ratio by the number of cores (not threads) of a single socket in your host -* Divide the memory requirement by the memory per socket, and round the result up to the next full number, and multiply that number by the memory per socket again +* To get the memory per socket, multiply the memory-to-core ratio by the number of cores (not threads) of a single socket in your host. +* Divide the memory requirement by the memory per socket, and round the result up to the next full number. Multiply again that number by the memory per socket. .Calculation example ==== * From an S/4HANA sizing you get a memory requirement for SAP HANA of 2000 GiB. * Your CPUs have 28 cores per socket. The memory per socket is `28 cores * 54.86 GiB/core = 1536 GiB`. -* Divide your memory requirement `2000 GiB / 1536 GiB = 1.2987` and round this result up to 2. Then multiply `2 * 1536 GiB = 3072 GiB` -* 3072 GiB is now the memory size to use in the VM configuration as described in <<_sec_memory_backing>> -* On a machine that has four sockets and a total of 6 TiB of memory this VM would span over two sockets, leaving two sockets for another VM running SAP HANA with a similar workload or different VM's running other workloads +* Divide your memory requirement `2000 GiB / 1536 GiB = 1.2987` and round this result up to 2. Then multiply `2 * 1536 GiB = 3072 GiB`. +* 3072 GiB is now the memory size to use in the VM configuration as described in <<_sec_memory_backing>>. +* On a machine with four sockets and a total of 6 TiB of memory, this VM would span over two sockets. 
This would leave two sockets for another VM running SAP HANA with a similar workload or different VMs running other workloads. ==== ===== Following the SAP HANA TDI "Phase 5" rules ** SAP HANA TDI "Phase 5" rules allow customers to deviate from the above described SAP HANA memory-to-core sizing ratios in certain scenarios. -The KVM implementation however must still adhere to the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand. -Details on SAP HANA TDI Phase 5 can be found in the following blog https://blogs.sap.com/2017/09/20/tdi-phase-5-new-opportunities-for-cost-optimization-of-sap-hana-hardware/["TDI Phase 5: New Opportunities for Cost Optimization of SAP HANA Hardware"] from SAP. -** Since SAP HANA TDI Phase 5 rules use SAPS based sizing, SUSE recommends applying the same overhead as measured with SAP HANA on KVM for the respective KVM Version/CPU Architecture. SAPS values for servers can be requested from the respective hardware vendor. +However, the KVM implementation must still adhere to the *SUSE Best Practices for SAP HANA on KVM* document at hand. +Details on SAP HANA TDI Phase 5 can be found in the blog article https://blogs.sap.com/2017/09/20/tdi-phase-5-new-opportunities-for-cost-optimization-of-sap-hana-hardware/["TDI Phase 5: New Opportunities for Cost Optimization of SAP HANA Hardware"] from SAP. +** Since SAP HANA TDI Phase 5 rules use SAPS-based sizing, SUSE recommends applying the same overhead as measured with SAP HANA on KVM for the respective KVM version/CPU architecture. SAPS values for servers can be requested from the respective hardware vendor. The following SAP HANA sizing documentation should also be useful: @@ -237,7 +237,7 @@ The following SAP HANA sizing documentation should also be useful: === Configuring the KVM hypervisor version The hypervisor must be configured according to the *SUSE Best Practices for SAP - HANA on KVM - {sles4sap} {slesProdVersion}* guide at hand and fulfill the following minimal requirements: + HANA on KVM" ({sles4sap} {slesProdVersion}) guide at hand. In addition, it must fulfill the following minimal requirements: * {sles4sap} {slesProdVersion} ("Unlimited Virtual Machines" subscription) ** kernel (Only major version 5.14, minimum package version 5.4.21.150400.24.46) @@ -275,7 +275,7 @@ The guest VM must: * comply with KVM limits as per https://documentation.suse.com/sles/15-SP4/single-html/SLES-virtualization/#virt-hypervisors-limits["SUSE Linux Enterprise Server 15 SP4 Hypervisor Limits"]. * fulfill the SAP HANA Hardware and Cloud Measurent Tools (HCMT) storage KPI's as per {launchpadnotes}2493172[SAP Note 2493172 "SAP HANA Hardware and Cloud Measurement Tools"]. Refer to <<_sec_storage>> for storage configuration details. -* be configured according to the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand. +* be configured according to the *SUSE Best Practices for SAP HANA on KVM* ({sles4sap} {slesProdVersion}) document at hand. [[_sec_hypervisor]] @@ -288,7 +288,7 @@ The following sections describe how to set up and configure the hypervisor for a For details refer to section 6.4 "Installation of Virtualization Components" of the SUSE Virtualization Guide (https://documentation.suse.com/sles/15-SP4/single-html/SLES-virtualization/#cha-vt-installation) -This guide assumes that there is a bare installation of {sles} or {sles4sap} available on the system. 
For installation instructions on {sles} or {sles4sap} please refer to the https://documentation.suse.com/sles/15-SP4/html/SLES-all/article-installation.html["SUSE Linux Enterprise Server 15 SP4 Installation Quick Start Guide"] +This guide assumes that there is a bare installation of {sles} or {sles4sap} available on the system. For installation instructions on {sles} or {sles4sap}, refer to the https://documentation.suse.com/sles/15-SP4/html/SLES-all/article-installation.html["SUSE Linux Enterprise Server 15 SP4 Installation Quick Start Guide"] Install the KVM packages using the following Zypper patterns: @@ -296,7 +296,7 @@ Install the KVM packages using the following Zypper patterns: zypper in -t pattern kvm_server kvm_tools ---- -In addition, it is also useful to install the `lstopo` tool which is part of the `hwloc` package contained inside the *HPC Module* for SUSE Linux Enterprise Server. +In addition, it is also useful to install the `lstopo` tool which is part of the `hwloc` package available from the *HPC Module* for SUSE Linux Enterprise Server. [[_sec_configure_networking_on_hypervisor]] === Configuring networking on the hypervisor @@ -338,8 +338,8 @@ If such line is not present, it might be the case that SR-IOV needs to be explic ==== Preparing a Virtual Function (VF) for a guest VM After checking that the NIC is SR-IOV capable, the host and the guest VM should be configured to use one of the available Virtual Functions (VFs) as (one of) the guest VM's network device(s). -More information about SR-IOV as a technology and how to properly configure everything that is necessary for it to work well in the general case can be found in the SUSE Virtualization Guide for SUSE Linux Enterprise Server 15 SP4 (https://documentation.suse.com/sles/15-SP4/html/SLES-all/book-virtualization.html), -and specifically in section "Adding SR-IOV Devices" (https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-libvirt-config-virsh.html#sec-libvirt-config-io). +More information about SR-IOV as a technology and how to properly configure everything necessary for it to work well in the general case can be found in the https://documentation.suse.com/sles/15-SP4/html/SLES-all/book-virtualization.html[SUSE Virtualization Guide for SUSE Linux Enterprise Server 15 SP4]. +Specifically, have a look here at section "Adding SR-IOV Devices" (https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-libvirt-config-virsh.html#sec-libvirt-config-io). *Enabling PCI passthrough for the host kernel* @@ -402,15 +402,15 @@ echo 4 > /sys/class/net/eth10/device/sriov_numvfs === Configuring storage on the hypervisor As with compute resources, the storage used for running SAP HANA must also be SAP certified. -Therefore only the storage from SAP HANA Appliances or SAP HANA Certified Enterprise Storage (https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/#/solutions?filters=v:deCertified;storage) is supported. -In all cases the SAP HANA storage configuration recommendations from the respective hardware vendor and the SAP HANA Storage Requirements for TDI (https://archive.sap.com/kmuuid2/70c8e423-c8aa-3210-3fae-e043f5c1ca92/SAP%20HANA%20TDI%20-%20Storage%20Requirements.pdf) should be followed. +Therefore, only the storage from SAP HANA Appliances or SAP HANA Certified Enterprise Storage (https://www.sap.com/dmc/exp/2014-09-02-hana-hardware/enEN/#/solutions?filters=v:deCertified;storage) is supported. 
+In all cases, the SAP HANA storage configuration recommendations from the respective hardware vendor and the https://archive.sap.com/kmuuid2/70c8e423-c8aa-3210-3fae-e043f5c1ca92/SAP%20HANA%20TDI%20-%20Storage%20Requirements.pdf [SAP HANA Storage Requirements for TDI] should be followed. There are three supported storage options to use for the SAP HANA database inside a VM and on top of a {sles} 15 SP4 hypervisor: Fibre Channel (FC) storage, Network Attached Storage (NAS) and local storage. ==== Network attached Storage The SAP HANA storage is attached via the NFSv4 protocol. In this case, nothing needs to be configured on the hypervisor. -Do make sure though that the VM has access to one or more dedicated 10 Gbit Ethernet interfaces for the network traffic to the network-attached storage. +However, make sure that the VM has access to one or more dedicated 10 Gbit Ethernet interfaces for the network traffic to the network-attached storage. ==== Fibre Channel storage @@ -430,8 +430,8 @@ ad:00.1 Fibre Channel: QLogic Corp. ISP2722-based 16/32Gb Fibre Channel to PCIe The HBAs that are assigned to the guest VM must not be in use on the host. -The remaining storage configuration details, such as how to add the disks and the HBA controllers to the guest VM configuration file, -and what to do with them from inside the guest VM itself, are available in <<_sec_storage>>. +The remaining storage configuration details are available in <<_sec_storage>>. This includes also information about how to add the disks and the HBA controllers to the guest VM configuration file, +and what to do with them from inside the guest VM itself. ==== Local Storage @@ -469,8 +469,8 @@ systemctl start tuned tuned-adm profile virtual-host ---- -The `tuned` daemon should now start automatically at boot time, and it should always load the `virtual-host` profile, so there is no need to add any of the above commands in any custom start-up script. -If in doubt, it is possible to check with the following command whether `tuned` is running and what the current profile is : +The `tuned` daemon should now start automatically at boot time. Also, it should always load the `virtual-host` profile, so there is no need to add any of the above commands in any custom start-up script. +If in doubt, check with the following command whether `tuned` is running and what the current profile is: ---- tuned-adm profile @@ -486,43 +486,42 @@ Current active profile: virtual-host [[_sec_verify_tuned_has_set_cpu_frequency_governor_and_performance_bias]] ===== Power management considerations -The CPU frequency governor should be set to *performance* to avoid latency issues because of ramping the CPU frequency up and down in response to changes in the system's load. -The selected `tuned` profile should have done this already, and with the following command, it is possible to verify that it actually did: - +Set the CPU frequency governor to *performance* to avoid latency issues because of ramping the CPU frequency up and down in response to changes in the system's load. +The selected `tuned` profile should have done this already. You can verify it with the following command: ---- cpupower -c all frequency-info ---- -The governor setting can be verified by looking at the *current policy*. +Verify the governor setting by looking at the *current policy*. -Additionally, the performance bias setting should also be set to 0 (performance). 
The performance bias setting can be verified with the following command: +Additionally, set the performance bias setting to 0 (performance). You can verify the performance bias setting with the following command: ---- cpupower -c all info ---- Modern processors also attempt to save power when they are idle, by switching to a lower power state. -Unfortunately this incurs latency when switching in and out of these states. +Unfortunately, this incurs latency when switching in and out of these states. -To avoid that, and achieve better and more consistent performance, the CPUs should not be allowed to switch those power saving modes (known as C-states) and stay in normal operation mode all the time. -Therefore it is recommended that only the state C0 is used. +To avoid that, and to achieve better and more consistent performance, the CPUs should not be allowed to switch those power saving modes (known as *C-states*) and should stay in normal operation mode all the time. +Therefore, it is recommended to only use the state *C0*. This can be enforced by adding the following parameters to the kernel boot command line: `intel_idle.max_cstate=0`. -To double check that only the desired C-states are actually available, the following command can be used: +To double-check that only the desired C-states are available, use the following command: ---- cpupower idle-info ---- -The idle state settings can be verified by looking at the line containing `Available idle states:`. +Verify the idle state settings by looking at the line containing `Available idle states:`. [[_sec_irqbalance]] ==== `irqbalance` The `irqbalance` service should be disabled because it can cause latency issues when the _/proc/irq/*_ files are read. -To disable `irqbalance` run the following command: +To disable `irqbalance`, run the following command: ---- systemctl stop irqbalance.service @@ -534,7 +533,7 @@ systemctl disable irqbalance.service ==== Kernel Samepage Merging (ksm) Kernel Samepage Merging (KSM, https://www.kernel.org/doc/html/latest/admin-guide/mm/ksm.html ) should be disabled. -The following command makes sure that it is tuned off and that any sharing and de-duplication activity that may have happened, in case it was enabled, is reverted: +The following command makes sure that it is turned off and that any sharing and de-duplication activity that may have happened is reverted in case it was enabled: ---- echo 2 > /sys/kernel/mm/ksm/run @@ -578,7 +577,7 @@ Recently, a class of side channel attacks exploiting the branch prediction and t On an affected CPU, these problems cannot be fixed, but their effect and their actual exploitability can be mitigated in software. However, this sometimes has a non-negligible impact on the performance. -For achieving the best possible security, the software mitigations for these vulnerabilities are being enabled (`mitigations=auto`) with the exceptions of those that deal with "Machine Check Error Avoidance on Page Size Change" (CVE-2018-12207, also known as "iTLB Multiht") and "TSX asynchronous abort" (CVE-2019-11135). +To achieve the best possible security, the software mitigations for these vulnerabilities are being enabled (`mitigations=auto`) with the exceptions of those that deal with "Machine Check Error Avoidance on Page Size Change" (CVE-2018-12207, also known as "iTLB Multiht") and "TSX asynchronous abort" (CVE-2019-11135). 
*Automatic NUMA balancing* ---- @@ -590,8 +589,8 @@ Automatic NUMA balancing can result in increased system latency and should there ---- kvm_intel.ple_gap=0 kvm_intel.ple_window=0 ---- -Pause Loop Exit (PLE) is a feature whereby a spinning guest CPU releases the physical CPU until a lock is free. -This is useful in cases where multiple virtual CPUs are using the same physical CPU but causes unnecessary delays when the system is not overcommitted. +*Pause Loop Exit* (PLE) is a feature whereby a spinning guest CPU releases the physical CPU until a lock is free. +This is useful in cases where multiple virtual CPUs are using the same physical CPU. But it causes unnecessary delays when the system is not overcommitted. *Transparent huge pages* ---- @@ -604,24 +603,24 @@ Disabling it will avoid `khugepaged` interfering with the virtual machine while ---- intel_idle.max_cstate=0 ---- -Optimal performance is achieved by limiting the processor to states C0 (normal running state) as any other state includes the possibility for the operating system to put certain cpu cores in a lower powered idle state and the 'wake-up' time can impact performance inside the virtual machine. +Optimal performance is achieved by limiting the processor to states *C0* (normal running state). Any other state includes the possibility for the operating system to put certain CPU cores in a lower-powered idle state and the 'wake-up' time can impact performance inside the virtual machine. *Huge pages* ---- default_hugepagesz=1G hugepagesz=1G hugepages= ---- -The use of 1 GiB huge pages is to reduce overhead and contention when the guest is updating its page tables. +The use of 1 GiB huge pages is meant to reduce overhead and contention when the guest is updating its page tables. This requires allocation of 1 GiB huge pages on the host. The number of pages to allocate depends on the memory size of the guest. -1 GiB pages are not pageable by the OS. Thus they always remain in RAM and therefore the `locked` definition in libvirt XML files is not required. +1 GiB pages are not pageable by the OS. Thus, they always remain in RAM and therefore, the `locked` definition in libvirt XML files is not required. It is also important to ensure the order of the huge page options. Specifically the `` option must be placed *after* the 1 GiB huge page size definitions. .Calculating value [NOTE] ==== -The value for `` should be calculated by taking the number GiB`'s of RAM minus approx. 8% for the hypervisor OS. +Calculate the value for `` by taking the number GiB`'s of RAM minus approximately 8% for the hypervisor OS. For example, 6 TiB RAM (6144 GiB) minus 8% are approximately 5650 huge pages. ==== @@ -636,13 +635,13 @@ On top of that, `iommu=pt` makes sure that you set up the devices for the best p ---- intremap=no_x2apic_optout ---- -Interrupt remapping allows the kernel to overwrite the interrupt remapping tables created by the BIOS or UEFI Firmware, to make sure that certain interrupts from peripheral devices are routed to a certain CPU. With the 'no_x2apic_optout' we make sure that this feature is always enabled. +Interrupt remapping allows the kernel to overwrite the interrupt remapping tables created by the BIOS or UEFI Firmware. This ensures that certain interrupts from peripheral devices are routed to a certain CPU. With the 'no_x2apic_optout' you make sure that this feature is always enabled. 
*Processor MMIO Stale Data Vulnerabilities* ---- mmio_stale_data=off ---- -Processor MMIO Stale Data Vulnerabilities are a set of vulnerabilities that can expose data to attackers in a very limited scope of environments. Processors of the type {cascadelake} are not affected by this vulnerability, therefore it is unnecessary to mitigate it in the environment at hand. For more information refer to the https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html[Intel Guidance for Security Issues on Intel Processors] and to the indicated https://docs.kernel.org/6.2/admin-guide/hw-vuln/processor_mmio_stale_data.html[Article in the official documentation of the Linux Kernel]. +Processor MMIO Stale Data Vulnerabilities are a set of vulnerabilities that can expose data to attackers in a very limited scope of environments. Processors of the type {cascadelake} are not affected by this vulnerability. Therefore, it is unnecessary to mitigate it in the environment at hand. For more information, refer to the https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html[Intel Guidance for Security Issues on Intel Processors] and to the indicated https://docs.kernel.org/6.2/admin-guide/hw-vuln/processor_mmio_stale_data.html[Article in the official documentation of the Linux Kernel]. [[_sec_guest_vm_xml_configuration]] @@ -650,11 +649,11 @@ Processor MMIO Stale Data Vulnerabilities are a set of vulnerabilities that can [NOTE] ==== -This section describes the creation of a single VM on a single system. For the configuration of multiple VM's most of the values regarding CPU count and/or memory amount need to be divided by the total socket count of the system and multiplied by the desired socket count of the single VM. Pay attention to the memory assignment. Try to assign memory that is located on the socket you are using for the VM. +This section describes the creation of a single VM on a single system. For the configuration of multiple VM's, most of the values regarding CPU count and/or memory amount need to be divided by the total socket count of the system and multiplied by the desired socket count of the single VM. Pay attention to the memory assignment. Try to assign memory that is located on the socket you are using for the VM. ==== This section describes the modifications required to the libvirt XML definition of the guest VM. -The libvirt XML may be edited using the following command: +You might edit the libvirt XML using the following command: ---- virsh edit @@ -663,7 +662,7 @@ virsh edit [[_sec_create_an_initial_guest_vm_xml]] === Creating an initial guest VM XML -Refer to section 10 "Guest Installation" of the SUSE Virtualization Guide (https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-kvm-inst.html). +Refer to section 10 "Guest Installation" of the https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-kvm-inst.html[SUSE Virtualization Guide]. [[_sec_global_vcpu_configuration]] === Configuring global vCPU @@ -671,7 +670,7 @@ Refer to section 10 "Guest Installation" of the SUSE Virtualization Guide (https The virtual CPU configuration of the VM guest should reflect the host CPU configuration as close as possible. There cannot be any overcommitting of memory or CPU resources. -The CPU model should be set to `host-passthrough`, and any `check` should be disabled. 
+Set the CPU model to `host-passthrough`, and disable any `check`. In addition, the `rdtscp`, `invtsc` and `x2apic` features are required. [[_sec_memory_backing]] @@ -683,13 +682,13 @@ This guarantees optimal performance for the guest VM. It is necessary that each NUMA cell of the guest VM have a whole number of huge pages assigned to them (that is, no fractions of huge pages). All the NUMA cells should also have the same number of huge pages assigned to them (that is, the guest VM memory configuration must be balanced). -Therefore the number of huge pages needs to be dividable by the number of NUMA cells. +Therefore, the number of huge pages needs to be dividable by the number of NUMA cells. -For example, if the host has 6339943304 KiB (that is, 6 TiB) of memory and we want to leave 91.75% of it to the hypervisor (see <<_sec_memory_sizing>>), and there are 4 NUMA cells, each NUMA cell will have the following number of huge pages: +For example, if the host has 6339943304 KiB (that is, 6 TiB) of memory and you want to leave 91.75% of it to the hypervisor (see <<_sec_memory_sizing>>), and there are 4 NUMA cells, each NUMA cell will have the following number of huge pages: * (6339943304 * (91.75/100)) / 1048576 / 4 = 1386 -This means that, in total, there will need to be the following number of huge pages: +This means that, in total, there needs to be the following number of huge pages: * 1386 * 4 = 5544 @@ -697,7 +696,7 @@ Such number must be passed to the host kernel command line parameter on boot (th Both the total amount of memory the guest VM should use and the fact that such memory must come from 1 GiB huge pages need to be specified in the guest VM configuration file. -It must also be ensured that the `memory` and the `currentMemory` element have the same value, to disable memory ballooning, which, if enabled, would cause unacceptable latency: +You must also ensure that the `memory` and the `currentMemory` element have the same value. This is to disable memory ballooning, which, if enabled, would cause unacceptable latency: ---- @@ -726,15 +725,15 @@ The memory unit can be set to GiB to ease the memory computations. It is important to map the host topology into the guest VM, as described below. This allows HANA to spread its own workload threads across many virtual CPUs and NUMA nodes. -For example, for a 4-socket system, with 28 cores per socket and hyperthreading enabled, the virtual CPU configuration for a single VM will also have 4 sockets, 28 cores, 2 threads. In a multi-VM scenario its also imperative that the hosts NUMA topology is maped to the VM's. Two VM's on a 4-socket systems would both reflect the NUMA topology of two of the hosts NUMA nodes/sockets. Four VM's on a 4-socket system would reflect the NUMA topology of a single NUMA node/socket. +For example, for a 4-socket system, with 28 cores per socket and hyperthreading enabled, the virtual CPU configuration for a single VM will also have 4 sockets, 28 cores, 2 threads. In a multi-VM scenario it is also imperative that the hosts NUMA topology is maped to the VMs. Two VMs on a 4-socket systems would both reflect the NUMA topology of two of the hosts NUMA nodes/sockets. Four VMs on a 4-socket system would reflect the NUMA topology of a single NUMA node/socket. Always make sure that, in the guest VM configuration file: * the `cpu` `mode` attribute is set to `host-passthrough`. * the `cpu` `topology` attribute describes the vCPU NUMA topology of the guest, as discussed above. 
* the attributes of the `numa` elements describe which vCPU number ranges belong to which NUMA cell. Care should be taken since these number ranges are not the same as on the host. Additionally: -** the `cell` elements describe how much RAM should be distributed per NUMA node. In this 4-node example enter 25% (or 1/4) of the entire guest VM memory. -Also refer to <<_sec_memory_backing>> and <<_sec_memory_sizing>> of this paper for further details. +** the `cell` elements describe how much RAM should be distributed per NUMA node. In this 4-node example, enter 25% (or 1/4) of the entire guest VM memory. +Also refer to <<_sec_memory_backing>> and <<_sec_memory_sizing>> of the document at hand for further details. ** each NUMA cell of the guest VM has 56 vCPUs. ** the distances between the cells are identical to those of the physical hardware (as per the output of the command `numactl --hardware`). @@ -785,7 +784,7 @@ Also refer to <<_sec_memory_backing>> and <<_sec_memory_sizing>> of this paper f ---- -It is also necessary to pin virtual CPUs to physical CPUs, to limit the overhead caused by virtual CPUs being moved around physical CPUs by the host scheduler. +It is also necessary to pin virtual CPUs to physical CPUs. This limits the overhead caused by virtual CPUs being moved around physical CPUs by the host scheduler. Similarly, the memory for each NUMA cell of the guest VM must be allocated only on the corresponding host NUMA node. Note that KVM/QEMU uses a static hyperthread sibling CPU APIC ID assignment for virtual CPUs, irrespective of the actual physical CPU APIC ID values on the host. @@ -859,7 +858,7 @@ done echo " " ---- -The following commands can be used to determine the CPU details on the hypervisor host: +Use the following commands to determine the CPU details on the hypervisor host: ---- lscpu --extended=CPU,SOCKET,CORE @@ -873,7 +872,7 @@ It is not necessary to isolate the guest VM's `iothreads`, nor to statically res === Configuring networking One of the Virtual Functions prepared in <<_sec_configure_networking_on_hypervisor>> must be added to the guest VM as (one of) its network adapter(s). -This can be done by putting the following details in the guest VM configuration file: +This can be done by adding the following details to the guest VM configuration file: ---- @@ -904,7 +903,7 @@ The storage configuration is critical, as it plays an important role in terms of ==== Configuring storage for operating system volumes The performance of storage where the operating system is installed is not critical for the performance of SAP HANA. -Therefore any KVM supported storage may be used to deploy the operating system itself. See an example below: +Therefore, you might use any KVM-supported storage to deploy the operating system itself. See an example below: ---- @@ -932,20 +931,20 @@ The configuration depends on the type of storage used for the SAP HANA Database. In any case, the storage for SAP HANA must be able to fulfill the storage requirements for SAP HANA from within the VM. The SAP HANA Hardware and Cloud Measurement Tools (HCMT) can be used to assess if the storage meets the requirements. -For details on HCMT refer to {launchpadnotes}2493172[SAP Note 2493172 - "SAP HANA Cloud and Hardware Measurement Tools"]. +For details on HCMT, refer to {launchpadnotes}2493172[SAP Note 2493172 - "SAP HANA Cloud and Hardware Measurement Tools"]. ===== Network attached storage -Follow the SAP HANA specific best practices of the storage system vendor. 
-Make sure though that the VM has access to one or more dedicated 10 GiB (or better) Ethernet interfaces for the network traffic to the network attached storage. +Follow the SAP HANA-specific best practices of the storage system vendor. +Make sure that the VM has access to one or more dedicated 10 GiB (or better) Ethernet interfaces for the network traffic to the network attached storage. ===== Fibre Channel storage -Since storage controller passthrough is used (see <<_sec_storage_hypervisor>>), any LVM (Logical Volume Manager) and Multipathing configuration should, if wanted, be made inside the guest VM, always following the storage layout recommendations from the appropriate hardware vendor. +Since storage controller passthrough is used (see <<_sec_storage_hypervisor>>), any LVM (Logical Volume Manager) and Multipathing configuration should be made inside the guest VM, always following the storage layout recommendations from the appropriate hardware vendor. -The guest VM XML configuration must be based on the underlying storage configuration on the hypervisor (see <<_sec_storage_hypervisor>>) +The guest VM XML configuration must be based on the underlying storage configuration on the hypervisor (see <<_sec_storage_hypervisor>>). -Since the storage for HANA (`/data`, `/log` and `/shared` volumes) is performance critical, it is recommended to take advantage of an SAN HBA that is passed through to the guest VM. +Since the storage for HANA (`/data`, `/log` and `/shared` volumes) is performance-critical, it is recommended to take advantage of an SAN HBA that is passed through to the guest VM. Note that it is not possible to only use one function of the adapter, and both must always be attached to the guest VM. @@ -972,11 +971,11 @@ An example guest VM configuration with storage passthrough configured would look ---- -More details about how to directly assign PCI devices to a guest VM are described in section 14.7 "Adding a PCI Device" of the Virtualization Guide (https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-libvirt-config-virsh.html#sec-libvirt-config-pci-virsh). +More details about how to directly assign PCI devices to a guest VM are described in https://documentation.suse.com/sles/15-SP2/html/SLES-all/cha-libvirt-config-virsh.html#sec-libvirt-config-pci-virsh[section 14.7 "Adding a PCI Device" of the Virtualization Guide]. ===== Local storage -To achieve the best possible performance it is recommended to directly attach the block device(s) and/or raid controllers, which will be used as storage for the SAP HANA data files. If there is a dedicated raid controller available in the system that only manages devices and raid volumes that will be used in one single VM the recommendation is to connect it via PCI passthrough as described in the section above. If single devices need to be used (e.g. NVMe devices) you can connect those to the VM by doing something similar to this: +To achieve the best possible performance, it is recommended to directly attach the block device(s) and/or raid controllers, which will be used as storage for the SAP HANA data files. If there is a dedicated raid controller available in the system that only manages devices and raid volumes that will be used in one single VM, the recommendation is to connect it via PCI passthrough as described in the section above. If single devices need to be used (for example NVMe devices), you can connect those to the VM by doing something similar to the below: // TODO: Trockencode! Check this before publishing!!! 
---- @@ -994,7 +993,7 @@ You can use either a vbd disk or a virtio-serial device (preferred) to set this [[_sec_clocks_timers]] === Setting up clocks and timers -Make sure that the clock timers are set up as follows, in the guest VM configuration file: +Make sure that the clock timers are set up in the guest VM configuration file as follows: ---- @@ -1011,8 +1010,8 @@ Make sure that the clock timers are set up as follows, in the guest VM configura [[_sec_features]] === Configuring special features -It is necessary to enable for the guest VM a set of optimizations that are specific for the cases when the vCPUs are pinned and have (semi-)dedicated pCPUs all for themselves. -This is done by having the following in the guest VM configuration file: +It is necessary to enable a set of optimizations for the guest VM that are specific for the cases when the vCPUs are pinned and have (semi-)dedicated pCPUs all for themselves. +You can do so by providing the following details in the guest VM configuration file: ---- @@ -1027,7 +1026,7 @@ This is done by having the following in the guest VM configuration file: ---- -Note that this is a requirement for making it possible to load and use the "`cpuidle-haltpoll`" kernel module inside of the guest VM OS (see <<_sec_cpuidle_haltpoll>>). +Note that this is a requirement for making it possible to load and use the "`cpuidle-haltpoll`" kernel module inside the guest VM OS (see <<_sec_cpuidle_haltpoll>>). [[_sec_guest_operating_system]] @@ -1051,7 +1050,7 @@ Install and configure {sles4sap} {slesProdVersion} and SAP HANA as described in: ==== Customizing the Linux kernel parameters of the guest Like the hypervisor host, the VM also needs special kernel parameters to be set. -To edit the boot options for the Linux kernel do the following: +To edit the boot options for the Linux kernel, do the following: . Edit [path]_/etc/default/grub_ and add the following boot options to the line "`GRUB_CMDLINE_LINUX_DEFAULT`". + @@ -1075,7 +1074,7 @@ When using a virtual disk for `vhostmd`, the virtual disk device must be world-r ==== Configuring the Guest at boot time The folling settings need to be configured at boot time of the VM. -To persist these configurations it is recommended to put the commands provided below into a script which is executed as part of the boot process. +To persist these configurations, it is recommended to put the commands provided below into a script which is executed as part of the boot process. ===== Disabling `irqbalance` @@ -1129,7 +1128,7 @@ echo $GROW_START > /sys/module/haltpoll/parameters/guest_halt_poll_grow_start ===== Setting the clock source -The clock source needs to be set to `kvm-clock`. +You need to set the clock source to `kvm-clock`: ---- echo kvm-clock > /sys/devices/system/clocksource/clocksource0/current_clocksource @@ -1137,14 +1136,14 @@ echo kvm-clock > /sys/devices/system/clocksource/clocksource0/current_clocksourc ===== Disabling Kernel Same Page Merging -Kernel Same Page Merging (KSM) needs to be disabled, like on the hypervisor (see <<_sec_no_ksm>>). +Kernel Same Page Merging (KSM) needs to be disabled, like on the hypervisor (see <<_sec_no_ksm>>): ---- echo 2 >/sys/kernel/mm/ksm/run ---- ===== Implementing automatic configuration at boot time -The following script is provided as an example for a script implementing above recommendations, to be executed at boot time of the VM. +The following script provides an example for a script implementing above recommendations, to be executed at boot time of the VM. 
.Script ---- @@ -1182,13 +1181,13 @@ else fi ---- -Both `sapconf` and `saptune` apply their settings at boot time automatically and do not need to be included in the script above. +Both `sapconf` and `saptune` apply their settings automatically at boot time and do not need to be included in the script above. [[_sec_guest_operating_system_storage_configuration_for_sap_hana_volumes]] === Configuring the guest operating system storage for SAP HANA volumes * Follow the storage layout recommendations from the appropriate hardware vendors. -* Only use LVM (Logical Volume Manager) inside the VM for SAP HANA. Nested LVM is not to be used. +* Only use LVM (Logical Volume Manager) inside the VM for SAP HANA. Nested LVM should not be used. [[_sec_performance_considerations]] @@ -1208,13 +1207,13 @@ Performance deviations for virtualization as measured on Intel Skylake (Bare Met * Setting `kvm.nx_huge_pages=auto` ** The measured performance deviation for OLTP or mixed OLTP/OLAP was impacted by this setting. -For S/4HANA standard workload, OLTP transactional request times show an overhead of up to 30 ms. -This overhead leads to an additional transactional throughput loss, but did not exceed 10%, running at a very high system load, when compared to the underlying bare metal environment. +For S/4HANA standard workloads, OLTP transactional request times show an overhead of up to 30 ms. +This overhead leads to an additional transactional throughput loss. However, it did not exceed 10%, running at a very high system load, when compared to the underlying bare metal environment. ** The measured performance deviation for OLAP workload is below 5%. ** During performance analysis with standard workload, most of the test cases stayed within the defined KPI of 10% performance degradation compared to bare metal. However, there are low-level performance tests in the test suite exercising various HANA kernel components that exhibit a performance degradation of more than 10%. This also indicates that there are particular scenarios which might not be suited for SAP HANA on SUSE KVM with kvm.nx_huge_pages = AUTO; especially those workloads generating high resource utilization, which must be considered when sizing SAP HANA instance in a SUSE KVM virtual machine. -Thorough test of configuration for all workload conditions are highly recommended. +Thorough tests of configurations for all workload conditions are highly recommended. 
@@ -1223,8 +1222,8 @@ Thorough test of configuration for all workload conditions are highly recommende For a full explanation of administration commands, refer to official SUSE Virtualization documentation such as: -* Section 11 "Basic VM Guest Management" and others in the SUSE Virtualization Guide for SUSE Linux Enterprise Server 15 (https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-libvirt-managing.html) -* SUSE Virtualization Best Practices for SUSE Linux Enterprise Server 15 (https://documentation.suse.com/sles/15-SP4/html/SLES-all/article-virtualization-best-practices.html) +* https://documentation.suse.com/sles/15-SP4/html/SLES-all/cha-libvirt-managing.html[Section 11 "Basic VM Guest Management"] and others in the https://documentation.suse.com/sles/15-SP4/html/SLES-all/book-virtualization.html[SUSE Virtualization Guide for SUSE Linux Enterprise Server 15] +* https://documentation.suse.com/sles/15-SP4/html/SLES-all/article-virtualization-best-practices.html[SUSE Virtualization Best Practices for SUSE Linux Enterprise Server 15] [[_sec_useful_commands_on_the_hypervisor]] @@ -1236,7 +1235,7 @@ Check kernel boot options used: cat /proc/cmdline ---- -Check huge page status (This command can also be used to monitor the progress of huge page allocation during VM start): +Check the huge page status (this command can also be used to monitor the progress of huge page allocation during VM start): ---- cat /proc/meminfo | grep Huge @@ -1248,7 +1247,7 @@ List all VM guest domains configured on the hypervisor: virsh list --all ---- -Start a VM (Note: VM start times can take some minutes on larger RAM systems, check the progress with `/proc/meminfo | grep Huge`: +Start a VM (and keep in mind that VM start times can take some minutes on larger RAM systems, check the progress with `/proc/meminfo | grep Huge`): ---- virsh start @@ -1266,7 +1265,7 @@ This is the location of VM guest configuration files: /etc/libvirt/qemu ---- -This is the location of VM Log files: +This is the location of VM log files: ---- /var/log/libvirt/qemu @@ -1297,16 +1296,16 @@ lscpu // TODO: XML config sample needs to be replaced with a CSL 6TB version!!! .XML configuration example -[WARNING] +[IMPORTANT] ==== The XML file below is only an *example* showing the key configurations to assist in understanding how to configure a valid VM in this environment via an XML file. The actual XML configuration must be based on your respective hardware configuration and VM requirements. ==== -Points of interest in this example (refer to the detailed sections of the *SUSE Best Practices for SAP HANA on KVM - {sles4sap} {slesProdVersion}* document at hand for a full explanation): +Points of interest in this example (refer to the detailed sections of the *SUSE Best Practices for SAP HANA on KVM* ({sles4sap} {slesProdVersion}) document at hand for a full explanation): * Memory -** The hypervisor has 6 TiB RAM (or 6144 GiB), of which 5544 GiB has been allocated as 1 GB huge pages and therefore 5544 GiB is the max VM size in this case +** The hypervisor has 6 TiB RAM (or 6144 GiB), of which 5544 GiB have been allocated as 1 GB huge pages and therefore 5544 GiB is the max VM size in this case ** 5544 GiB = 5813305344 KiB ** In the `numa` section memory is split evenly over the 4 NUMA nodes (CPU sockets) * CPU pinning @@ -1827,7 +1826,7 @@ or other application using the libvirt API. 
[[_sec_resources]]
=== Resources

-* https://documentation.suse.com/sbp/sap/[SUSE Best Practices]
+* https://documentation.suse.com/sbp-supported.html[SUSE Best Practices]
* https://documentation.suse.com/sles/15-SP2/html/SLES-all/book-virt.html[SUSE Virtualization Guide for SUSE Linux Enterprise Server 15]
* {launchpadnotes}3120786[SAP Note 3120786 - "SAP HANA on SUSE KVM Virtualization with SLES 15 SP2"]
* {launchpadnotes}2284516[SAP Note 2284516 - "SAP HANA virtualized on SUSE Linux Enterprise Hypervisors"]
@@ -1847,7 +1846,7 @@
For services and support options available for your product, refer to http://www.

To report bugs for a product component, go to https://scc.suse.com/support/requests, log in, and select Submit New SR (Service Request).

Report Documentation Bug::
-To report errors or suggest enhancements for a certain document, use the mailto:Report Documentation Bug[] feature at the right side of each section in the online documentation.
+To report errors or suggest enhancements for a certain document, use the mailto:Report Documentation Bug[] icon at the right side of each section in the online documentation.
Provide a concise description of the problem and refer to the respective section number and page (or URL).

Mail::