Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sap_swpm: New functionality to push SUM through the manual steps #875

Open
wants to merge 12 commits into
base: dev
Choose a base branch
from

Conversation

rob0d
Copy link
Contributor

@rob0d rob0d commented Oct 9, 2024

This will push SUM through the manual steps in order to fully automate the installation of the system when sap_swpm_sum_start is enabled.

If SWPM is in observer mode, it will not do anything as SWPM waits for SUM to complete.

Updated code (correcting for whitespace in HTTP Response Body) from Shell script for SUM provided by @sean-freeman in #738 (comment)

@berndfinger
Copy link
Member

@rob0d Thanks for this contribution, this looks really cool and seems to be nice additional functionality without adding too much complexity. Will soon have some time for testing.

@berndfinger
Copy link
Member

@rob0d - I'd ran a first test but it timed out in task Checking the status of SUM after 60 retries. Can you please explain which SAP software you used and if and how you prepared the test? I will then try to use the same scenario for my next test.

@rob0d
Copy link
Contributor Author

rob0d commented Nov 8, 2024

Hi @berndfinger,
Can you check what was the response from ansible.builtin.uri on line 8 of roles/sap_swpm/tasks/post_install/sum_push_to_finish.yml?
It couldn't connect or the response wasn't what it expected (tests on line 20)?
If it's still running or if you can restart it try connecting to the SUM URL and check what it says there.

Also did you provide stack file to SUM?
For SUM to start correctly and start executing you need to set the following (at least this is what I set:
sap_swpm_mp_stack_path:
sap_swpm_mp_stack_file_name: . #MP_Stack_{{ sap_system_sid }}.xml
sap_swpm_configure_tms: 'true'
sap_swpm_sum_prepare: 'true'
sap_swpm_sum_start: 'true'
sap_swpm_tmsadm_password:

@berndfinger
Copy link
Member

Hi @berndfinger, Can you check what was the response from ansible.builtin.uri on line 8 of roles/sap_swpm/tasks/post_install/sum_push_to_finish.yml? It couldn't connect or the response wasn't what it expected (tests on line 20)? If it's still running or if you can restart it try connecting to the SUM URL and check what it says there.

It couldn't connect, and the reply of a wget --no-check-certificate on the URL was:

HTTP request sent, awaiting response... 404 Not Found

But it is probably related to the missing parameters for SUM to start correctly. To prepare for this, I need more time for preparing and testing with a MP stack file. Unfortunately, I also have to work on some other open issues and PRs for this collection so I cannot immediately continue with testing this PR.

But maybe someone else can jump in and test the proposed code?

@rob0d Can you please provide more details about the software you configured in Maintenance Planner (e.g. product name, version, any additional selections)?

@rob0d
Copy link
Contributor Author

rob0d commented Nov 12, 2024

Hi @berndfinger,

It looks like the SUM wasn't started by SWPM due to the missing parameters.

The process with Maintenance Planner is something like this:

  • Create maintenance transaction in MP. I am using S/4 HANA 2023 with all current patches.
  • Download the XML file.
  • Download all the other files (or use Ansible role to do it).
  • Set the parameters above and point it at the XML file.
  • Run SWPM, it should start SUM.
  • SUM will run for a while until it stops for a prompt. This is the point the Ansible SUM script takes over. It keeps checking and waiting until SUM is in the correct phase then does a POST to move it to the next step.
  • The next step is the main upgrade which on our systems takes 8 hours (full S/4 HANA SP1 + all patches). Again, Ansible script loops and waits until SUM reaches the end of the process.
  • There is a prompt at this stage asking for SPAU processing. As this is a new install, we just tick the box and continue.
  • Finally Ansible script waits for SUM to complete the processing and the swpm role finishes.

@sean-freeman
Copy link
Member

Git diff is misbehaving in this PR, only Ansible Task SAP SWPM Post Install - Control SUM if required is appended to /tasks/post_install.yml but the diff attempts to replace the entire file.

Test has begun for 1 new task + new task file with the updated v1.5.x codeline

@sean-freeman sean-freeman self-requested a review December 5, 2024 13:09
@sean-freeman
Copy link
Member

sean-freeman commented Dec 6, 2024

@rob0d Why restriction to PAS or OneHost? Personally I'd only ever install from PAS, but it has been questioned before so unless there is a referenceable reason - restricting is not the best.

Feedback:

  • sap_swpm_sum_start is a boolean, so when condition == 'true' suffix is not required / will cause the Ansible Task to be skipped.
  • Checking the status of SUM Ansible Task name is duplicated, which is confusing to end-user which stage SAP SUM is at. Edit - completed 2025-01-03
  • Get the config XML from SUM Ansible Task name is duplicated, see above. Edit - completed 2025-01-03
  • Get the CSRF token from SUM Ansible Task name is duplicated, see above. Edit - completed 2025-01-03
  • Move SUM to the next step Ansible Task name is duplicated, see above. Edit - completed 2025-01-03
  • in _sap_swpm_sum_push.content has double space after in operator. Edit - completed 2025-01-03
  • delay: 600 # 10 minutes is inconsistent with waits throughout the Ansible Collection, this is better set to 60 for consistency - even if it increases stdout.
  • I would suggest adding message before the run, informing the user to...
Check the following URLs for SAP Software Update Manager (SUM) monitoring or actions:

SUM Monitor - https://{{ ansible_fqdn }}:1129/lmsl/sumobserver/S01/monitor/index.html
SUM Admin - https://{{ ansible_fqdn }}:1129/lmsl/sumabap/S01/slui/
SUM Admin Utilities - https://{{ ansible_fqdn }}:1129/lmsl/sumabap/S01/slui_ext/

@rob0d
Copy link
Contributor Author

rob0d commented Dec 6, 2024

Git diff is misbehaving in this PR, only Ansible Task SAP SWPM Post Install - Control SUM if required is appended to /tasks/post_install.yml but the diff attempts to replace the entire file.

Test has begun for 1 new task + new task file with the updated v1.5.x codeline

Yes. I don't understand why. It's showing a conflict even though there doesn't seem to be one. I am not able to resolve it as I don't have write access to the repo.

@berndfinger
Copy link
Member

Git diff is misbehaving in this PR, only Ansible Task SAP SWPM Post Install - Control SUM if required is appended to /tasks/post_install.yml but the diff attempts to replace the entire file.
Test has begun for 1 new task + new task file with the updated v1.5.x codeline

Yes. I don't understand why. It's showing a conflict even though there doesn't seem to be one. I am not able to resolve it as I don't have write access to the repo.

@rob0d - I found:

  • The line:
# SPDX-License-Identifier: Apache-2.0

is missing at the top.

  • Changed Files shows that the when condition of the task `SAP SWPM Post Install - Firewall Setup' had:
"sap_swpm_setup_firewall | bool"

whereas the current dev tree has:

"sap_swpm_setup_firewall"

Maybe the easiest way to solve this is that you modify your files accordingly. If that doesn't help, you could of course also create a new PR based on the current dev tree (or I could do that - but I'd rather have your userid in the git commit history ;-) ).

@berndfinger
Copy link
Member

Solves #919.

@berndfinger
Copy link
Member

@rob0d As there was no reply from you, I will now try to resolve the conflicts using the Resolve conflicts button.

@berndfinger
Copy link
Member

@rob0d Unfortunately, my attempt to resolve the merge conflict using the Resolve conflicts button of the web interface failed, and the result are now duplicate lines in file tasks/post_install.yml. Maybe I missed to remove one of the merge conflict indicators.
I will now try to roll back this change after cloning your repo. If this fails, can you please revert my changes?

Signed-off-by: Bernd Finger <bfinger@redhat.com>
... to new file tasks/post_install/sum_push_to_finish.yml

Signed-off-by: Bernd Finger <bfinger@redhat.com>
@berndfinger
Copy link
Member

@rob0d I failed removing my commit so I added one more commit for resolving the merge conflict. I also added one more commit for adding the missing SPDX identifier to file roles/sap_swpm/tasks/post_install/sum_push_to_finish.yml.
Now trying to fix the ansible-lint and ansible-test sanity errors.

Signed-off-by: Bernd Finger <bfinger@redhat.com>
@berndfinger
Copy link
Member

@rob0d Fixing the ansible-lint errors requires a code change in file tasks/post_install/sum_push_to_finish.yml, likely replacing the Ansible shell module by the command module. Also this task needs a change_when: false condition. So I cannot proceed with merging at this time because I don't want to interfere more with your work, and I would like you to review and decide how to solve the ansible-lint errors.

@rob0d
Copy link
Contributor Author

rob0d commented Dec 19, 2024

Hi @berndfinger,
Thanks for looking at it. I will have a look at the lint errors and fix them. I also have some further updates that I will add.

@sean-freeman
Copy link
Member

sean-freeman commented Jan 7, 2025

@rob0d Remaining query and feedback to be addressed after the additional commits

Query 1: Why restriction to PAS or OneHost? (Personally I'd only ever install to target host from the PAS host as a good practice, but it has been questioned before so unless there is a referenceable reason - restricting is not the best.)

Feedback items:

  • sap_swpm_sum_start is a boolean variable type (see https://github.com/sap-linuxlab/community.sap_install/blob/main/roles/sap_swpm/defaults/main.yml#L194), so the when condition == 'true' suffix is not required; as this would check for a variable containing string true. This needs to be addressed before merge, as it would cause the Ansible Task to be skipped accidentally / cause errors. Edit - completed 2025-01-13

  • delay: 600 # 10 minutes is inconsistent with waits throughout the Ansible Collection, this is better set to 60 for consistency across the various Ansible Roles (and with the sapinst wait within this sap_swpm Ansible Role) - even if it increases stdout by printing every minute (in many cases this is useful, as the number of retries can be easily counted to see how long something took). Edit - completed 2025-01-13

  • I would suggest adding message before the run, informing the user to... Edit - completed 2025-01-13

Check the following URLs for SAP Software Update Manager (SUM) monitoring or actions:

SUM Monitor - https://{{ ansible_fqdn }}:1129/lmsl/sumobserver/S01/monitor/index.html
SUM Admin - https://{{ ansible_fqdn }}:1129/lmsl/sumabap/S01/slui/
SUM Admin Utilities - https://{{ ansible_fqdn }}:1129/lmsl/sumabap/S01/slui_ext/

@rob0d
Copy link
Contributor Author

rob0d commented Jan 8, 2025

Hi @sean-freeman,

Sorry for not answering earlier. I was trying to push all the code I have so you guys can have a look and possibly test.

  1. SUM only gets executed on PAS or when running Onehost on CI.
  2. sap_swpm_sum_start will fix
  3. SUM will run for around 8 hours (longer for bigger stacks), so this would generate at least 480 messages making the log very long with everything else drowned by this. This is why I set it to 10 mins. Do you definitely want to change it to 60s? :)
  4. Info message - will add.

@sean-freeman
Copy link
Member

@rob0d no worries, easy to lose track in the complexity + the holiday period

  1. Some would argue, that SAP SWPM could be executed from any host (e.g. AAS, or Jump Box) and in the variables set SAP SWPM to target the PAS/Onehost (this is why we haven't specifically locked functionality in the past, as unadvised actions were preferred by some users). If we can back it up with a referenced SNote or help.sap.com article, I'm fine with locking.
  2. sap_swpm_sum_start
  3. delay: 600 # 10 minutes I will request @berndfinger to comment. Personally I'd think consistency is better and that ~500 lines of stdout is not much compared to the rest of a Playbook or the other output from sap_swpm Role. I'm not overly bothered, it's a prettification concern and has no impact on functionality itself.
  4. Info stdout ✅

Do you want to target this PR for 1.5.1 that is imminent, or 1.5.2 in a few weeks? The PR is ready except for minor stuff.

@berndfinger
Copy link
Member

berndfinger commented Jan 13, 2025

  1. delay: 600 # 10 minutes I will request @berndfinger to comment. Personally I'd think consistency is better and that ~500 lines of stdout is not much compared to the rest of a Playbook or the other output from sap_swpm Role. I'm not overly bothered, it's a prettification concern and has no impact on functionality itself.

For consistency (users might always expect one minutes interval for screen updates for such tasks), one minute would be the interval of choice. However, there is not much value in such a short interval except that in case of an error, the longest wait time is only one minute. If this task is expected to executed successful in most cases, we might as well choose a longer interval (e.g. 5 minutes). 10 minutes is maybe a bit too long. In any case, I think the task name should mention the update interval (e.g. name: Checking the status of SUM (SPAUINFO) every 5 minutes) so that users are made aware that there are longer than usual update intervals for this task.

@rob0d
Copy link
Contributor Author

rob0d commented Jan 13, 2025

Hi Guys,
Latest commit has some bugfixes that my previous commit introduced. I've retested the code and now it is correctly skipping to the end if SUM didn't stop in SPAUINFO phase. Also I've adjusted the interval to 5 mins with the message as per Bernd's suggestion.

Regarding 1)

Can't find too many details. However, SUM can run only on server which has application instance installed (https://help.sap.com/docs/SLTOOLSET/b0d775c8a9fd47b985a5e66ddba31d33/b255275804364a52ab5ffebf7ae4bdb4.html) and SWPM will simply not start SUM unless it's a PAS or OneHost installation. SUM can run only once an app server is installed and PAS is the first one so that the first time SUM can be started. If you run SWPM to install AAS or DB or anything else, it will skip SUM invocation regardless of what parameters you give it.

Relevant notes and extra doc (but none of them specifically says the stuff above):

@sean-freeman
Copy link
Member

@rob0d Good finds! Although I agree, not 100% clear.

"The Software Update Manager can run on any host with an application server instance." from link 1, suggests any PAS/AAS could work. However, back to the original comment - this will run after the Database Load (using SWPM run NW_ABAP_DB***), which historically would run from the PAS (although, I don't recall seeing that documented anywhere either).

Draft statement for code comment:

The Software Update Manager can run on any host with an Application Server instance (e.g. NWAS ABAP PAS/AAS, NWAS JAVA CI/AAS) with correct permissions to access /usr/sap/ and /sapmnt/ directories.

However, when using the Software Provisioning Manager (SWPM) with a Maintenance Planner Stack XML file to perform an "up-to-date installation" (UDI) - it will start the Software Update Manager (SUM) automatically at the end of the installation process, and this must be performed from the Primary Application Server instance (e.g. NWAS ABAP PAS, NWAS JAVA CI, or a OneHost installation).

Furthermore, during SWPM variable selection the enabling of Transport Management System (TMS) is required, see SAP Note 2522253 - SWPM can not call SUM automatically when doing the up-to-date installation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants