xapi_ha: avoid raising Not_found when joining a liveset #6734
I thought I had make-checked it after rebasing on top of master. Some code changes used the changed functions, so I need to make further changes.
      List.iter
        (fun sr ->
          let vdi = Xha_statefile.find_or_create ~__context ~sr ~cluster_stack in
          statefile_vdis := vdi :: !statefile_vdis
        )
    -   srs ;
    +   [sr] ;
Could drop this List.iter since the list only has a single element
    List.iter
      (fun (_, exn) ->
        (* Perform a disable since the pool HA state isn't consistent *)
        error "Attempting to disable HA pool-wide" ;
        Helpers.log_exn_continue
          "Disabling HA after a failure during enable" disable_internal
          __context ;
        raise exn
      )
      errors ;
This and above changed the code from raising the first exception to raising all of them one after another - I guess it'd be hard for anything to depend on this, so it should be alright (might increase noise in the logs though)
Because we don't have resumable exceptions, only the first one is actually reported, or none at all
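To illustrate the point about non-resumable exceptions, here is a minimal standalone sketch. It does not use the xapi code: the `errors` list and the host names are made up for the example. Raising inside the `List.iter` callback aborts the iteration, so only the first element is ever visited, and an empty list raises nothing.

```ocaml
(* Hypothetical stand-in for the (host, exn) list iterated in the diff. *)
let errors = [("host-a", Failure "first"); ("host-b", Failure "second")]

(* Records which elements the callback actually reached. *)
let visited = ref []

(* The first [raise] unwinds out of List.iter; later elements are skipped. *)
let reported =
  try
    List.iter
      (fun (host, exn) ->
        visited := host :: !visited ;
        raise exn
      )
      errors ;
    None
  with Failure msg -> Some msg
```

After this runs, `reported` holds the message of the first exception and `visited` contains only the first host.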
The Not_found and hd exceptions keep popping up, and it's difficult to find them when there are no backtraces logged. Remove usages of them, even if they are not problematic, so the actual problematic ones can be flushed out over time.
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
Until now, the valid string values were not written down anywhere; this change fixes that. Ideally this would be done using an enum in the IDL, but unfortunately that changes the types of existing parameters in API calls, so it's quite risky. Instead, take a conservative approach: only enumerate the valid values and make Constants the only source of truth for them, including the default ones, which were previously scattered around.
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
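A rough sketch of what "enumerate the valid values in one place" can look like follows. The module name, constructor names and string spellings are assumptions made for illustration, not the definitions in xapi's Constants.

```ocaml
(* Hypothetical module; names and string values are assumptions. *)
module Cluster_stack = struct
  type t = Xhad | Corosync

  (* Single place where the default is defined. *)
  let default = Xhad

  (* Single place where the string spelling of each value is defined. *)
  let to_string = function Xhad -> "xhad" | Corosync -> "corosync"

  (* Every valid value, used to derive the parser from [to_string]. *)
  let all = [Xhad; Corosync]

  (* Returns None for unknown strings instead of raising. *)
  let of_string s = List.find_opt (fun t -> to_string t = s) all
end
```

Because `of_string` is derived from `to_string`, the two cannot drift apart, and the API can keep accepting plain strings while the internals work with the variant.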
Previously it was opened if the default stack was selected, but this could actually be different from XHAd.
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
Allows callers to avoid exceptions. The previous get is now called get_exn, so it's clear which users have to be mindful of exceptions. Not all get_exn calls were converted to get: the previous behaviour was widespread, and making the change without changing behaviour is not trivial, so it is better to do it only once it's detected.
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
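The get / get_exn naming convention can be sketched as below. This is a generic illustration over a made-up lookup table, not the module touched by the commit; the exception raised by `get_exn` here is also an assumption.

```ocaml
(* Hypothetical data standing in for whatever the real module looks up. *)
let table = [("a", 1); ("b", 2)]

(* Non-raising variant: a missing value is visible in the return type. *)
let get key = List.assoc_opt key table

(* Raising variant: the _exn suffix warns callers about the exception. *)
let get_exn key =
  match get key with
  | Some v ->
      v
  | None ->
      invalid_arg (Printf.sprintf "get_exn: no value for %S" key)
```

Callers that can handle absence pattern-match on `get`; the remaining `get_exn` call sites are easy to grep for when hunting down unexpected exceptions.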
Joining the liveset has been found to raise `Not_found` in some cases. Transform these exceptions to others that are more readable and show the exact cause.
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
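As a sketch of the kind of transformation described, the snippet below replaces a bare `Not_found` with an exception that names what was missing. The exception, the function and the shape of `ha_peers` (an association list from IP address to an opaque identifier) are all hypothetical; the actual xapi code and error types may differ.

```ocaml
(* Hypothetical exception carrying the address that could not be found. *)
exception Coordinator_not_in_peers of string

(* [ha_peers] is assumed to be an association list keyed by IP address.
   List.assoc would raise a bare Not_found here; the option-returning
   lookup lets us raise something that states the exact cause. *)
let coordinator_peer ~ha_peers ip =
  match List.assoc_opt ip ha_peers with
  | Some peer ->
      peer
  | None ->
      raise (Coordinator_not_in_peers ip)
```

A log line produced from `Coordinator_not_in_peers "10.0.0.2"` points directly at the missing address, which a plain `Not_found` without a backtrace does not.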
Signed-off-by: Pau Ruiz Safont <pau.safont@vates.tech>
Fixed another bug caused by the cluster stack values being stringy, so I created a variant to model them. I would have liked to use an enum at the API level, but that is very invasive and we know how problematic that can be.
An xcp-ng user reported a failure when enabling HA, with the only error being a `Not_found`. With this change we now know that it's because the IP of the coordinator is not present in the local database's `ha_peers` value. While doing this: