[CLOUD-2261] handling txn recovery by querying api #230
Conversation
if [ $probeStatus -eq 0 ] && [ "$(type -t probePodLog)" = 'function' ]; then
  # -- checking if server.log is clean from errors (only if function of the particular name exists)
  probePodLog # calling function from partitionPV.sh
  probeStatus=$?
Have you been able to verify that this is useful? By default, logging is sent to stdout. This may also levy requirements on the pod for storage related to logging (e.g. emptyDir volume mount).
Yes, to my knowledge this is useful, and I currently haven't found a better way to determine the outcome of the unavailable resource. I've also created a JIRA asking the QA guys to create a test for it: https://issues.jboss.org/browse/CLOUD-2519
Regarding the server log probing: it's not done by checking the file itself but by querying the OpenShift API. That way the data is taken from stdout, not from a file in particular.
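As a sketch of that approach (the helper names here are illustrative, not from the PR; the endpoint is the standard Kubernetes pod-log API, which serves the container's stdout, i.e. the same stream `oc logs` shows):

```python
# Hypothetical sketch of fetching a pod's log via the API instead of a file.
# The service-account paths match the in-cluster convention used by the PR.
TOKEN_FILE_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'

def build_pod_log_url(api_host, namespace, pod_name):
    # GET /api/v1/namespaces/{ns}/pods/{name}/log returns the container's stdout
    return 'https://%s/api/v1/namespaces/%s/pods/%s/log' % (api_host, namespace, pod_name)

def auth_headers(token):
    # bearer token read from TOKEN_FILE_PATH inside the pod
    return {'Authorization': 'Bearer ' + token}
```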
if [ $? -eq 0 ] ; then
if [ $probeStatus -eq 0 ] && [ "$(type -t probePodLog)" = 'function' ]; then
Does this check need to be able to discern the difference between an application failure and a probe failure (e.g. probe fails because server's taking too long to start)?
Agreed, the check currently is not capable of discerning a probe failure from an application failure. Is your point that an application failure (i.e. an ERROR during app startup) should not cause the migrate pod to be marked as unable to start?
More that the probe might fail because the management port isn't available yet, e.g. if it were run immediately after startup, i.e. it's still initializing. There is logic in the probe script to accommodate retries, e.g. it retries 30 times with a one second delay after each failure, but we're thinking of moving that configuration into the container's probe configuration directly, which could affect usage here. This is just something to consider.
I was (at least) aware of the fact that there are some retries for the startup initialization. The function which probes the log (i.e. downloads the log via an OpenShift API call) waits until the readiness probe returns success. At that point the script expects the container is started and it checks the content of the log.
I'm not sure what is meant by the "container's probe configuration". I expect such refactoring could affect this PR, but perhaps not much. The script here just waits to get a started app server. If it's possible to find out the status of the server (started OK / error on starting), then the functionality here will be just the same as presented in this PR.
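The wait-then-check flow described above can be sketched like this (a minimal illustration with assumed function names, not the actual script):

```python
import time

def wait_then_check_log(is_ready, fetch_log, attempts=30, delay=0):
    # Poll the readiness check; once it succeeds, pull the server log
    # (e.g. via the pod-log API) and report whether it is free of ERROR lines.
    for _ in range(attempts):
        if is_ready():
            log = fetch_log()
            return not any('ERROR' in line for line in log.splitlines())
        if delay:
            time.sleep(delay)
    # readiness never succeeded within the retry budget
    return False
```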
This is what I was referring to: https://docs.openshift.com/container-platform/3.9/dev_guide/application_health.html
In addition to what's listed there, you can also specify `periodSeconds`, `successThreshold` and `failureThreshold`, the first and last of which could replace the settings in the probe itself. That's what I was talking about.
OK. Do I understand correctly that the probe script will stay the same as it is now (https://github.com/jboss-openshift/application-templates/blob/master/eap/eap71-tx-recovery-s2i.json#L378), but you would use `periodSeconds` instead of the `COUNT` which is currently in use (https://github.com/ochaloup/cct_module/blob/master/os-eap-probes/added/readinessProbe.sh#L9)? How will this impact the `openshift-migrate-common.sh` script? I mean, you still need to somehow determine that the app server is ready to serve the CLI commands that the function `runMigration()` uses. Will the probe script need to be run manually in the script, or how will that work?
Anyway, I'm happy to change the code in whatever way you consider best to follow the new practice.
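For illustration (a hypothetical helper, not part of the PR): the retry budget of the script's internal loop and of a container-level probe configuration is the same piece of arithmetic, so moving the settings keeps behaviour if the product stays constant:

```python
def retry_budget_seconds(attempts, delay_seconds):
    # The script loop spends COUNT * SLEEP seconds retrying before giving up;
    # a container probe effectively allows failureThreshold * periodSeconds.
    return attempts * delay_seconds

# the current script defaults (COUNT=30, SLEEP=5) give a 150 s budget;
# periodSeconds=5 with failureThreshold=30 would preserve that budget
```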
I don't think anything needs to change. I'm just pointing out that if we go this route, we need to ensure we preserve this type of functionality in the probes. There are other products that are looking at switching to configuring some of this on the pods, because it works better for their use case. That said, I think you could hard code those environment variables in your migration script to future proof yourself, should the defaults ever change (e.g. go to 1/1).
I see. I'm just now wondering whether it influences this verification of the server log content. I named it a probe, but it was not meant to be used for checking container readiness; it was meant to check whether the migration action can be processed. The migration pod itself is already marked as live and ready, but this check plays its role while the migration pod starts the app server which then processes the migration action. If the app container is not ready to run, or there is some problem in the server log, it does not mean the migration pod is marked as "not ready". It just means it tries to start the app server again in a while.
@@ -8,7 +8,7 @@ LOG=/tmp/readiness-log
 COUNT=30
 SLEEP=5
-DEBUG=false
+DEBUG=${SCRIPT_DEBUG:-false}
This seems out of scope, but innocuous.
I think this helps quite a bit during investigation. The `cct_module` defines the `SCRIPT_DEBUG` env variable to cause `set -x` to be set up. Here we get debug information from the Python readiness probe as well when the env variable is set. I used that with success while investigating errors.
@@ -0,0 +1,202 @@
source ${JBOSS_HOME}/bin/launch/openshift-node-name.sh
Shouldn't these be rolled into the existing partitionPV.sh script? My concern is that we're diverging and will have to start maintaining these scripts in parallel.
You mean the `os-eap-txn-partition` vs. `os-partition` module, right? That's what I was pretty unsure about how to do. The `os-partition` module is not used only for txn recovery but for message migration too, while my concern and testing was the txn recovery. If you think it's better, I can try to merge those two modules into one, but there will (probably?) still need to be two kinds of algorithms: the `split-lock` approach used in the messaging case, and the other for txn recovery. Or, if you prefer, I can try to remove the `split-lock` approach completely and replace it with the txn API query approach; I'm just not sure what issues could turn up.
What is your idea about the approach? One `os-partition` module based on the txn recovery algorithm?
I thought the two were delegating the recovery work and other app-specific details. This script is sourced by the product scripts, which provide implementations for `runMigration()`. I would expect the same paradigm to be used. I believe the basic script implements a strategy pattern (e.g. split the dir, add markers, cleanup, etc.), where certain steps have product-specific implementations (e.g. `runMigration()`). bash is a bit of a pita for this, but I think that's why we're sourcing the file: so the consumer can provide implementations for the stubs.
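A Python analogue of that source-plus-stubs strategy pattern (illustrative only; the real scripts are bash and all class and method names here are assumptions):

```python
class PartitionMigration:
    """Generic driver: split dir, add markers, cleanup -- steps are fixed,
    but the product-specific step is left as a stub for the consumer."""

    def run_migration(self, pod_dir):
        # equivalent of the runMigration() stub the product script must provide
        raise NotImplementedError('product script must implement this step')

    def migrate(self, pod_dirs):
        migrated = []
        for d in pod_dirs:
            self.run_migration(d)  # delegate the product-specific work
            migrated.append(d)     # then the generic cleanup/marker handling
        return migrated

class EapTxnMigration(PartitionMigration):
    def run_migration(self, pod_dir):
        # the EAP flavour would run transaction recovery against pod_dir here
        pass
```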
I see. I'm sorry, but I don't follow what should be changed in my case. I was thinking along similar lines (even if not specified as precisely as you did). There is a new module `os-eap-txn-partition` which provides the implementation of the migration itself. The root scripts (which call `runMigration()`) are untouched (...OK, that's not quite true, as I needed a way to provide slightly different parameters than originally intended; still, I tried not to touch any original behavior and just added a line to get my implementation working).
@rcernich would you be so kind as to give me a few pointers on what you expect to be done? Many thanks!
TOKEN_FILE_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/token'
NAMESPACE_FILE_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/namespace'
CERT_FILE_PATH = '/var/run/secrets/kubernetes.io/serviceaccount/ca.crt'
STATUS_LIVING_PODS = ['Pending', 'Running', 'Unknown']
Do we need to be concerned with pods that might be bouncing, e.g. crash loop backoff? (I'm not sure what state those would be in, or if there would be anything to recover if that happened. Just asking.)
Yes, I think we need to be concerned, but I took the reference for the possible pod status values that can be received from the API from the Kubernetes docs: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-and-container-status
I could be wrong, but I hope this is OK.
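A sketch of how the `STATUS_LIVING_PODS` filter above could be applied to the JSON shape returned by `GET /api/v1/namespaces/{ns}/pods` (the helper name is an assumption; `items[].metadata.name` and `items[].status.phase` are the documented fields):

```python
STATUS_LIVING_PODS = ['Pending', 'Running', 'Unknown']

def living_pod_names(pod_list):
    # keep only pods whose phase means they may still be (or become) alive;
    # Succeeded/Failed pods are candidates for recovery of their data instead
    return [p['metadata']['name']
            for p in pod_list.get('items', [])
            if p['status']['phase'] in STATUS_LIVING_PODS]
```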
Sounds good. Just checking.
Force-pushed from 4252db2 to 5e9de06 (compare)
Hi @rcernich. I cooperate with @tremes on testing this feature, see https://issues.jboss.org/browse/CLOUD-2261. It's now approved from the testing point of view. Is there something more that holds this PR from being merged? Do you expect some further work on integrating a similar approach for the messaging?
I was hoping that the new split logic would be done in a generic way, such that it could be used for both EAP and AMQ like the current logic. What's generic here is splitting a "data" directory for a cluster and cleaning up the partitions once its owner has been terminated. Is it no longer possible to use the same mechanism for this?
Force-pushed from 5e9de06 to f30c044 (compare)
Hi @rcernich, I've finally changed my PR for all tenants of the
Structurally, this looks good to me.
Force-pushed from 95ccff6 to 8145f43 (compare)
I haven't done any structural change. I just fixed function
This set of changes doesn't work with AMQ. The build fails because JBOSS_HOME is not set. Also, partitionPV.sh makes use of openshift-node-name.sh, which is placed in JBOSS_HOME/bin/launch, which would not be available in an AMQ image. I'll see if I can rework things to get something built.
@rcernich I see, that's my omission. Please let me know whether I should start digging into this too. Thanks.
I'm working through the builds and the unit tests. I've got amq63 built and am working through test failures with eap64. Haven't tried to build eap71 yet.
Force-pushed from f324ca2 to fa29ac3 (compare)
Force-pushed from a5d7d1b to 6d4dafa (compare)
Force-pushed from 6d4dafa to 909af50 (compare)
As agreed with @rcernich, I updated the PR to use two migration approaches (not a single one merged for amq/jdg and eap). The EAP crash recovery uses the query API approach, where directories are named by pod name. The amq/jdg images are expected to use the flock solution, where directories are named with numbers like split-#.
…plit functionality
Force-pushed from 909af50 to b918476 (compare)
Looks okay to me.
https://issues.jboss.org/browse/CLOUD-2261
Hopefully a non-invasive addition of the txn recovery mechanism, based on the algorithm presented at https://docs.google.com/document/d/1JdOTMP6pdexJ__KYlw9hgwR6DOCHTtYCQy7PqyMk8OU/edit, which addresses the issue mentioned in CLOUD-2261.
/cc @rcernich