Changes for OWLS-83136 - Limit concurrent pod shutdowns during a cluster shrink #1892
I see unit tests to verify that your setters work. I see no unit tests for the actual change in server shutdown behavior, which is what is really important.
The unit tests for server shutdown behavior are in ServerDownIteratorStepTest.java, which is a newly added file. I have added a DelayedPodAwaiterStepFactory class in PodHelperTestBase.java to simulate waiting for a pod to be deleted before proceeding to delete the next pod. Please let me know if I missed something. Thanks.
Overall, it looks like a thorough job. My comments are mostly about code readability and maintainability.
operator/src/test/java/oracle/kubernetes/operator/steps/ServerDownIteratorStepTest.java
Outdated
domainPresenceInfo.isServerPodBeingDeleted(MS3), is(Boolean.TRUE));
}

@NotNull
This code can be simplified by extracting a method and using Stream.collect:
@NotNull
private List<ServerShutdownInfo> createServerShutdownInfosForCluster(String clusterName, String... servers) {
  return Arrays.stream(servers).map(s -> createShutdownInfo(clusterName, s)).collect(Collectors.toList());
}

private ServerShutdownInfo createShutdownInfo(String clusterName, String serverName) {
  return new ServerShutdownInfo(configSupport.getWlsServer(clusterName, serverName).getName(), clusterName);
}
return serverShutdownInfos;
}

@NotNull
This code can be simplified:
@NotNull
private List<ServerShutdownInfo> createServerShutdownInfos(String... servers) {
  return Arrays.stream(servers).map(this::createShutdownInfo).collect(Collectors.toList());
}

@Nonnull
private ServerShutdownInfo createShutdownInfo(String server) {
  return new ServerShutdownInfo(configSupport.getWlsServer(server).getName(), null);
}
testSupport
    .addDomainPresenceInfo(domainPresenceInfo);

invokeStepWithServerShutdownInfos(createServerShutdownInfosForCluster(CLUSTER,MS1, MS2));
This line that executes the code you're testing is repeated in each test, with the only real differences being that in some tests you create infos without a cluster, in some you create them with a cluster, and in one you create them for multiple clusters. As written now, it is hard at first glance to see what is going on.
This could be clarified in a number of ways. Maybe the cleanest would be to use a builder pattern, so that you could do something like:
createShutdownInfos()
    .forServers(MS1, MS2)
    .forClusteredServers(CLUSTER, MS3, MS4)
    .shutdown();
That would greatly improve readability of the tests. Also, please make sure to leave a blank line between the execution portion of each unit test and the assertions.
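A minimal sketch of what such a builder could look like, for illustration only. The class name `ShutdownInfoBuilderSketch` and the nested `ServerShutdownInfo` stand-in are hypothetical and are not the operator's real classes; in the actual test, the terminal `shutdown()` call would presumably invoke the step under test rather than just return the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical builder sketch; not the code that was merged in this PR.
final class ShutdownInfoBuilderSketch {

  // Simplified stand-in for the operator's ServerShutdownInfo (assumption):
  // only the server name and an optional cluster name are modeled.
  static final class ServerShutdownInfo {
    final String serverName;
    final String clusterName; // null for a non-clustered server

    ServerShutdownInfo(String serverName, String clusterName) {
      this.serverName = serverName;
      this.clusterName = clusterName;
    }
  }

  private final List<ServerShutdownInfo> infos = new ArrayList<>();

  static ShutdownInfoBuilderSketch createShutdownInfos() {
    return new ShutdownInfoBuilderSketch();
  }

  // Adds non-clustered servers.
  ShutdownInfoBuilderSketch forServers(String... servers) {
    Arrays.stream(servers).forEach(s -> infos.add(new ServerShutdownInfo(s, null)));
    return this;
  }

  // Adds servers that belong to the named cluster.
  ShutdownInfoBuilderSketch forClusteredServers(String clusterName, String... servers) {
    Arrays.stream(servers).forEach(s -> infos.add(new ServerShutdownInfo(s, clusterName)));
    return this;
  }

  // Terminal operation. Here it only returns a copy of the accumulated infos;
  // a real test helper would pass them to the step being tested.
  List<ServerShutdownInfo> shutdown() {
    return new ArrayList<>(infos);
  }
}
```

With this shape, each test states in one chained expression which servers are standalone and which belong to which cluster.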
.addDomainPresenceInfo(domainPresenceInfo);

invokeStepWithServerShutdownInfos(createServerShutdownInfosForCluster(CLUSTER,MS1, MS2));
MatcherAssert.assertThat(
Get in the habit of doing static imports for assertThat. Showing the class name doesn't help readability.
In addition, you invoke DomainPresenceInfo.isServerPodBeingDeleted a lot. It would be worthwhile to have a method that simply returns the names of all of the pods being deleted. That way you could do a single assert on the collection.
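One way to sketch the suggested helper, assuming the per-server check can be passed in as a predicate. The class and method names here are hypothetical, and the helper is written against `java.util.function.Predicate` so it does not depend on the real `DomainPresenceInfo` API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical test helper; names are illustrative only.
final class PodsBeingDeletedSketch {

  // Returns, in the order given, the server names for which the predicate
  // reports that the pod is being deleted.
  static List<String> getServersBeingDeleted(
      Predicate<String> isServerPodBeingDeleted, String... serverNames) {
    return Arrays.stream(serverNames)
        .filter(isServerPodBeingDeleted)
        .collect(Collectors.toList());
  }
}
```

A test could then make a single collection assertion, for example with Hamcrest's `containsInAnyOrder`, instead of one assertion per server. Passing `domainPresenceInfo::isServerPodBeingDeleted` as the predicate would work only if that method never returns null, which is an assumption here.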
public void withConcurrencyOf0_clusteredServersShutdownConcurrently() {
  configureCluster(CLUSTER).withMaxConcurrentShutdown(0);
  addWlsCluster(CLUSTER, PORT, MS1, MS2);
Remove blank lines in the middle of your test setup, reserving them as dividers between setup, execution, and validation.
}

@Test
public void withReplicaCountOf0AndConcurrencyOf1_clusteredServersShutdownConcurrently() {
What is going on in this test? Why do the servers shut down concurrently with a concurrency of 1? There seems to be something important going on, and the name of the test doesn't clarify it.
OWLS-83136 has the following requirement:
If a WL cluster shrinks to '0' (i.e., the replica count is set to 0), this indicates that the administrator isn't concerned about preserving replicated/volatile state, so there's no need to shut down one at a time, and we should do a concurrent shutdown. We ignore the concurrency value when the replica count is set to 0.
I can look into changing the test name to clarify it. Please let me know if you have any suggestions.
How about whenClusterShutdown_concurrencySettingIsIgnored? (A replica count of 0 means the cluster is shutting down, which is why we don't care about concurrency.)
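The rule discussed in this thread can be sketched as a small pure function. This is only an illustration of the stated requirement, not the operator's implementation: the class name, method name, and the use of `Integer.MAX_VALUE` to represent "no limit" are assumptions.

```java
// Hypothetical sketch of the rule from OWLS-83136 as described in this thread.
final class ShutdownConcurrencySketch {

  // Assumption: "no limit" is represented as Integer.MAX_VALUE.
  static final int UNLIMITED = Integer.MAX_VALUE;

  // replicaCount: the cluster's target replica count.
  // maxConcurrentShutdown: the configured limit, where 0 means "no limit".
  static int effectiveMaxConcurrentShutdown(int replicaCount, int maxConcurrentShutdown) {
    if (replicaCount == 0) {
      // The whole cluster is shutting down, so the configured limit is ignored.
      return UNLIMITED;
    }
    return maxConcurrentShutdown == 0 ? UNLIMITED : maxConcurrentShutdown;
  }
}
```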
}

@Test
public void withConcurrencyOf2AndReplicaCount1_3clusteredServersShutdownIn2Threads() {
What is going on in this test? Why do the servers shut down in separate threads? There seems to be something important going on, and the name of the test doesn't clarify it. Is this actually two separate cases? If so, perhaps it should be two tests.
It's meant to test that the 3rd clustered server is terminated only after one of the previous two servers has completely shut down (since concurrency is 2). I'll change the name to withConcurrencyOf2AndReplicaCount1_3rdClusteredServerIsShutdownAfterPreviousPodTerminated(). Please let me know if this is not clear or if you have any suggestions.
The problem with the name is that it doesn't really inform the reader of what is going on. The point, apparently, is that the concurrency setting limits the number of simultaneous servers shutting down. The replica count is only relevant in that you have reduced it.
So maybe something like, whenMaxConcurrentShutdownSet_limitNumberOfServersShuttingDownAtOnce
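To make the behavior under test concrete, here is a small deterministic model of "at most N servers shutting down at once". It is a sketch only: the operator does this with steps and pod watches rather than a class like this, all names are hypothetical, and the limit is assumed to be positive (the 0-means-unlimited case is left out).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of limited concurrent shutdown; not the operator's code.
final class LimitedShutdownSketch {
  private final int maxConcurrent; // assumed to be >= 1
  private final Deque<String> pending = new ArrayDeque<>();
  private final Set<String> shuttingDown = new LinkedHashSet<>();

  LimitedShutdownSketch(int maxConcurrent, List<String> servers) {
    this.maxConcurrent = maxConcurrent;
    pending.addAll(servers);
    startAsManyAsAllowed();
  }

  // Called once a pod has been fully deleted; this frees a slot for the next server.
  void podTerminated(String server) {
    if (shuttingDown.remove(server)) {
      startAsManyAsAllowed();
    }
  }

  // Servers whose shutdown has started but whose pods are not yet gone.
  List<String> getServersShuttingDown() {
    return new ArrayList<>(shuttingDown);
  }

  private void startAsManyAsAllowed() {
    while (!pending.isEmpty() && shuttingDown.size() < maxConcurrent) {
      shuttingDown.add(pending.poll());
    }
  }
}
```

With a limit of 2 and three servers, the third server enters the shutting-down set only after one of the first two has been reported as terminated, which is the behavior the renamed test is meant to describe.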
}

@Test
public void withConcurrencyOf2AndReplicaCount0_4clusteredServersShutdownConcurrently() {
What is going on in this test? What is the significance of the concurrency of 2, here? The name doesn't explain.
This test is similar to withReplicaCountOf0AndConcurrencyOf1_clusteredServersShutdownConcurrently(). When the replica count is set to 0, it indicates that the administrator isn't concerned about preserving replicated/volatile state, so there's no need to shut down one at a time. We ignore the concurrency value when the replica count is set to 0. This test has 4 servers and a concurrency value of 2 (as opposed to 2 servers and a concurrency of 1 in withReplicaCountOf0AndConcurrencyOf1_clusteredServersShutdownConcurrently()).
So it tests the same thing, only with multiple clusters? Use more or less the same wording suggested above, then, only indicate multiple clusters. I'm not sure how much this adds. Was any additional code needed to make this pass that wasn't needed for the other test?
OTOH, it might be interesting to prove that you can shut down one cluster without paying attention to concurrency limits while simultaneously shrinking another cluster and having that one pay attention to those limits.
@@ -1183,6 +1184,29 @@ public Step deletePodAsync(
responseStep, new RequestParams("deletePod", namespace, name, deleteOptions, domainUid), deletePod);
}

/**
The ability to add a custom retry strategy seems pretty general and you seem to be adding a fair bit of code to make it work only for the deletePod case. A better pattern, I think, would be to add a withRetryStrategy method to CallBuilder, analogous to withFieldSelector and withLabelSelectors, and pass it along in that fashion. That way it could be used in other cases. At least look at how difficult that would be to do.
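A rough sketch of the fluent pattern being proposed. Everything here is hypothetical: `CallBuilderSketch` is not the operator's `CallBuilder`, the `RetryStrategy` interface shown is a simplified stand-in whose signature is an assumption, and the default of "never retry" is chosen only to keep the example small.

```java
// Hypothetical fluent builder illustrating the suggested withRetryStrategy method.
final class CallBuilderSketch {

  // Simplified, assumed retry contract; the operator's real interface may differ.
  interface RetryStrategy {
    boolean shouldRetry(int attempt, int statusCode);
  }

  private String fieldSelector;
  private String labelSelector;
  private RetryStrategy retryStrategy = (attempt, statusCode) -> false; // assumed default

  CallBuilderSketch withFieldSelector(String fieldSelector) {
    this.fieldSelector = fieldSelector;
    return this;
  }

  CallBuilderSketch withLabelSelectors(String... selectors) {
    this.labelSelector = String.join(",", selectors);
    return this;
  }

  // The new option sits next to the existing ones, so any call type can use it.
  CallBuilderSketch withRetryStrategy(RetryStrategy retryStrategy) {
    this.retryStrategy = retryStrategy;
    return this;
  }

  String getFieldSelector() {
    return fieldSelector;
  }

  String getLabelSelector() {
    return labelSelector;
  }

  RetryStrategy getRetryStrategy() {
    return retryStrategy;
  }
}
```

The design point is that the strategy becomes one more optional setting carried by the builder, rather than an extra parameter threaded through a deletePod-specific method.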
@russgold Thanks for your review comments. I have addressed them in latest commit, please let me know if I missed something.
@@ -958,6 +975,20 @@ public void setMaxClusterConcurrentStartup(Integer maxClusterConcurrentStartup)
this.maxClusterConcurrentStartup = maxClusterConcurrentStartup;
}

private int getMaxConcurrentShutdownFor(Cluster cluster) {
A simpler way to do this would be:
return Optional.ofNullable(cluster).map(Cluster::getMaxConcurrentShutdown).orElse(getMaxClusterConcurrentShutdown());
where getMaxClusterConcurrentShutdown()
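A self-contained sketch of how that one-liner behaves. The `Cluster` class here is a minimal stand-in, and the domain-level getter falling back to 1 is an assumption based on the field descriptions elsewhere in this PR, not on code shown in the thread.

```java
import java.util.Optional;

// Hypothetical stand-in for the domain spec; names mirror the review comment.
final class MaxConcurrentShutdownSketch {

  // Minimal stand-in for the operator's Cluster configuration (assumption).
  static final class Cluster {
    private final Integer maxConcurrentShutdown;

    Cluster(Integer maxConcurrentShutdown) {
      this.maxConcurrentShutdown = maxConcurrentShutdown;
    }

    Integer getMaxConcurrentShutdown() {
      return maxConcurrentShutdown;
    }
  }

  private final Integer maxClusterConcurrentShutdown; // domain-level setting, may be null

  MaxConcurrentShutdownSketch(Integer maxClusterConcurrentShutdown) {
    this.maxClusterConcurrentShutdown = maxClusterConcurrentShutdown;
  }

  // Assumed behavior: an unset domain-level value defaults to 1.
  Integer getMaxClusterConcurrentShutdown() {
    return Optional.ofNullable(maxClusterConcurrentShutdown).orElse(1);
  }

  // Optional.map yields an empty Optional when the getter returns null, so both a
  // missing cluster and an unset cluster value fall back to the domain-level setting.
  int getMaxConcurrentShutdownFor(Cluster cluster) {
    return Optional.ofNullable(cluster)
        .map(Cluster::getMaxConcurrentShutdown)
        .orElse(getMaxClusterConcurrentShutdown());
  }
}
```

Note that a cluster value of 0 is a real setting (no limit) and is returned as-is rather than treated as unset.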
Fixed.
}

@Test
public void whenNotSpecified_maxConcurrentShutdownFromDomain() {
This line should not be needed. The point is that the cluster is not configured.
Fixed
@russgold I have made changes to the unit test method names and changed withConcurrencyOf2AndReplicaCount0_4clusteredServersShutdownConcurrently() to use 2 clusters, such that one cluster is shut down (with 0 replicas) while another cluster is simultaneously being shrunk. The concurrency limit is ignored for the cluster that is shutting down and honored for the cluster that is shrinking. The name of the new method is withMultipleClusters_concurrencySettingIsIgnoredForShuttingDownClusterAndHonoredForShrinkingCluster().
With the possible exception of the AsyncRequestStep changes, which I expect Ryan to look at, this looks fine to me.
docs/domains/Domain.json
Outdated
@@ -117,6 +117,11 @@
"description": "Customization affecting Kubernetes Service generated for this WebLogic cluster.",
"$ref": "#/definitions/KubernetesResource"
},
"maxConcurrentShutdown": {
"description": "The maximum WebLogic Server instances that will shutdown in parallel for this cluster when it is being partially shutdown by lowering its replica count.A value of 0 means there is no limit. Defaults to `spec.maxClusterConcurrentShutdown` (which in turn defaults to 1).",
@rosemarymarano, can you please also review this description? Thanks. A couple of proposed edits on this description:
1. "The maximum" -> "The maximum number of".
2. "shutdown" -> "shut down". As a verb, it's "shut down", while "shutdown" is a noun.
3. "count.A value" -> "count. A value" (add space).
4. " (which in turn defaults to 1)" -> ", which defaults to 1".
@@ -208,6 +209,15 @@
@Range(minimum = 0)
private Integer maxClusterConcurrentStartup;

@Description(
"The default maximum WebLogic Server instances that a cluster will shutdown in parallel when it is being "
Surprisingly, this text is slightly different. "The default maximum number of ". "shutdown" -> "shut down". "attribute" -> "field". "Defaults to 1." (add period).
@@ -886,7 +902,8 @@ public boolean equals(Object other) {
.append(configOverrides, rhs.configOverrides)
.append(configOverrideSecrets, rhs.configOverrideSecrets)
.append(isAllowReplicasBelowMinDynClusterSize(), rhs.isAllowReplicasBelowMinDynClusterSize())
.append(getMaxClusterConcurrentStartup(), rhs.getMaxClusterConcurrentStartup());
.append(getMaxClusterConcurrentStartup(), rhs.getMaxClusterConcurrentStartup())
Missing change to toString()
@@ -224,6 +241,7 @@ public boolean equals(Object o) {
.append(maxUnavailable, cluster.maxUnavailable)
.append(allowReplicasBelowMinDynClusterSize, cluster.allowReplicasBelowMinDynClusterSize)
.append(maxConcurrentStartup, cluster.maxConcurrentStartup)
.append(maxConcurrentShutdown, cluster.maxConcurrentShutdown)
Missing change to toString()
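The point of this comment is that a new field should appear in `equals`, `hashCode`, and `toString` together. A plain-JDK sketch of that convention follows; the class name is hypothetical, and it uses `java.util.Objects` instead of the builder-style `.append(...)` calls seen in the diff, purely to stay self-contained.

```java
import java.util.Objects;

// Hypothetical two-field class showing the new field in all three methods.
final class ClusterSpecSketch {
  private final Integer maxConcurrentStartup;
  private final Integer maxConcurrentShutdown; // the newly added field

  ClusterSpecSketch(Integer maxConcurrentStartup, Integer maxConcurrentShutdown) {
    this.maxConcurrentStartup = maxConcurrentStartup;
    this.maxConcurrentShutdown = maxConcurrentShutdown;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof ClusterSpecSketch)) {
      return false;
    }
    ClusterSpecSketch other = (ClusterSpecSketch) o;
    return Objects.equals(maxConcurrentStartup, other.maxConcurrentStartup)
        && Objects.equals(maxConcurrentShutdown, other.maxConcurrentShutdown);
  }

  @Override
  public int hashCode() {
    return Objects.hash(maxConcurrentStartup, maxConcurrentShutdown);
  }

  // Without the new field here, logs and assertion messages would hide its value.
  @Override
  public String toString() {
    return "ClusterSpecSketch{maxConcurrentStartup=" + maxConcurrentStartup
        + ", maxConcurrentShutdown=" + maxConcurrentShutdown + "}";
  }
}
```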
@@ -88,6 +88,15 @@
@Range(minimum = 0)
private Integer maxConcurrentStartup;

@Description(
"The maximum WebLogic Server instances that will shutdown in parallel "
"The maximum number of WebLogic". "shutdown" -> "shut down". "count. " (add space). Add a comma after "spec.maxClusterConcurrentShutdown" and then change "(which in turn defaults to 1)." to "which defaults to 1."
@ankedia, very minor comments and wordsmithing on the description and then I will approve and merge.
My review comments are the same as Ryan's. It looks like these corrections need to be made throughout the involved files.
I agree with all the edits you proposed.
The maximum WebLogic Server instances -> The maximum number of WebLogic Server instances
shutdown -> shut down (globally, shut down (v), shutdown (n) )
Space needed after count.
which in turn defaults to 1 -> which defaults to 1
docs/domains/Domain.json
Outdated
@@ -354,6 +359,11 @@
"type": "number",
"minimum": 0
},
"maxClusterConcurrentShutdown": {
"description": "The default maximum WebLogic Server instances that a cluster will shutdown in parallel when it is being partially shutdown by lowering its replica count. You can override this default on a per cluster basis by setting the cluster\u0027s `maxConcurrentShutdown` attribute. A value of 0 means there is no limit. Defaults to 1",
The default maximum WebLogic Server instances -> The default maximum number of WebLogic Server instances
shutdown -> shut down
Defaults to 1", -> Defaults to 1.",
docs/domains/Domain.md
Outdated
@@ -34,6 +34,7 @@ The specification of the operation of the WebLogic domain. Required. | |||
| `logHome` | string | The directory in a server's container in which to store the domain, Node Manager, server logs, server *.out, introspector .out, and optionally HTTP access log files if `httpAccessLogInLogHome` is true. Ignored if `logHomeEnabled` is false. | | |||
| `logHomeEnabled` | Boolean | Specifies whether the log home folder is enabled. Defaults to true if `domainHomeSourceType` is PersistentVolume; false, otherwise. | | |||
| `managedServers` | array of [Managed Server](#managed-server) | Lifecycle options for individual Managed Servers, including Java options, environment variables, additional Pod content, and the ability to explicitly start, stop, or restart a named server instance. The `serverName` field of each entry must match a Managed Server that already exists in the WebLogic domain configuration or that matches a dynamic cluster member based on the server template. | | |||
| `maxClusterConcurrentShutdown` | number | The default maximum WebLogic Server instances that a cluster will shutdown in parallel when it is being partially shutdown by lowering its replica count. You can override this default on a per cluster basis by setting the cluster's `maxConcurrentShutdown` attribute. A value of 0 means there is no limit. Defaults to 1 | |
default maximum WebLogic Server instances -> default maximum number of WebLogic Server instances
shutdown -> shut down (globally)
Defaults to 1-> Defaults to 1.
docs/domains/Domain.md
Outdated
@@ -75,6 +76,7 @@ The current status of the operation of the WebLogic domain. Updated automaticall | |||
| `allowReplicasBelowMinDynClusterSize` | Boolean | Specifies whether the number of running cluster members is allowed to drop below the minimum dynamic cluster size configured in the WebLogic domain configuration. Otherwise, the operator will ensure that the number of running cluster members is not less than the minimum dynamic cluster setting. This setting applies to dynamic clusters only. Defaults to true. | | |||
| `clusterName` | string | The name of the cluster. This value must match the name of a WebLogic cluster already defined in the WebLogic domain configuration. Required. | | |||
| `clusterService` | [Kubernetes Resource](#kubernetes-resource) | Customization affecting Kubernetes Service generated for this WebLogic cluster. | | |||
| `maxConcurrentShutdown` | number | The maximum WebLogic Server instances that will shutdown in parallel for this cluster when it is being partially shutdown by lowering its replica count.A value of 0 means there is no limit. Defaults to `spec.maxClusterConcurrentShutdown` (which in turn defaults to 1). | |
The maximum WebLogic Server instances -> The maximum number of WebLogic Server instances
shutdown -> shut down (globally, shut down (v), shutdown (n) )
Space needed after count.
which in turn defaults to 1 -> which defaults to 1
@rjeberhard and @rosemarymarano Thanks for your review comments. I have made changes based on the review comments in fa9a802 and 8f71eaf; please let me know if I missed something.
* Add Javadoc * Clarify 2.6.0 upgrade instructions (#1818) * Clarify 2.6.0 upgrade instructions * Review comments * Update Javadoc * Fix broken hugo doc relrefs using a link formatting workaround: .../foo.md#bar --> .../foo/_index.md#bar (#1823) * update java url in Dockerfile * Added timeout and debugger to fix hanging issue on kind-new Jenkins jenkins-ignore (#1825) * Mirror introspector log to a rotating file in 'log home' (if configured) (#1827) * Mirror introspector log to a rotating file in 'log home' (if configured) * minor fix * remove comment * doc that logHome includes introspector out * minor doc update * minor doc update * OWLS-81928 - JUnit5: Convert ItManagedCoherence (testCreateCoherenceDomainInImageUsingWdt) test (#1822) * Converted ItManagedCoherence to use JUnit5 jenkins-ignore * Converted ItManagedCoherence to use JUnit5 jenkins-ignore * Removed unnecessary fields in model file jenkins-ignore * Test support: don't update status on replace if defined as subresource (#1830) * Test support: don't update status on replace if defined as subresource * Test support: don't update status on patch if defined as subresource * remove unused method * Support configurable model home (#1828) * initial change * work in progress * Change domain schema * Fix a typo * Minor fix * Unit test fix * Minor doc update * address a review comment * More changes * Minor change * update hashCode/toString/equals, and address review comments * Use kubectl exec to fix a test hanging issue (#1834) * use kubectl exec * cleanup * cleanup extra artifacts for prom and grafana (#1835) * Changes for OWLS-82011 to reflect introspector status in domain status (#1832) * Changes for OWLS-82011 to reflect introspector status in domain status * change method name * Code refactoring * cleanup debug message * Remove unused method * Added check to terminate fiber when intro job fails with BackoffLimitExceeded (which happens when intro job times out due to DeadlineExceeded). 
* implement review comment suggestions * added unit test for introspector pod phase failed * Added restPort, https tests, extra cleanup for prom and grafana (#1824) * added testcases for https and restport * changed norestport file * fixed dependencies * fixed style * switched to master * disable unittest build * fixed typo * fixed typo * added extra cleaning * added cleanup * addressed all review comments * checkstyle * switched to master branch * switched to new monexp release * added extra clean * added check if clusterrole or clusterrolebinding exists before delete * fixed typo * fixed typos * change change portnumbers to fix parallel run * styling * styling1 * added fix for paralell run * Update chart build * Update the Traefik Version to 2.x (#1814) * Initial check-in for traefik version update * Modified the doc and sample artifacts files * Foramt the document * More doc changes * Renamed setup.sh to setupLoadBalancer.sh to be more specific as per suggestion on OWLS-77806 * Addressed doc comments on PR review * More doc review comments * Minor hypelink name change * More doc changes * Fixed the broken links * Updated copyright inforrmation * Doc review comments * More doc changes * Minor typo * Modified the mii sample wraper scripts * Update Mii Sample script * Resync develop branch. 
Modified more scripts and yaml files for mii-sample * update MII sample generation/test instructions * Missing modification * Modified traefik ingress syntax * Modifoe md files from docs-source directory * More doc review comment * Fixed the indention issue in inline ingress file * More doc review resolution * Minor doc update * Move the istio istallatiion to RESULTS_ROOT directory Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> Co-authored-by: Tom Barnes <tom.barnes@oracle.com> * Add tests to verify loadbalancing with treafik lb for multiple domains (#1831) * Adding utilities for creating Traefik load balancer * Log the assertion errors in IT*.out files * Create ingress resource for traefik * Fix the traefik pod name * do kind repo check * uninstall traefik * Add application building * Deploy application using REST and access it through traefik LB * create secret for WebLogic domain * remove the testwebapp * Print JVMID from clusterview app * Remove the domain creation in PV * cleanup code * cleanup code * remove the domainUid from url * verify loadbalancing after creating 2 domains * cleanup javadocs * Add verification checks to determine host routing is done correctly * fix comments * Add a delay before accessing the app * Fix the curl command * iAddressed the review comments * Fix service name * fix namespace * add more wait time * Adding TLS test * Fix CN * Add https usecase * Encode it as normal String * Fix file names * include -k option to ignore host name verification * Add cert and key as String * wip * use same namespace for traefik * Added more tests * Fix the ingress rules creation command * Fix getNodePort * Access console in a loop * Adding voyager tests * Renamed testclass to be generic for all loadbalancers * order the tests * Add break statement once reponse is found * bind domain names * Adding Junit5 Operator upgrade tests (#1841) * adding operator upgrade tests * adding 
parameterized test * fix javadoc * adding individual tests * comment out 250 upgrade * change release name * adding order for testing * commenting cleanup to debug * commenting cleanup to debug * adding retry for scale * adding cleanup back * adding individual tests * change test names * check operator image name after upgrade * use 0 for external rest port * adding jira in comments * javadoc changes * exclude upgrade test in full test run * address review comments * Resolution to Jenkin log archive issue (#1845) Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> * Update the external client FAQ to discuss 'unknown host' handling, plus... (#1842) * Update the external client FAQ to discuss 'unknown host' handling. Update the domain-resource doc to try make it easier to read. * review and edit * delete extraneous file * minor doc tweaks Co-authored-by: Rosemary Marano <rosemary.marano@oracle.com> * Parameterize tests with different domain home source types (#1776) * parameterize domain type initial commit * cleanup * create domains in initAll * parameterize scale domain tests * cleanup * remove old ItPodsRestart and ItScaleMiiDomainNginx * change spec.ImagePullSecrets for domainInPV * verify pv exists * mv KIND_REPO check before creating domain-in-pv domain * debug wldf on jenkins run * debug wldf on jenkins run 2 * set wldf alarm type to manual * add more wldf debug info * try scale with wldf in different test methods * add more debug info for wldf * debug wldf issue * enable mii domain * add more debug info in scalingAction.sh * add more wait time for debugging * add more debug info in scalingAction.sh * add longer wait time for debug * add domainNameSpace in clusterrolebinding name * enable all tests * debug domaininpv app access in parallel run * add more debug info for domaininpv parallel run * enable verbose for curl command * update JAVA_URL in Dockerfile * address proxy client hanging issue * use 
httpclient to access sample app * debug domaininpv 404 issue in parallel run * consolidate test classes * address review comments * make admin server routing optional * address Vanaja's review comments * clean up * address Marina's review comments * Fix build * Update Apache doc for helm 3.x (#1847) Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> * Revert backofflimit check (#1852) * revert backoff limit check * ignore unit test for backoff limit check * Make sure domainUid is in operator log messages - part 1 (#1844) * Add domainUID to watcher related log messages * Minor changes * Move a common method into KubernetesUtils * minor javadoc update * minor change * revert an unnecessary change * lookup and save internalOperatorkey, if already exists (#1846) * Add WDT & WIT download url options (#1851) * adding wdt/wit download url options * run upgrade tests in separate mvn command * run upgrade tests in separate mvn command * exclude parameterized domain test from parallel runs and include in sequential run * use ! 
to exclude a test
* remove mvn command which runs upgrade test
* Disable ItCoherenceTests jenkins-ignore (#1859)
* changed WDT release URL for latest release (#1843)
* JUnit5 Create Infrastructure for ELK Stack (#1839)
* JUnit5 Create Infrastructure for ELK stack jenkins-ignore
* JUnit5 Create Infrastructure for ELK stack jenkins-ignore
* Changes based on comments jenkins-ignore
* Minor change in installAndVerifyOperator jenkins-ignore
* Changes based on the comments jenkins-ignore
* Corrected a typo jenkins-ignore
* Load Balance doc update for SSL termination
* Updated the heading font
* Modified the hostname
* Missing quote
* Add domainUID to operator log messages in async call request/response code path (#1856)
* Add domainUID to watcher related log messages
* Minor changes
* Move a common method into KubernetesUtils
* minor javadoc update
* minor change
* revert an unnecessary change
* Add domainUid to call requestParams
* Work in progress
* Work in progress
* Work in progress
* merge
* cleanup
* Handle two exception log messages
* fix operator external REST http port conflicts in integration tests (#1858)
* debug install operator regression
* set externalRestHttpsPort to 0
* re-enable lookup method in operator templates
* fix NullPointerException in isPodReady method (#1862)
* Changes for OWLS-83431 (#1855)
* changes for OWLS-83431
* changes for owls-83431
* Changes to address PR review comments
* changes to suppress error from synchronous call
* cleanup changes based on PR comments
* changes to fix javadoc and variable name
* changes for latest review comments
* Namespace management enhancements (#1860)
* Work in progress
* Work in progress
* Work in progress
* List, Dedicated working
* Update chart build
* Use enableClusterRoleBinding
* Use lookup
* Preserve debugging
* Complete label selector
* Correct typos
* Debugging
* More working
* Debugging more complicated label selectors
* Update chart build
* Update chart build
* Begin updating samples
*
Documentation work
* Update charts
* Complete doc. updates
* Additional unit tests
* Add additional mementos
* Test code to diagnose build failure on Jenkins
* More debug code
* More debug code
* Hopefully fixed unit tests
* Review comment
* Review comments
* Review comments
* ItPodTemplates test conversion to Junit5 (#1850)
* added tests for pod templates
* fixed typo
* fixed styling
* fixed styling1
* removed junit4 test
* removed t3port
* fixed domainhome dir
* fixed domainhome dir1
* revert to 5ca7d8a704fdfc0b5395b80e327a520b97b33a6e
* addressed the review comments
* put back Lenny's commit
* added more comments
* addressed comments, move to use default wdt image
* removed mii test
* Junit5: Convert two ELK Stack test cases (testLogLevelSearch and testOperatorLogSearch) (#1848)
* JUnit5 Create Infrastructure for ELK stack jenkins-ignore
* JUnit5 Create Infrastructure for ELK stack jenkins-ignore
* Changes based on comments jenkins-ignore
* Minor change in installAndVerifyOperator jenkins-ignore
* Converted two ELK Stack test cases to use JUnit5 jenkins-ignore
* Converted two ELK Stack test cases to use JUnit5 jenkins-ignore
* Changes based on the comments jenkins-ignore
* Added detailed test steps jenkins-ignore
* Upgraded ELK Stack to version 7.8.1 and delete old test suites jenkins-ignore
* Verify fields that cause servers to be restarted (#1866)
* First cut for pod restart
* keep the change in ItDomainInImageWdt
* move the test to ItPodsRestart
* revert some earlier change
* Move the field change verification right after patch domain
* remove the order
* minor change
* address the review comments
* minor change
* Debugcdttest (#1863)
* debug cdt test
* printing exec stdout and stderr before assertion
Co-authored-by: BHAVANI.RAVICHANDRAN@ORACLE.COM <bravicha@bravicha-1.subnet1ad1phx.devweblogicphx.oraclevcn.com>
* addressed review comments
* remove duplicate parameter srcstorepass in keytool command (#1867)
Co-authored-by: Johnny Shum <cbdream99@gmail.com>
*
Kubernetes Exec API intermittently hangs when executing curl command for ItMiiDomain test (#1871)
* change to reproduce hang
* additional debug to Kubernetes.exec
* enable verbose of curl command
* read stderr
* log exec result
* add max-time to curl command
* fix curl command option
* enable detailed trace
* Add --max-time flag
* Check exit code 7 and sleep 10 seconds
* More debug info when calling checkAppIsRunning
* Add thread info to log messages
* checkstyle fixes to debug info
* Add debug for retry
* Update thread info for log messages
* additional debug statements along exec call path
* Fix NPE
* Try programmatic thread dump to debug exec
* Debug AppIsRunning awaitility
* Read error stream in separate thread
* Remove some debug and join without timeout
* revert ItMiiDomain.java
* Revert TestActions.java
* Remove debug info
* modify test to use appAccessibleInPod method
* refactor stream readers
* Addressed new doc review comments
* Owls 83534 - Changes to allow setting nodeAffinity and nodeSelector values in operator Helm chart (#1869)
* changes to allow setting nodeAffinity and nodeSelector values in operator Helm chart and related doc change
* minor doc change
* changes based on review comments
* Make tests fail fast when initialization fails (#1874)
* check initialization success
* modified log message
* modified log message
* log rotation enhancements and doc (#1872)
* Add log rotation for NM .out/.log, enhance log rotation for introspector .out, and document log rotation for WLS .out and .log.
* minor doc edits
* Update charts
* Owls 83538 (#1876)
* Eliminate http call from Watcher tests
* cleanup, remove unit test thread dependencies
* Backout change to chart
* minor cleanup
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* Adding assertions in upgrade test (#1880)
* adding missing assert
* adding assert for uninstall operator
* Add a test to change WebLogic credentials (#1853)
* Adding test to change the admin credentials for domain running in PV
* Fix the test name
* fix secret name
* add restartVersion instead of replace
* fix json
* change the method order
* fix managed server names
* Fix assertionerror class for invalid login
* Fix comments
* Address review comments
* use default channel port for accessing application
* Lookup domain runtime only if it is admin server
* change max-message-size to a large value change the t3channel port to some arbitrary number
* fix the max-message-size
* log response from managed servers
* increasing the max iterations
* Fix the JAVA_OPTIONS
* Fix the java options
* Add debug flags to servers
* remove the system property maxmessagesize
* Change the implementation of cluster communication check
* fix iterations
* fix replicaCount
* Adding more debug messages
* Fix server names
* Refactor the server health checks
* fix comment
* Fix the server count
* Enable cmo.setResolveDNSName(true) for custom nap
* log dns resolv.conf file as well
* Add more debug flags
* Add random objects to JNDI tree
* Fix dns logging
* Use MBean server connection instead of heartbeats to detect server health
* Fix the curl request url
* Refactored the server communication verification by MBeanServerConnection to the individual servers instead of relying on cluster heartbeats
* Fix the URL
* Remove DNS entries logging
* Undoing the changes for ItLoadbalancer.java
* Checking if the server is running
* Deleted ItLoadbalancer.java
* Deploy application before accessing it
* PR to add sample running Oracle WLS Kubernetes Operator on Azure
Kubernetes Service. Thanks to Johnny Shum, Ryan Eberhard and Monica Ricelli.
Merge from branch created for https://github.com/oracle/weblogic-kubernetes-operator/pull/1804
Update _index.md
On branch edburns-msft-180-01-wls-aks
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Version numbers in prerequisites.
- Additional "Successful output looks like" blocks.
- When a code block defines an env var, export it.
- Before running the script to create the yaml, rm -rf ~/azure.
modified: kubernetes/samples/scripts/create-kuberetes-secrets/create-azure-storage-credentials-secret.sh
modified: kubernetes/samples/scripts/create-kuberetes-secrets/create-docker-credentials-secret.sh
- chmod ugo+x
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks-inputs.yaml
- Readability.
- Move the "prefix" stuff to the "must change" section.
On branch edburns-msft-180-01-wls-aks
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Working toward 1163875 Apply disambiguation prefix on additional items.
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
- Correct spelling error in comment.
Task 1163875: Apply disambiguation prefix on additional items
Changes after reviewing commit 705ab338ab4647c2af963ed58b74872d4fb1de6b with Ed.
Fix check points and check length of namePrefix.
Create validate.sh to validate resources before creating domain manually.
Typos
On branch edburns-msft-180-01-wls-aks
typos
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
On branch edburns-msft-180-01-wls-aks
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Spelling.
- Additional validation: kubectl logs -f.
- Mention health checks.
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
- Make it so the script can be run from an absolute path.
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/validate.sh
- chmod ugo+x
Modified in kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks-inputs.yaml
Update _index.md and all related samples script and yaml files to remove all mention of Docker Hub
Modified in kubernetes/samples/scripts/create-kuberetes-secrets/create-docker-credentials-secret.sh
Update dockerServer=container-registry.oracle.com
On branch edburns-msft-180-01-wls-aks
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Fix link to GET IMAGES.
- Fix lower case l.
- Update heading.
- Correct wording.
- Give hint about ImagePullBackoff.
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks-inputs.yaml
- Adjust comments to make it clear that it's Oracle SSO credentials.
modified: kubernetes/samples/scripts/create-weblogic-domain/domain-home-on-pv/create-domain.sh
- Increased retries to 30.
On branch edburns-msft-180-01-wls-aks
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/azure-file-pv-template.yaml
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/azure-file-pvc-template.yaml
- Changes suggested by Johnny Shum to get past the cluster distribution problem.
Revert "On branch edburns-msft-180-01-wls-aks"
This reverts commit b52b466ab5b8eb2a7493e829e125be876dc516a1.
Name pv/pvc, file share with unique name.
Add testwebapp.war for testing.
Modified in kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
Change file share name with "prefix-weblogic-time"
Change pv, pvc name with "prefix-azurefile-time"
Output status during waiting for job completed.
Modified in docs-source/content/samples/simple/azure-kubernetes-service/_index.md
Update text with pv/pvc, file share unique name.
Modified in kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks-inputs.yaml
Change name structure of pvc and file share.
Modified in kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/validate.sh
Fix validate.sh with pv/pvc, file share unique name.
On branch edburns-msft-180-02-wls-aks
forward slashes only.
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
On branch edburns-msft-180-02-wls-aks
Verified manual execution of steps works on Oracle Enterprise Java subscription.
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/azure-file-pv-template.yaml
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/azure-file-pvc-template.yaml
- increase capacity to 10Gi.
- Set on pv: ``` persistentVolumeReclaimPolicy: Retain ```
- Remove nobrl.
- Set on pvc:
+ selector:
+ matchLabels:
+ usage: %PERSISTENT_VOLUME_CLAIM_NAME%
On branch edburns-msft-180-02-wls-aks
In table for automation, update description for docker related parameters.
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
* Update _index.md
Address comments from @rosemarymarano.
* Update README.md
Address @rosemarymarano comment.
* Update _index.md
* On branch edburns-msft-180-02-wls-aks
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Copyedits.
- Remove ClusterRoleBinding
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
- Remove ClusterRoleBinding
* On branch edburns-msft-180-02-wls-aks
deleted: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/testwebapp.war
- "security policy doesn't let us merge changes with non-image binary files."
- This deleted file has the same checksum as `kubernetes/samples/charts/application/testwebapp.war` so let's just use that.
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
- Use `kubernetes/samples/charts/application/testwebapp.war`
* Change default VMSize and node number, as Standard_D4s_v3 and 3 node exceed quota on free azure account.
Modified on docs-source/content/samples/simple/azure-kubernetes-service/_index.md
Change VM size to Standard_D4s_v3 and node number to 2 in document.
Modified in kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks-inputs.yaml
Change default value of VM size to Standard_D4s_v3 and node number to 2.
Tested in Oracle Enterprise Java and a free azure account.
* On branch edburns-msft-180-02-wls-aks
Add Clean Up section.
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
* On branch edburns-msft-180-02-wls-aks
Apply changes suggested by @mriccell.
modified: docs-source/content/samples/simple/azure-kubernetes-service/_index.md
modified: kubernetes/samples/scripts/create-weblogic-domain-on-azure-kubernetes-service/create-domain-on-aks.sh
* Consolidate multiple Mii test classes to a single ItClass (#1875)
* Consolidate Mii Domains and remove junit4 tests
* removed more Junit4 Mii tests
* Modify the logic to check SystemResources
* Updated the initAll() with installAndVerifyOperator replacing the old code
* Modify the assertion for delete resources
* Adding upgrade test for 3.0.1 to latest(develop) (#1882)
* adding missing assert
* adding assert for uninstall operator
* adding upgrade test from 3.0.1
* convert testTwoDomainsManagedByOneOperatorSharingPV in ItOperatorTwoDomains.java to Junit5 (#1849)
* convert test from junit4 to junit5
* change deleteJob to use GenericKubernetes API
* remove ItDomainInPV
* remove junit4 test ItOperatorTwoDomains.java
* add loadbalancer tests
* revert lookup in _operator-secret.tpl
* remove ItLoadBalancer.java ItVouagerSample.java
* re-enable lookup method in operator templates
* add more debug info
* get the clusterview from credential-change-pv-domain branch
* collect logs for default namespace
* fix intermittent issue in ClusterView App
* add retry in admin console login
* use unique domainuid
* get new clusterview app
* use non-default namespace
* fix ItSimpleValidation PV hostPath
* remove ItSimpleValidation
* get the latest clusterview app
* Retry docker login, image push/pull from/to repos (#1885)
* Using standard retry logic to retry the build
* Use retry for docker login and image pull and push
* increase timeout to 30 minutes
* Retry the push
* wip
* use iterator to push
* check SKIP_BASIC_IMAGE_BUILD is set before pushing it to repo
* Bring back the test jenkins-ignore (#1883)
* add NGINX for production ICs (#1878)
* Verify shutdown rules when shutdown options are defined at different domain levels (#1870)
* test for ItPodsShutdown
* remote JUnit4 ItPodsShutdown
*
address the review comment
* refactor the code
* add javadoc
* rename the test class name as ItPodsShutdownOption
* Modified py scripts
* correct the errors (#1887)
* Adding operator restart use cases from Junit4 tests (#1884)
* adding operator pod restart tests from Junit4
* renaming file
* code refactor
* code refactor
* log exception
* code refactor
* delete Junit4 test class
* fix refactored method
* deleting Junit4 test classes which are converted
* addressing review comments
* fix logic to wait for rolling restart to start in existing test
* address review comments
* fix log message
* Created infra for WLS Logging Exporter and converted related tests to use JUnit5 (#1877)
* Created infra for WLS Logging Exporter and converted related tests to use JUnit5 jenkins-ignore
* upgraded ELK Stack back to v7 and fixed a filter issue in RESTful API to query Elasticsearch repos jenkins-ignore
* Changes based on comments jenkins-ignore
* More doc review change
* Converted Usability Tests to Junit5 (#1888)
* added usability tests
* fixed some typos
* added cleanup
* fixed update
* fixed upgrade test
* cherrypick fix from Maggie
* removed old class
* merge from develop
* merge
* addressed review comments
* addressed more comments
* addressed comments from Pani
* addressed comments from Pani1
* styling
* corrected java doc
* corrected java doc with typo
* create test infrastructure for Apache load balancer (#1891)
* initial commit
* apache lb
* add apache tests to ItTwoDomainsLoadBalancers
* fix apache-webtier chart imagePullSecret
* get the ItPodsShutdownOption from shutdown3 branch
* change imagePullPolicy to Always
* push the apache image to Kind repo
* add debug info for REPO_REGISTRY etc in Kind new
* remove pull image from ocir and push to kind repo
* enable pull apache image from ocir and push to kind new
* cleanup
* move OCR login to test
* address Vanaja's comments
* address Pani's comments
* address Vanaja's comments
Co-authored-by: vanajamukkara
<vanajakshi.mukkara@oracle.com>
* add debug info in scaling cluster with WLDF methods (#1893)
* add debug info in scaling cluster with WLDF methods
* address Vanaja's comments
* owls-83918 an idle domain's resource version should stay unchanged (#1879)
* In recheck code path only continue if spec changes
* Minor change
* only add progressing condition when something is really happening to the domain
* cleanup
* populate state and health when needed
* Work in progress
* debug
* debug
* more debugging
* remove debugging
* remove debugging
* cleanup
* Fix patchPod handling and change HashMap to ConcurrentHashMap
* Add ProgressingStep on scaling down
* Address review comments
* Review comment
* sample changes needed to work in openshift with default scc
* Adding WLDF and JMS system override tests (#1896)
* Adding situational config overrides tests for JMS and WLDF resources
* Remove clusterview app and DB
* Fix file name
* Add sitconfig application
* fix url
* wip
* Wait until response code is 200
* fix response string
* change wait time to 5 min
* Improve comments and javadoc
* Removing old tests
* Use Kubernetes Java client 9.0.2 (#1898)
* Uptake k8s Java client 9.0.1
* Review corrections
* Use new version of GenericKubernetesApi
* Changes for OWLS-83136 - Limit concurrent pod shutdowns during a cluster shrink (#1892)
* changes for OWLS-83136 - Limit concurrent pod shutdowns during a cluster shrink
* Minor code cleanup for OWLS-83136
* minor change to avoid duplicate step
* fixed javadoc for deletePodAsyncWithRetryStrategy method
* fix for integration test and added maxConcurrentShutdown in index.html
* fix unit test failure and shutdown servers concurrently when domain serverStartPolicy is NEVER
* Address PR review comments.
* Changes to address PR review comments.
* Changes to address PR review comments.
* Changes to address PR review comments
* Resolve merge conflict
* Enable unit tests
* Use Dongbo's change in unit test
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* Added sample support for Nginx Load Balancer (#1886)
* Initial check-in
* Updated doc with SSL termination
* Review comments on nginx/README.md
* Review comments
* More review comments
* Removed the Path Routing Section
* More doc review change
* Minor doc modification
* More review comments
* Remove path routing yaml file
* Update README.md
* Update README.md
* Update setupLoadBalancer.sh
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-1.subnet1ad2phx.devweblogicphx.oraclevcn.com>
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* New Tests for ServerStartPolicy (#1895)
* Added new tests for ServerStartPolicy
* Minor typo modification
* Addressed review comments
* Added logic to check the managed server timestamp
* Resolved few typos
* Resolved more review comments, removed duplicate codes, used common utility methods
* More review comments
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com>
* fix build application not to assert exec false exit value (#1904)
* fix build application not to assert exec false exit value
* Asserting that file is available after build
Co-authored-by: BHAVANI.RAVICHANDRAN@ORACLE.COM <bravicha@bravicha-1.subnet1ad1phx.devweblogicphx.oraclevcn.com>
Co-authored-by: sankar <sankarpn@gmail.com>
* temporarily adding chown option to imagetool command (#1905)
* Adding test to use PV for logHome in MII domain (#1903)
* logs on PV
* use PV for logs
* Revert to develop branch code
* fix pv path
* code refactor
* fixing path
* fix comments
* look for string RUNNING in server log
* modify success/failure criteria
* fix indentation
* adding pipefail
* ItInitContainers conversion to Junit5 (#1907)
* added testcase for initcontainers
* added testcase for initcontainers1
* fixed init check
*
addressed review comments
* addressed the comments
* sync to develop branch
* Added annotation to remove client header on Traefik Ingress
* add NGINX path routing doc
* List continuation and watch bookmark support (#1881)
* Allow watch bookmarks
* Work in progress
* Work in progress
* Work in progress
* Work in progress
* AsyncRequestStep tests
* Fix tests
* Bookmark tests
* Clarify names
* Work in progress
* Better support for namespace lists spanning REST calls
* Revert changes to charts
* Bug fixes
* Disassociate CRD creation from namespace startup
* Make unit-test more generic
* Clarify continue pattern
* Save continue for reinvoke of async request
* Correct method name
* Add security disclaimer statements
* fixing the mii domain and sample test after images have been updated (#1906)
* fixing the mii domain and sample test after images have been updated
* Modified group name used to copy file to server pod jenkins-ignore
* fix jrf test in mii samples not to change gid to root
* adding the fix I had removed by mistake
* update after Tom's review comments
Co-authored-by: BHAVANI.RAVICHANDRAN@ORACLE.COM <bravicha@bravicha-1.subnet1ad1phx.devweblogicphx.oraclevcn.com>
Co-authored-by: huiling_zhao <huiling.zhao@oracle.com>
* Added doc to eliminate client proxy header
* Added nginx link
* added nginx ref
* update doc with review comments
* Change REST API query to handle hyphen, WebLogic Logging Exporter (#1897)
* Change REST API query to handle hyphen, WebLogic Logging Exporter jenkins-ignore
* Removed some comments jenkins-ignore
* Syncup with latest develop branch jenkins-ignore
* Merge with develop
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* OWLS-84517: Scaling failed when setting Dedicated to true (#1921)
* Rest authentication of requests should use namespace scope for Dedicated selection strategy
* remove System.out.println from unit test
* Retry failure fix (#1854)
* update from develop
* update from develop
* change log message
* Initial check
in for JRF Fatal error fix
* remove test for now
* Fix logic
* add info for create jrf domain and remove obsolete constants
* add comment
* relax retry and update comments
* change comments
* change info text
* minor text change
* Add retry count to domain status and support retryCountMax
* increment retry count only if there is an error message
* doc update
* use existing error for increment
* Log message change
* rename retryCount to introspectJobFailureRetryCount
* Add logging for retry counter
* Fix NPE in unit test
* Minor refactoring
* refactor code
* update description of field
* changing description texts
* missed files
* change "MII Fatal Error" to "FatalIntrospectorError"
* default failure retry count should be 0
* change comparing failure retry count gte
* rename introspectJobFailureRetryCount field, reset counter if successful, and log before retry.
* Internationalize message and move logging to start of retry
* update message
* update message text
Co-authored-by: Johnny Shum <cbdream99@gmail.com>
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* Configure JMS Server logs on PV and verify (#1924)
* configure jms server logs on PV and verify
* undo jms logs on pv
* fixing rolling restart assertions
* OWLS-80038 and OWLS-80090: fix mountPath validation and token substitution (#1911)
* Skip volume mount path validation if it contains valid tokens
* Check admin serverPod's additional volume mount paths too in domain validation
* Check cluster serverPod's additional mount paths too in domain validation
* check mountpath validation after token substitution is performed
* cleanup
* fine tuning
* minor cleanup
* In progress
* WIP
* minor change
* Modify a test name
* Pod shutdown tests porting (#1914)
* Fix the shutdown options
* Adding shutdown option tests
* Fix log location
* fix shutdown object assignment
* Add ind ms
* fix start policy
* wip
* wip
* wip
* wip
* wip
* Add tests without env override
* add debug messages
* refactor code
* check for
in ms 2
* Fix pod name
* fix pod name
* Cleaned up comments and updated javadocs
* Removed JUnit4 test class
* Restore updatedomainconfig it class
* fix the yaml formatting
* Remove left over files
* update comments
* fix array size
* address review comments from Vanaja
* remove throws clause
* fix javadoc
* replace StepTest with FiberTest, remove unused Fiber code (#1926)
* Correct Ingress documentation (#1923)
* remove outdated sample doc references to weblogicImagePullSecretName (use imagePullSecretName) (#1920)
* Correct enableClusterRoleBinding
* Removal of Junit4 Integration tests (#1934)
* Removal of old Junit4 test
* Remove reference to junit4 integration test from pom.xml
* Remove ref to junit4 test from buildtime-reports
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-1.subnet1ad2phx.devweblogicphx.oraclevcn.com>
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com>
* OWLS-84141 fix operator upgrade issues related to domainNamespaceSelectionStrategy (#1930)
* Fix operator helm upgrade issues related to domainnamespaceSelectionStrategy
* minor fixes
* minor change
* make helm chart behavior match what the operator runtime has
* Owls83813 fix a scaling issue when upgrade from 2.5.0 (#1933)
* comment out workaround for upgrade from 2.5.0
* Fix NPE and remove workaround in test
* cleanup
* minor modification
* MII jrf - improve wallet password handling (#1919)
* Improve opss wallet password handling.
* miijrf-remove-pwd-echo: add comments, tracing, doc fixes, and two fixes.
* fix comment
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* Mob refactoring (#1929)
* Expand unit test coverage
* startPodWatcher
* more watcher methods
* ConfigMapAfter refactor
* WIP
* finish start watcher methods
* resolve merge conflicts
* DomainPresenceInfos and DefaultResponseStep changes
* isolated domain presence info map
* refactoring changes
* PodListStep refactor
* generify
* refactoring
* move common methods up
* refactoring and bug fix
* Add namespace to DomainPresenceInfos
* refactoring WIP
* fix checkstyle errors
* refactoring changes
* start refactor to biconsumer
* continue processList refactor
* continue refactoring
* refactor PodListStep
* implementing NamespaceProcessor
* refactor readExistingResources to eliminate duplication
* Add refactorings from previous chunked list attempt
* add files not previously added
Co-authored-by: Lenny Phan <lenny.phan@oracle.com>
Co-authored-by: Dongbo Xiao <dongbo.xiao@oracle.com>
Co-authored-by: ANIL.KEDIA@ORACLE.COM <anil.kedia@oracle.com>
* Rename new-integration-test directory (#1935)
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com>
* wait till admin pod has restarted instead of pod deleted (#1938)
* check existence of sa in helm templates (#1939)
* Kubernetes Java Client 10.0.0 (#1937)
* Kubernetes Java Client 10.0.0
* Rebuild charts
* Update charts
* Revert "check existence of sa in helm templates (#1939)"
This reverts commit fcfd8855fc50759d3cc780c51757df424b27c235.
* Revert "Kubernetes Java Client 10.0.0 (#1937)"
This reverts commit 8c9d57208938e055fa6eb79156fb6f590292fa6f.
* Supersedes #1900 (#1940)
* Fix text parsing issue
modified: utility.sh
Fix text parsing error on Ubuntu 18.04.5 LTS
modified: validate.sh
Fix yaml parsing error on MacOS.
Wait for resource ready on MacOS and remove code for helm less than 3.0
Add text for domain status troubleshooting
Use tag that includes AKS docs.
Per Reza, inline OCR authentication
Update _index.md
Update create-domain-on-aks-inputs.yaml
Update _index.md
Update _index.md
Update _index.md
Update _index.md
Update _index.md
Update _index.md
Remove UNIX
Update _index.md
Use AKS addon name
Use alias as Rosemary suggested in our last PR.
* Update _index.md
* Update _index.md
Co-authored-by: Galia <haixia.cheng@mircosoft.com>
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com>
* Model and application archives in persistent volume (#1936)
* Adding tests for model in pv for MII domain
* add docker login
* use oracle image instead of wls image
* wip
* fix wdt model file location
* fix the model home to point to a directory rather than a file
* add an application archive
* fix model file name
* wip
* create different directories for application and model file
* wip
* verify servers health
* fix checkstyle violations
* add admin-server in app target
* fix user names
* moved the tests to integration-tests directory
* address Pani's comments
* use CommonMiiTestUtils.createDomainResource
* add pv to server pod
* fix javadoc
* add public doc in test method javadoc
* Mii doc: update runtime update doc (#1942)
* Improve opss wallet password handling.
* miijrf-remove-pwd-echo: add comments, tracing, doc fixes, and two fixes.
* fix comment
* Update MII runtime update doc, including documenting embedded LDAP and credential mapping runtime updates as unsupported.
* review and edit
Co-authored-by: Rosemary Marano <rosemary.marano@oracle.com>
* Owls 84594 (#1941)
* Test for reproducing bug owls-84594
* clean up cluster comparison
* Fix checkstyle issues
* Verify logs from ms1
* fix the expected ignore sessions attribute for ms1
* Simplify unit tests
Co-authored-by: sankar <sankarpn@gmail.com>
* Integration test for secure nodeport service (#1931)
* Initial check-in
* Review comments
* Sync up develop branch
* Review comments resolution
* Add check for the availability of external service
* Resolve more review comments
* Consolidate the test domain and rename the class
* Fixed few typos
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com>
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-1.subnet1ad2phx.devweblogicphx.oraclevcn.com>
* document altering WebLogic Server CLASSPATH and PRE_CLASSPATH (#1948)
* document altering WebLogic Server CLASSPATH and PRE_CLASSPATH
* minor doc edit
* Modified doc to remove client headers
* Updated doc/utility to download custom version of Load Balancer release
* fix broken link (#1952)
* added synchronized to execCommand to fix intermittent failures (#1947)
* change domain name to be different than the one used in other IT tests (#1943)
* fixed cleanup order to uninstall operator before cleaning sa to fix intermittent failures (#1946)
* fixed cleanup order to uninstall operator before cleaning sa
* replaced hardcoded secret name with var
* corrected secret name
* style
* Add TLS and Path Routing Tests for Nginx, Voyager and Traefik (#1910)
* add NGINX tls and two domains tests
* add tls ingress for Voyager
* add path routing for three lbs
* fix nginx service name in other tests
* cleanup
* use NGINX chart version 2.16.0
* uninstall nginx first in AfterAll
* add stable repo for Prometheus
* address Pani's comments
* Added Upgrade Test from Release 3.0.2 (#1956)
Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM
<anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> * Support for persistentVolumeClaimName, support for both non privilege… (#1916) * Support for persistentVolumeClaimName, support for both non privileged port 8080 to listen and default priviledged port, imagePullSecrets and readme updates * Incorporated doc review comments * Owls-84294 handle missing sa in operator install/upgrade (#1957) * check existence of sa in helm templates * minor product change plus test changes * minor changes * more changes * more doc change * minor doc change * fine tune doc text * add error messagewq * minor doc edit * enable disabled test cases (#1949) * Fix missing ODL configuration that may presents in the model (#1950) * Fix missing ODL configuration that may presents in the model * refine archive for fmw logging.xml Co-authored-by: Johnny Shum <cbdream99@gmail.com> * Unit test and fix detection of stranded namespaces (#1953) * Unit test and fix detection of stranded namespaces * Some code simplification * put some code back where it started from * Correct method name * Cache compiled pattern, use explicit constant for call limit. 
* Fix retry regression (#1954) * Fix retry regression * simplify reset failurecount in podstepcontext Co-authored-by: Johnny Shum <cbdream99@gmail.com> * Develop owls 84334 (#1902) * support domain's secure mode * Use the domain's administration port if the server's admin port is 0 * - Fix the NPE when the WLS pod's listen port passed to Prometheus annotations is null - Derive the defaults for a few other MBean attributes that depend on the domain secure mode * Fix error in getNAPProtocol for ServerTemplate * Adding test to verify image update for WLS domain (#1959) * first cut for image update * minor change * refactor the code after syncing to the latest develop branch * edit the log message * address the review comments * remove the extra line * disable ItTwoDomainsLoadBalancers.testApacheLoadBalancingCustomSample (#1962) * Adding flexibility to integration tests to pull the base images from OCIR or OCR and more (#1951) * use secret based on base images repo * fix secrets * fix secrets * fix more secrets * more secret fixes * fixing image name * fix compilation errors * fix condition for exec * fix merge conflict * fix log messages in ItServerStartPolicy * some more refactoring * refactoring -move useSecret and deriving base image name logic to TestConstants * fix checkstyle * fix domain images repo for multi node cluster * set image pull secret always * fix domain images repo * fix indentation * change REPO env var to OCIR * fixed grafana install (#1965) * Add a testcase for -wdtModelHome option to the imagetool (#1961) * add wdtModelHome parameter * Adding testcase for custom wdt model home * fix model home * fix wdtmodelhome location * remove @Test annotation * log domain uid and image * fix image name * use wls pod for pv manipulation * change pv name * wip * change pv permission to oracle:root * use variable to store location model home * wip * add modelfile to the image * supply modelfile in the image building process * fix image push * fix comments and 
javadocs * fix log message * fix image check * list images * fix image name * address review comments * add the pod exec and copy commands to common util file * use file util from FileUtils * fix default OCR image names (#1968) * Add Tests for DataHome Override in Domain Custom Resource (#1964) * add tests for dataHome override * address Pani's comments * re-enable ItTwoDomainsLoadBalancers.testApacheLoadBalancingCustomSample (#1969) * initial commit for apache custom sample update * re-enable ItTwoDomainsLoadBalancers.testApacheLoadBalancingCustomSample * address Vanaja's comments * Add ClientFactoryStub memento to watcher tests (#1967) * OWLS-84786 - Use Kubernetes Java Client 10.0.0 (#1972) * Second attempt at using the Kubernetes Java Client 10.0.0 This reverts commit 9a33d3bf4f69f48470205a980322eedf867105fe. * Diagnostics * Remove duplicated dependencies * More diagnostics * Update charts * changes for OWLS-84786 * removed diagnostics messages and code cleanup for OWLS-84786 * minor cleanup * Back out changes to docs/charts dir. 
Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com> * Added integration test cases for Dedicated namespace scenarios (#1913) * Added integration test cases for Dedicated namespace scenarios jenkins-ignore * Syncup with JIRA OWLS-84517 and changes based on comments jenkins-ignore * Delete index.yaml * Remove binary files that are wrongly checked in jenkins-ignore * Changes based on the comments jenkins-ignore * Syncup with develop branch jenkins-ignore * Changed javadoc jenkins-ignore * Syncup w develop and changes based on the comments jenkins-ignore * Added creating CRD jenkins-ignore * Syncup with develop branch jenkins-ignore * Syncup with develop branch jenkins-ignore * Added this test suite to sequential run only jenkins-ignore * Fixed an error in kindtest.sh jenkins-ignore * OWLS-84562 - added tests for Namespace management enhancements (#1955) * added tests * updated test loc * more tests * fixed default secrests management * fixed test logic * corrected java docs * fixed domainns * fixed secret creation * fixed secret dependencies * fixed default domain crd dependencies * fixed check pod creation * style * added rbac test * added rbac test, corrected ns label * addressed comments from review * addressed comments from review1 * addressed comments from review2 * addressed comments from review3 * addressed more comments * style * removed commented out methods * addressed review comments5 * correct image name construction of DB and FMW (#1974) * Moving SOA deployment samples to a different repo (#1973) * removed -t flag for create and drop scripts * Deleted SOA deployment samples and updated README pointing to SOA external repo * rephrased soa doc reference * Reduce job delete timeout value. 
(#1978) * add retry when scaling cluster with WLDF (#1986) * add retry when scale cluster with WLDF * cleanup * Added automation to test StickySession using latest Traefik Version (#1976) * Added automation for StickySession using latest Traefik Version jenkins-ignore * Corrected a typo jenkins-ignore * Improve nightly stability for ItIntrospectVersion.testUpdateImageName (#1988) * improve nightly stability * address review comment * copy out scalingAction.log from admin pod when scaling cluster with WLDF (#1989) * copy scalingAction.sh from admin server pod * add retry when copyfile from pod * cleanup * cleanup * update javadoc * add copyFileFromPod to FileUtils * address Lenny's comments * OWLS-84881 better handle long domainUID (#1979) * add limits to generated resource names * work in progress * get server and cluster info from introspector's results * new configuration, validation, and unit tests * minor change * fix operator chart validation * fix chart validation * fix hard-coded external service name suffix in integration tests * fix more integration test failures * clean up * minor doc fixwq * Revert "minor doc fixwq" This reverts commit 56b5656e82acd01c5602fc0293312b749258a7c4. 
* minor doc fix * minor fix to test * clean up test constants * minor doc edits based on review comments * improve error msg and remove hard-coded suffixes from doc and samples * cleanup * change the legal checks based on Ryan's suggestion * add cluster size padding validation * minor changes to helm validation * only reserve spaces for cluster size wqsmaller than 100 * one more unit test case * minor changes * minor doc change * change a method name * Minor doc update on upgrading operator using helm upgrade (#1994) * helm upgrade to new operator image should be issue from same github repository branch * minor edits * Reduce jrfmii map size (#1980) * remove em.ear and sysman/log/EmSetup*.log backup_config to reduce configmap size * compress merged model before encryption to reduce size * update show modelscript * minor changes Co-authored-by: Johnny Shum <cbdream99@gmail.com> * use old default suffix when running older release (#1995) * cross domain transaction recovery (#1993) * test for crossdomaintransaction with TMAfterTLogBeforeCommitExit * update image name * update after Alex's review comment and develop * update after review comments * fix checkstyle violation Co-authored-by: BHAVANI.RAVICHANDRAN@ORACLE.COM <bravicha@bravicha-1.subnet1ad1phx.devweblogicphx.oraclevcn.com> * External JMS client with LoadBalancer Tunneling (#1975) * Added test for external JMS client * separate methods for http and https tunneling * Modified the test method name * Added ssl debug * Modify keytool command line args * Added SAN extension to the ssl cert * Resolved typo in openssl command * Modified K8S_NODEPORT_HOST to return IP Address * Review comments (a) modified method scope to private (b) usage of utility method to copy files from Pod * Addressed more review comments * Added more description and modified the assettion to check command line execution Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> * OWLS-84660 
document the new resource name Helm configurations and resource name limits (#1997) * add doc for length limits to resource names * more doc changes * more changes to the doc * fix reference links * changing some of the wording * minor fixes * move the main new section from domain-resource.md to its parent _index.md * minor changes * more edits * more doc edits * more edits * address review comments * minor change * address more review comments * minor edit * Refactoring: extract domain namespaces code from main (#1999) * Extract common domain namespace processing * Ensure script config map exists when namespace started * Address race condition in tuning parameters instantiation * Use function to get current namespace list * Remove obsolete code * Automate domain in pv samples (#1996) * Adding samples * wip * fix file paths * correct pv pvc name * wip * fix managed server namebase * delete pvc * change the pv reclaim policy to recycle * fix ms base name * fix domain namespace in service check * parameterize test to use wlst and wdt * fix test type * add javadoc * wip * delete domain and verify it is removed * change domain name * create credentials secret for each domain * add t3publicaddress in input file * wip * Fix javadocs and comments * delete pv and pvc and wait for it to terminate * wip * wip * address review comments * Add image secrets * Change the domain name to be unique (#2005) * Fix ENV variable setting in JRF domain in PV test class (#1998) * debug JAVA_HOME issue for fmw image * fix hard coded env var * debugging * debugging * refactor the code * delete the extra space line * address the review comment * address the review comment * address the review comment * synchronize startOracleDB method (#2006) * add tests for terminating SSL at LB to access console and servers (#1977) * add tests for terminating ssl at LB to access console and servers * address Pani's comments * fix traefik annotation * add header in traefik ingress rules * cleanup * remove 
if __name__=main clause in python script (#2004) * Add instructions for creating a custom Security Context Constraint (#2003) * update openshift security docs to include custom scc instructions Signed-off-by: Mark Nelson <mark.x.nelson@oracle.com> * update based on review comments Signed-off-by: Mark Nelson <mark.x.nelson@oracle.com> * updates after review with ryan Signed-off-by: Mark Nelson <mark.x.nelson@oracle.com> * updates after review with ryan Signed-off-by: Mark Nelson <mark.x.nelson@oracle.com> * OWLS 84741: Scaling failed on Jenkins when setting Dedicated to true & io.kubernetes.client.openapi.ApiException: Not Found (#1990) * Use REST client's access token for authentication and authorization * Enable testDedicatedModeSameNamespaceScale * Add patch permissions to rolebinding * Code cleanup * Changes from initial code review * Use TuningParameters to acccess property to control Operator's REST API authentiction and authorization implementation * Code review changes * documentation updates * document patch verb * documentation changes based on review * use code font for appropriate parameters * Owls 84815 (#2009) * Remove dependency of job processing on DomainNamespaces class * Add unit tests for Namespace watcher * work in progress * start converting main to use instance methods * Extract operator startup into instance method * Define K8s version in main delegate * refactoring: move fullRecheckFlag out of Namespaces * Refactoring: extract Namespaces class * test for ability to list domains when dedicated namespace strategy * Handle null value for watch tuning in unit tests * Correct chart * Update test dependencies and POM and change Dockerfile to use JDK 15 (#2008) * OWLS85461 add introspect version to server pod label (#2012) * initial changes to add introspect version to pod labels * work in progress * work in progress * cleanup * fix unit test failure in PodWatcherTest not related to this PR * minor changes * doc changes * add an example in the doc * 
doc edits to address review comments * address review comments * cleanup * add domainRestartVersion to the example * move the patch part into the existing patch step and remove log messages * minor doc edit * refactored a little * Correct overrideDistributionStrategy other places * Add correct the other misspelled field name Co-authored-by: Ryan Eberhard <ryan.eberhard@oracle.com> * OWLS 85530: OPERATOR INTROSPECTOR THROWS VALIDATION ERRORS FOR STATIC CLUSTER (#2014) * getDynamicServersOrNone doesn't throw exception with 14.1.1.0 * check for ServerTemplate for Dynamic Servers * Check if DynamicServers mbean exist * Add test to introspect configured cluster created by online WLST * Document configured cluster introspection test * documentation updates to README and referencing JIRA OWLS-85530 * modify istio installation script (#2015) Co-authored-by: ANTARYAMI.PANIGRAHI@ORACLE.COM <anpanigr@anpanigr-2.subnet1ad3phx.devweblogicphx.oraclevcn.com> * Added automations to verify domain in image samples using wlst and wdt (#2016) * Added DII sample tests jenkins-ignore * Changes based on the comments * Use spec.containers[].image instead of status.containerStatuses[].image (#2018) * Fix dedicated mode test (#2019) * Detect missing CRD during domain checks * refactoring: convert DomainNamespaces to use instance variables rather than statics * Avoid creating namespace watchers when using dedicate mode * remove obsolete fields and methods * Xc owls85579 (#2024) * fix intermittent error in Jenkins * assert not null for admin pod log * add retry to get admin server pod log * cleanup * set sinceSeconds when get admin server pod log * print out pod log * increase sinceSeconds when getting pod log * return previous terminated pod log * fix error * remove some debug flag * remove commented out lines * Owls85582 take all ALWAYS servers before considering rest of the servers when meet cluster replicas requirement (#2020) * add all Always servers before consider IfNeeded servers * 
add unit test cases for NEVER policy * clean up unit tests * minor changes to the unit tests * resort the final server startup list * Release note updates (#2026) * Release note updates * Review comments * Review comments * Detect and shut down stuck server pods (#2027) * Detect and shut down stuck server pods * Send 0 grace period seconds to force delete * Log message after deleting stuck pod * Ignore testUpdateImageName if the image tag is 14.1.1.0-11 (#2017) * abort the test if the image tag is 14.1.1.0-11 * minor change * implement the custom annotation @AssumeWebLogicImage * checkstyple * adding the review comments * Owls83995 - Sample scripts to shutdown and start a specific managed server/cluster/domain (#2002) * owls-83995 - Scripts to start/stop a managed server/cluster/domain * fix method comments * Minor changes * Address review comments and fix script comments/usages. * Added integration tests, made few doc changes based on review comments and minor fix. * Clarify script usage, updated README file and minor changes. * Changes to add script usage details * Address PR review comments * Review comment and cleanup. * Documentation changes based on PR review comments. * Fully qualified replica value as per review comments * edit docs * edit README * Address PR review comments * Changes to address PR review comments and removed ItLifecycleSampleScripts class by adding methods in ItSamples * fix indentation * fix comment and typo * Added validation as per review comment. * changes to address review comment and minor cleanup * PR review comment - changes to assume default policy is IF_NEEDED if policy is not set at domain level. * changes for new algorithm as documented on http://aseng-wiki.us.oracle.com/asengwiki/pages/viewpage.action?pageId=5280694898 * More changes for new algorithm. * code refactoring and minor doc update. * Minor change for dynamic server name validation * Changes to address review comments. * More review comment changes and cleanup. 
* Unset policy to start independent (stadalone) maanged server instead of ALWAYS. * Latest review comment changes. * More changes based on review comments. * Chnages for latest review comments. * Remove unused action variable and assignments. * Fix the logic to display error when config map not found and return policy without quotes. * Changes for latest review comments. * Changes for latest round of review comments. * use printError instead of echo * Changes to remove integration tests and doc review comments. Co-authored-by: Rosemary Marano <rosemary.marano@oracle.com> * Use oracle:root to support running in the OpenShift restrictive SCC (#2007) * Use oracle:root to support running in the OpenShift restrictive SCC * Fix issues found in test * Clean-up * Add SECURITY.md * JRF mii Domain test class/infra for the mii RCU functionality testing (#2011) * first cut for ItJrfMiiDomain * minor change * addressing the review comments, adding em console verification * use default -ext as service suffix * minor change * minor change * OWLS85912 - Integration tests for domain lifecycle scripts added as part of OWLS-83995. (#2032) * Added integration tests for domain lifecycle scripts. * minor changes. 
* fix K8s setup doc (#2028) * owls-85910 - display correct minimumReplicas status for dynamic clusters (#2035) * Additional release note updates (#2034) * Response from pod delete REST call can be Pod or Status (#2038) * Ignore response value from delete operations * Correct api group * Switch from custom objects * Remove unnecessary whitespace changes * Owls85476 attempt to fix intermittent integration test issues in nightlies (#2037) * switch to kubectl and improve the test code * change testAddSecondApp to use kubectl as well * more changes * check main thread done as well * minor update * minor change * Added anti affinity to the wls pods (#2033) * added nfs * added antiaffity * added syncronized * fixed typo * fixed typo1 * fixed domain crd for itpodtemplate * style * added storageclass to pv * added storageclass to pvc * added storageclass to pvc1 * style * fixed hostpath * removed fss related code * style * removed unneeded file * Add comment to liveness probe (#2041) * Add comme…
Add the maxConcurrentShutdown cluster configuration and the maxClusterConcurrentShutdown domain configuration:
The maximum number of managed servers that the operator will shut down in parallel for the cluster in response to a change in the cluster's replica count. If more managed servers need to be shut down, the operator waits until a managed server pod has terminated before shutting down the next managed server. The default value of maxConcurrentShutdown is 1. A value of 0 means all managed servers are shut down in parallel. If the replica count is set to 0, all managed servers are shut down in parallel. If the domain is deleted using the "kubectl delete domain" command, all managed servers are shut down in parallel.
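The throttling rule described above can be illustrated with a small, self-contained sketch. This is not the code in this PR (the actual change lives in the operator's server-down step); the class name `ShutdownThrottleSketch` and its methods are hypothetical, and the sketch assumes `maxConcurrentShutdown` is never negative. It only models the bookkeeping: start up to the limit, then start one more shutdown each time a pod is reported as terminated.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration of the shutdown limit; not the operator's implementation. */
final class ShutdownThrottleSketch {
  private final Deque<String> waiting;                    // servers not yet asked to stop
  private final Set<String> inProgress = new HashSet<>(); // servers currently stopping
  private final int limit;                                // max servers stopping at once

  ShutdownThrottleSketch(List<String> serversToStop, int replicas, int maxConcurrentShutdown) {
    this.waiting = new ArrayDeque<>(serversToStop);
    // A replica count of 0 or a limit of 0 means "shut everything down in parallel".
    boolean unlimited = replicas == 0 || maxConcurrentShutdown == 0;
    this.limit = unlimited ? serversToStop.size() : maxConcurrentShutdown;
  }

  /** Returns the servers whose shutdown may begin immediately. */
  List<String> start() {
    List<String> started = new ArrayList<>();
    while (inProgress.size() < limit && !waiting.isEmpty()) {
      String server = waiting.poll();
      inProgress.add(server);
      started.add(server);
    }
    return started;
  }

  /** Call when a pod has terminated; returns the next server to shut down, if any. */
  Optional<String> onPodTerminated(String server) {
    if (!inProgress.remove(server)) {
      return Optional.empty(); // unknown or already handled server: nothing to start
    }
    String next = waiting.poll();
    if (next != null) {
      inProgress.add(next);
    }
    return Optional.ofNullable(next);
  }

  public static void main(String[] args) {
    // Shrinking a cluster so that three servers must stop, with the default limit of 1.
    ShutdownThrottleSketch throttle =
        new ShutdownThrottleSketch(List.of("ms4", "ms3", "ms2"), 1, 1);
    System.out.println("started: " + throttle.start());
    System.out.println("next: " + throttle.onPodTerminated("ms4"));
  }
}
```

A sliding window is used here rather than fixed batches because the description says the operator waits for one pod to terminate before shutting down the next server, so a slot is refilled as soon as it frees up.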
Because of a bug in the Kubernetes Java client (kubernetes-client/java#86), the async request step that deletes a pod was timing out after 10 seconds. With the default retry strategy, the request was retried with an increased timeout of 20 seconds. The second attempt also timed out, and the third attempt failed with "404 - Not Found". A pod delete is considered successful when it fails with "404 - Not Found". This PR includes changes to specify a custom retry strategy for the async pod delete; with the custom strategy, the async delete pod step is not retried.
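To make the difference between the two retry behaviors concrete, here is a minimal sketch that does not use the Kubernetes client or the operator's step classes. Everything in it is hypothetical: the `RetryPolicy` interface, the `TIMED_OUT` marker, and the three-attempt cap in `RETRY_ON_TIMEOUT` are assumptions chosen only to mirror the sequence described above (timeout, timeout, then 404).

```java
import java.util.function.IntSupplier;

/** Hypothetical model of pod-delete retry handling; not the operator's classes. */
final class PodDeleteRetrySketch {
  static final int HTTP_OK = 200;
  static final int HTTP_NOT_FOUND = 404;
  static final int TIMED_OUT = -1; // assumed marker for a request that timed out

  /** CONFIRMED: the pod is deleted or already gone. UNCONFIRMED: the caller must keep waiting. */
  enum Outcome { CONFIRMED, UNCONFIRMED }

  /** Decides whether another delete attempt is allowed. */
  interface RetryPolicy {
    boolean shouldRetry(int attemptsSoFar, int status);
  }

  // Assumed stand-in for a default strategy: retry timeouts, up to three attempts in total.
  static final RetryPolicy RETRY_ON_TIMEOUT =
      (attempts, status) -> status == TIMED_OUT && attempts < 3;

  // Stand-in for the custom strategy described above: never retry the delete.
  static final RetryPolicy NO_RETRY = (attempts, status) -> false;

  /** Issues the delete call until it is confirmed or the policy refuses another attempt. */
  static Outcome deletePod(IntSupplier deleteCall, RetryPolicy policy) {
    int attempts = 0;
    while (true) {
      int status = deleteCall.getAsInt();
      attempts++;
      // "404 - Not Found" means the pod no longer exists, so it counts as success.
      if (status == HTTP_OK || status == HTTP_NOT_FOUND) {
        return Outcome.CONFIRMED;
      }
      if (!policy.shouldRetry(attempts, status)) {
        return Outcome.UNCONFIRMED;
      }
    }
  }

  public static void main(String[] args) {
    int[] calls = {0};
    // Simulated server: the first two calls time out, the third reports 404.
    IntSupplier flakyDelete = () -> ++calls[0] < 3 ? TIMED_OUT : HTTP_NOT_FOUND;
    System.out.println(deletePod(flakyDelete, RETRY_ON_TIMEOUT) + " after " + calls[0] + " calls");
  }
}
```

With `NO_RETRY`, a timed-out delete is issued exactly once and reported as unconfirmed, which leaves it to whatever waits for the pod to terminate to decide when the next shutdown may start.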
The integration test suite passed with the latest changes - https://build.weblogick8s.org:8443/job/weblogic-kubernetes-operator-kind-new/1774/