Owls85582 take all ALWAYS servers before considering the rest of the servers when meeting the cluster replicas requirement #2020

Merged
merged 6 commits into from
Nov 3, 2020

@@ -30,9 +30,11 @@
 import oracle.kubernetes.operator.work.NextAction;
 import oracle.kubernetes.operator.work.Packet;
 import oracle.kubernetes.operator.work.Step;
+import oracle.kubernetes.utils.OperatorUtils;
 import oracle.kubernetes.weblogic.domain.model.Domain;
 import oracle.kubernetes.weblogic.domain.model.ServerSpec;

+import static java.util.Comparator.comparing;
 import static oracle.kubernetes.operator.DomainStatusUpdater.MANAGED_SERVERS_STARTING_PROGRESS_REASON;
 import static oracle.kubernetes.operator.DomainStatusUpdater.createProgressingStep;

@@ -118,23 +120,30 @@ public NextAction apply(Packet packet) {
   private void addServersToFactory(@Nonnull ServersUpStepFactory factory, @Nonnull WlsDomainConfig wlsDomainConfig) {
     Set<String> clusteredServers = new HashSet<>();

+    List<ServerConfig> pendingServers = new ArrayList<>();
     wlsDomainConfig.getClusterConfigs().values()
-        .forEach(wlsClusterConfig -> addClusteredServersToFactory(factory, clusteredServers, wlsClusterConfig));
+        .forEach(wlsClusterConfig -> addClusteredServersToFactory(
+            factory, clusteredServers, wlsClusterConfig, pendingServers));

     wlsDomainConfig.getServerConfigs().values().stream()
         .filter(wlsServerConfig -> !clusteredServers.contains(wlsServerConfig.getName()))
-        .forEach(wlsServerConfig -> factory.addServerIfNeeded(wlsServerConfig, null));
+        .forEach(wlsServerConfig -> factory.addServerIfAlways(wlsServerConfig, null, pendingServers));
+
+    for (ServerConfig serverConfig : pendingServers) {
+      factory.addServerIfNeeded(serverConfig.wlsServerConfig, serverConfig.wlsClusterConfig);

Would this approach affect the guaranteed 'lexi numeric' order in which we start or shut down servers, or which servers are reported in status? (For example, when we're shutting down a cluster's servers one at a time, the goal is to shut down the 'highest' server first, then the second highest, and so on.)

Member Author

The approach considers this. The pending list is in the original order. We have unit test cases covering this too.

Member Author

@doxiao doxiao Oct 30, 2020

But the final list does not maintain the original order. For example, if server3 is ALWAYS and the replica count is 2, server1 and server3 will be started, and server3 will be started before server1.

Member Author

@doxiao doxiao Oct 30, 2020

In the same example, if the cluster later scales down, server1 will be taken down, which is correct.

Member Author

The only behavior difference is the startup order among servers that need to be started in the same round of the make-right check: servers with the ALWAYS policy will be started before servers with the IF_NEEDED policy.
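
The startup-order effect described in this thread can be illustrated with a small standalone sketch. It is a simplified model and not the operator's `ServersUpStepFactory`: the class `AlwaysFirstSelectionSketch`, its `select` method, and the use of plain server-name strings are assumptions made for illustration, and the replica check is reduced to a simple count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical model of the two-pass selection; not the operator's real classes.
final class AlwaysFirstSelectionSketch {

  // Returns the selected servers in the order in which they were selected.
  static List<String> select(List<String> sortedServers, Set<String> alwaysServers, int replicas) {
    List<String> selected = new ArrayList<>();
    List<String> pending = new ArrayList<>();

    // Pass 1: ALWAYS servers are taken unconditionally; the others wait in their original order.
    for (String name : sortedServers) {
      if (alwaysServers.contains(name)) {
        selected.add(name);
      } else {
        pending.add(name);
      }
    }

    // Pass 2: fill any remaining replica slots from the pending list.
    for (String name : pending) {
      if (selected.size() < replicas) {
        selected.add(name);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<String> servers = List.of("server1", "server2", "server3");
    // prints [server3, server1]
    System.out.println(select(servers, Set.of("server3"), 2));
  }
}
```

With `server3` marked ALWAYS and a replica count of 2, the sketch picks `server3` before `server1`, which is the ordering difference raised in the comments.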

We want the overall startup and shutdown order to be very intuitive, predictable, and the exact reverse of each other.

Question 1: I think it'd be better if server1 always started first in a cluster, then server2, and so on, if they aren't configured to start concurrently, regardless of whether any of the servers are marked 'always'. So, based on your analysis of the cluster use case above, where server3 (marked always) starts before server1 (marked if_needed), it sounds like this pull request may need to be refined?

Question 2: As for shutdown, if the replica count is 3, and server1 and server3 are 'if needed' while server2 is 'always', then reducing the replica count to 1 (or 0) should always shut down server3 first and then server1 second. (Side note: when setting the entire cluster to NEVER, the entire cluster is expected to shut down concurrently, so no worries there.) Based on your analysis, it sounds like this will be honored?
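
The scenario in Question 2 can be made concrete with a minimal sketch. It assumes that ALWAYS servers are kept regardless of the replica count, that remaining slots go to the lowest-named servers, and that shutdown proceeds from the highest name down. The class `ShutdownOrderSketch` and its `shutdownOrder` method are hypothetical and are not part of the operator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical model of the expected shutdown order; not the operator's real logic.
final class ShutdownOrderSketch {

  // runningSorted must already be in ascending name order.
  static List<String> shutdownOrder(List<String> runningSorted, Set<String> alwaysServers, int newReplicas) {
    List<String> keep = new ArrayList<>();

    // ALWAYS servers stay up and count toward the replica total.
    for (String name : runningSorted) {
      if (alwaysServers.contains(name)) {
        keep.add(name);
      }
    }

    // Remaining slots go to the lowest-named servers.
    for (String name : runningSorted) {
      if (!alwaysServers.contains(name) && keep.size() < newReplicas) {
        keep.add(name);
      }
    }

    // Everything else is stopped, highest name first.
    List<String> toStop = new ArrayList<>(runningSorted);
    toStop.removeAll(keep);
    Collections.reverse(toStop);
    return toStop;
  }

  public static void main(String[] args) {
    List<String> running = List.of("server1", "server2", "server3");
    // prints [server3, server1]
    System.out.println(shutdownOrder(running, Set.of("server2"), 1));
  }
}
```

Under these assumptions, scaling from 3 replicas to 1 with `server2` marked ALWAYS stops `server3` first and `server1` second, matching the order described in the question.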

Member

I understand why you started the ALWAYS servers first, but I agree that it's more important that servers be started (and stopped) in a predictable and consistent order. We've had a few customers ask about startup ordering, so I could see us doing something like that in a future release. I think that means we need to keep the order very, very simple now so that customers would understand what they are configuring later.

Member Author

@doxiao doxiao Oct 30, 2020

"Yes" to both questions. Good catch on the sequential startup order. I agree we need to re-sort the final list.

+    }
   }

-  private void addClusteredServersToFactory(@Nonnull ServersUpStepFactory factory, Set<String> clusteredServers,
-      @Nonnull WlsClusterConfig wlsClusterConfig) {
+  private void addClusteredServersToFactory(
+      @Nonnull ServersUpStepFactory factory, Set<String> clusteredServers,
+      @Nonnull WlsClusterConfig wlsClusterConfig, List<ServerConfig> pendingServers) {
     factory.logIfInvalidReplicaCount(wlsClusterConfig);
     // We depend on 'getServerConfigs()' returning an ascending 'numero-lexi'
     // sorted list so that a cluster's "lowest named" servers have precedence
     // when the cluster's replica count is lower than the WL cluster size.
     wlsClusterConfig.getServerConfigs()
         .forEach(wlsServerConfig -> {
-          factory.addServerIfNeeded(wlsServerConfig, wlsClusterConfig);
+          factory.addServerIfAlways(wlsServerConfig, wlsClusterConfig, pendingServers);
           clusteredServers.add(wlsServerConfig.getName());
         });
   }
@@ -148,7 +157,7 @@ Step createServerStep(
   static class ServersUpStepFactory {
     final WlsDomainConfig domainTopology;
     final Domain domain;
-    Collection<ServerStartupInfo> startupInfos;
+    List<ServerStartupInfo> startupInfos;
     List<ServerShutdownInfo> shutdownInfos = new ArrayList<>();
     final Collection<String> servers = new ArrayList<>();
     final Collection<String> preCreateServers = new ArrayList<>();
@@ -176,20 +185,15 @@ boolean shouldPrecreateServerService(ServerSpec server) {

     private void addServerIfNeeded(@Nonnull WlsServerConfig serverConfig, WlsClusterConfig clusterConfig) {
       String serverName = serverConfig.getName();
-      if (servers.contains(serverName) || serverName.equals(domainTopology.getAdminServerName())) {
+      if (adminServerOrDone(serverName)) {
         return;
       }

-      String clusterName = clusterConfig == null ? null : clusterConfig.getClusterName();
+      String clusterName = getClusterName(clusterConfig);
       ServerSpec server = domain.getServer(serverName, clusterName);

       if (server.shouldStart(getReplicaCount(clusterName))) {
-        servers.add(serverName);
-        if (shouldPrecreateServerService(server)) {
-          preCreateServers.add(serverName);
-        }
-        addStartupInfo(new ServerStartupInfo(serverConfig, clusterName, server));
-        addToCluster(clusterName);
+        addServerToStart(serverConfig, clusterName, server);
       } else if (shouldPrecreateServerService(server)) {
         preCreateServers.add(serverName);
         addShutdownInfo(new ServerShutdownInfo(serverConfig, clusterName, server, true));
@@ -198,6 +202,15 @@ private void addServerIfNeeded(@Nonnull WlsServerConfig serverConfig, WlsCluster
       }
     }

+    private void addServerToStart(@Nonnull WlsServerConfig serverConfig, String clusterName, ServerSpec server) {
+      servers.add(serverConfig.getName());
+      if (shouldPrecreateServerService(server)) {
+        preCreateServers.add(serverConfig.getName());
+      }
+      addStartupInfo(new ServerStartupInfo(serverConfig, clusterName, server));
+      addToCluster(clusterName);
+    }
+
     boolean exceedsMaxConfiguredClusterSize(WlsClusterConfig clusterConfig) {
       if (clusterConfig != null) {
         String clusterName = clusterConfig.getClusterName();
@@ -218,6 +231,11 @@ private Step createNextStep(Step next) {
     }

     Collection<ServerStartupInfo> getStartupInfos() {
+      if (startupInfos != null) {
+        Collections.sort(
+            startupInfos,
+            comparing((ServerStartupInfo sinfo) -> OperatorUtils.getSortingString(sinfo.getServerName())));
+      }
       return startupInfos;
     }
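
The re-sort in `getStartupInfos()` relies on `OperatorUtils.getSortingString`, whose implementation is not shown in this diff. As a rough illustration of why a numeric-aware key matters here, the sketch below uses a hypothetical `sortingKey` helper that zero-pads a trailing number. It is an assumption for demonstration only and may differ from what the operator actually does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of re-sorting selected servers by a numeric-aware key.
final class StartupOrderSortSketch {
  private static final Pattern TRAILING_DIGITS = Pattern.compile("^(.*?)(\\d+)$");

  // Assumed key: zero-pad a trailing number so that "server10" sorts after "server2".
  static String sortingKey(String name) {
    Matcher m = TRAILING_DIGITS.matcher(name);
    if (!m.matches()) {
      return name;
    }
    return m.group(1) + String.format(Locale.ROOT, "%08d", Long.parseLong(m.group(2)));
  }

  // Returns a sorted copy; the input list is left untouched.
  static List<String> sortForStartup(List<String> selected) {
    List<String> sorted = new ArrayList<>(selected);
    sorted.sort(Comparator.comparing(StartupOrderSortSketch::sortingKey));
    return sorted;
  }

  public static void main(String[] args) {
    // ALWAYS server selected first, then the others; the sort restores name order.
    // prints [server1, server3, server10]
    System.out.println(sortForStartup(List.of("server3", "server10", "server1")));
  }
}
```

Sorting once at read time, as the diff does, keeps the two-pass selection simple while still presenting the servers in ascending name order.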

@@ -291,5 +309,41 @@ private void logIfInvalidReplicaCount(WlsClusterConfig clusterConfig) {
       logIfReplicasExceedsClusterServersMax(clusterConfig);
       logIfReplicasLessThanClusterServersMin(clusterConfig);
     }
+
+    private void addServerIfAlways(
+        WlsServerConfig wlsServerConfig,
+        WlsClusterConfig wlsClusterConfig,
+        List<ServerConfig> pendingServers) {
+      String serverName = wlsServerConfig.getName();
+      if (adminServerOrDone(serverName)) {
+        return;
+      }
+      String clusterName = getClusterName(wlsClusterConfig);
+      ServerSpec server = domain.getServer(serverName, clusterName);
+      if (server.alwaysStart()) {
+        addServerToStart(wlsServerConfig, clusterName, server);
+      } else {
+        pendingServers.add(new ServerConfig(wlsClusterConfig, wlsServerConfig));
+      }
+    }
+
+    private boolean adminServerOrDone(String serverName) {
+      return servers.contains(serverName) || serverName.equals(domainTopology.getAdminServerName());
+    }
+
+    private static String getClusterName(WlsClusterConfig clusterConfig) {
+      return clusterConfig == null ? null : clusterConfig.getClusterName();
+    }
+
   }
+
+  private static class ServerConfig {
+    protected WlsServerConfig wlsServerConfig;
+    protected WlsClusterConfig wlsClusterConfig;
+
+    ServerConfig(WlsClusterConfig cluster, WlsServerConfig server) {
+      this.wlsClusterConfig = cluster;
+      this.wlsServerConfig = server;
+    }
+  }
 }
@@ -25,4 +25,12 @@ public boolean shouldStart(int currentReplicas) {
     }
     return super.shouldStart(currentReplicas);
   }
+
+  @Override
+  public boolean alwaysStart() {
+    if (isStartAdminServerOnly()) {
+      return false;
+    }
+    return super.alwaysStart();
+  }
 }
@@ -160,4 +160,6 @@ public interface ServerSpec {
   String getClusterRestartVersion();

   String getServerRestartVersion();
+
+  boolean alwaysStart();
 }
@@ -142,6 +142,10 @@ private ServerStartPolicy getEffectiveServerStartPolicy() {
         .orElse(ServerStartPolicy.getDefaultPolicy());
   }

+  public boolean alwaysStart() {
+    return ServerStartPolicy.ALWAYS.equals(getEffectiveServerStartPolicy());
+  }
+
   @Nonnull
   @Override
   public ProbeTuning getLivenessProbe() {
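
The two `alwaysStart()` methods in this diff appear to layer an "admin server only" guard on top of a plain policy check. The sketch below models that layering with stand-in types. `StartPolicy`, `BaseSpec`, and `ManagedSpec` are hypothetical names invented for illustration and are not the operator's real classes.

```java
// Hypothetical model of the alwaysStart() layering; all type names are stand-ins.
final class AlwaysStartSketch {

  enum StartPolicy { NEVER, IF_NEEDED, ALWAYS }

  // Base behavior: a server always starts only when its effective policy is ALWAYS.
  static class BaseSpec {
    private final StartPolicy effectivePolicy;

    BaseSpec(StartPolicy effectivePolicy) {
      this.effectivePolicy = effectivePolicy;
    }

    boolean alwaysStart() {
      return StartPolicy.ALWAYS.equals(effectivePolicy);
    }
  }

  // Managed-server behavior: an "admin server only" domain setting overrides ALWAYS.
  static class ManagedSpec extends BaseSpec {
    private final boolean startAdminServerOnly;

    ManagedSpec(StartPolicy effectivePolicy, boolean startAdminServerOnly) {
      super(effectivePolicy);
      this.startAdminServerOnly = startAdminServerOnly;
    }

    @Override
    boolean alwaysStart() {
      if (startAdminServerOnly) {
        return false;
      }
      return super.alwaysStart();
    }
  }

  public static void main(String[] args) {
    System.out.println(new ManagedSpec(StartPolicy.ALWAYS, false).alwaysStart());    // prints true
    System.out.println(new ManagedSpec(StartPolicy.ALWAYS, true).alwaysStart());     // prints false
    System.out.println(new ManagedSpec(StartPolicy.IF_NEEDED, false).alwaysStart()); // prints false
  }
}
```

Keeping the guard in the managed-server subclass means the selection code only has to ask `alwaysStart()` and does not need to know about the domain-level setting.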