Support output key parameters in the booting logs. (#11506)
wu-sheng authored Nov 6, 2023
1 parent a34a0b7 commit f8c6855
Showing 26 changed files with 280 additions and 55 deletions.
3 changes: 3 additions & 0 deletions docs/en/changes/changes.md
@@ -36,6 +36,7 @@
* Support GraalVM native-image (Experimental).
* Correct the file format and fix typos in the filenames for monitoring Kafka's e2e tests.
* Support extract timestamp from patterned datetime string in LAL.
* Support output key parameters in the booting logs.

#### UI

@@ -60,5 +61,7 @@
* Add missing metrics to the `OpenTelemetry Metrics` doc.
* Polish docs of `Concepts and Designs`.
* Fix incorrect notes of slowCacheReadThreshold.
* Update OAP setup and cluster coordinator docs to explain the new booting parameters table in the logs and how to set up
cluster mode.

All issues and pull requests are [here](https://github.com/apache/skywalking/milestone/193?closed=1)
39 changes: 27 additions & 12 deletions docs/en/setup/backend/backend-cluster.md
@@ -25,6 +25,7 @@ There are various ways to manage the cluster in the backend. Choose the one that
In the `application.yml` file, there are default configurations for the aforementioned coordinators under the
section `cluster`. You can specify any of them in the `selector` property to enable it.

# Cloud Native
## Kubernetes

The required backend clusters are deployed inside Kubernetes. See the guides in [Deploy in kubernetes](backend-k8s.md).
@@ -54,6 +55,20 @@ containers:
Read [the complete helm](https://github.com/apache/skywalking-helm/blob/476afd51d44589c77a4cbaac950272cd5d064ea9/chart/skywalking/templates/oap-deployment.yaml#L125) for more details.
# Traditional Coordinator
**NOTICE**
In all the following coordinators, `oap.internal.comm.host`:`oap.internal.comm.port` is registered as the ID
and address of the current OAP node. By default, these values are the same on all OAP nodes, so the registrations conflict
and may show up as a single registered node, which is actually the node itself. **In this case, the cluster mode is NOT working.**

Please check the registered nodes on your coordinator servers and make the registration information unique for every node.
You have two options:

1. Change `core/gRPCHost`(`oap.internal.comm.host`) and `core/gRPCPort`(`oap.internal.comm.port`) for internal communication,
and [set up external communication channels](backend-expose.md) for data reporting and query.
2. Use `internalComHost` and `internalComPort` in the config to provide a unique host and port for every OAP node. This
host and port should be accessible from other OAP nodes.
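
The conflict described in this notice can be illustrated with a minimal Java sketch. It is not the coordinator code: the class and method names are hypothetical, the `host:port` string format for the ID is an assumption based on the text above, and the `10.0.0.x` addresses are made up for illustration.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not part of SkyWalking: shows why identical
// host:port registrations collapse into a single visible node.
public class RegistrationConflictSketch {

    // Assumed ID format: host and port joined by ':'.
    public static String registrationId(String host, int port) {
        return host + ":" + port;
    }

    // A coordinator keyed by ID can only hold one entry per distinct ID.
    public static Set<String> distinctRegistrations(List<String> ids) {
        return new LinkedHashSet<>(ids);
    }

    // True only when every node registered a different ID.
    public static boolean allUnique(List<String> ids) {
        return distinctRegistrations(ids).size() == ids.size();
    }

    public static void main(String[] args) {
        // Three nodes left on the default host and port.
        List<String> defaults = List.of(
            registrationId("0.0.0.0", 11800),
            registrationId("0.0.0.0", 11800),
            registrationId("0.0.0.0", 11800));
        System.out.println(distinctRegistrations(defaults)); // [0.0.0.0:11800]
        System.out.println(allUnique(defaults));             // false

        // Three nodes with a unique internal host each (made-up addresses).
        List<String> unique = List.of(
            registrationId("10.0.0.1", 11800),
            registrationId("10.0.0.2", 11800),
            registrationId("10.0.0.3", 11800));
        System.out.println(allUnique(unique));               // true
    }
}
```

If only one entry appears on the coordinator server for a multi-node deployment, it likely means the nodes registered the same ID, and one of the two options above is needed.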

## Zookeeper coordinator

@@ -85,11 +100,11 @@ Note:
- If you set `schema` as `digest`, the password of the expression is set in **clear text**.

In some cases, the OAP default gRPC host and port in the core are not suitable for internal communication among the OAP
nodes.
nodes; for example, the default host (`0.0.0.0`) should not be used in cluster mode.
The following settings are provided to set the host and port manually, based on your own LAN env.

- internalComHost: The registered host and other OAP nodes use this to communicate with the current node.
- internalComPort: the registered port and other OAP nodes use this to communicate with the current node.
- internalComHost: The exposed host name that other OAP nodes use for cluster internal communication.
- internalComPort: The exposed port that other OAP nodes use for cluster internal communication.

```yaml
zookeeper:
```
@@ -119,11 +134,11 @@ cluster:

Same as the Zookeeper coordinator,
in some cases, the OAP default gRPC host and port in the core are not suitable for internal communication among the OAP
nodes.
nodes; for example, the default host (`0.0.0.0`) should not be used in cluster mode.
The following settings are provided to set the host and port manually, based on your own LAN env.

- internalComHost: The registered host and other OAP nodes use this to communicate with the current node.
- internalComPort: The registered port and other OAP nodes use this to communicate with the current node.
- internalComHost: The exposed host name that other OAP nodes use for cluster internal communication.
- internalComPort: The exposed port that other OAP nodes use for cluster internal communication.

## Etcd

@@ -146,11 +161,11 @@ cluster:

Same as the Zookeeper coordinator,
in some cases, the OAP default gRPC host and port in the core are not suitable for internal communication among the OAP
nodes.
nodes; for example, the default host (`0.0.0.0`) should not be used in cluster mode.
The following settings are provided to set the host and port manually, based on your own LAN env.

- internalComHost: The registered host and other OAP nodes use this to communicate with the current node.
- internalComPort: The registered port and other OAP nodes use this to communicate with the current node.
- internalComHost: The exposed host name that other OAP nodes use for cluster internal communication.
- internalComPort: The exposed port that other OAP nodes use for cluster internal communication.

## Nacos

@@ -175,8 +190,8 @@ nacos:

Same as the Zookeeper coordinator,
in some cases, the OAP default gRPC host and port in the core are not suitable for internal communication among the OAP
nodes.
nodes; for example, the default host (`0.0.0.0`) should not be used in cluster mode.
The following settings are provided to set the host and port manually, based on your own LAN env.

- internalComHost: The registered host and other OAP nodes use this to communicate with the current node.
- internalComPort: The registered port and other OAP nodes use this to communicate with the current node.
- internalComHost: The exposed host name that other OAP nodes use for cluster internal communication.
- internalComPort: The exposed port that other OAP nodes use for cluster internal communication.
67 changes: 56 additions & 11 deletions docs/en/setup/backend/backend-setup.md
@@ -43,6 +43,62 @@ The default startup scripts are `/bin/oapService.sh`(.bat).
Read the [start up mode](backend-start-up-mode.md) document to learn other ways to start up the backend.


### Key Parameters In The Booting Logs
After the OAP booting process has completed, you should be able to see all important parameters listed in the logs.

```
2023-11-06 21:10:45,988 org.apache.skywalking.oap.server.starter.OAPServerBootstrap 67 [main] INFO [] - The key booting parameters of Apache SkyWalking OAP are listed as following.
Running Mode | null
TTL.metrics | 7
TTL.record | 3
Version | 9.7.0-SNAPSHOT-92af797
module.agent-analyzer.provider | default
module.ai-pipeline.provider | default
module.alarm.provider | default
module.aws-firehose.provider | default
module.cluster.provider | standalone
module.configuration-discovery.provider | default
module.configuration.provider | none
module.core.provider | default
module.envoy-metric.provider | default
module.event-analyzer.provider | default
module.log-analyzer.provider | default
module.logql.provider | default
module.promql.provider | default
module.query.provider | graphql
module.receiver-browser.provider | default
module.receiver-clr.provider | default
module.receiver-ebpf.provider | default
module.receiver-event.provider | default
module.receiver-jvm.provider | default
module.receiver-log.provider | default
module.receiver-meter.provider | default
module.receiver-otel.provider | default
module.receiver-profile.provider | default
module.receiver-register.provider | default
module.receiver-sharing-server.provider | default
module.receiver-telegraf.provider | default
module.receiver-trace.provider | default
module.service-mesh.provider | default
module.storage.provider | h2
module.telemetry.provider | none
oap.external.grpc.host | 0.0.0.0
oap.external.grpc.port | 11800
oap.external.http.host | 0.0.0.0
oap.external.http.port | 12800
oap.internal.comm.host | 0.0.0.0
oap.internal.comm.port | 11800
```

- `oap.external.grpc.host`:`oap.external.grpc.port` is for reporting telemetry data through the gRPC channel, including
native agents and OTEL.
- `oap.external.http.host`:`oap.external.http.port` is for reporting telemetry data through the HTTP channel and for queries,
including native GraphQL (UI), PromQL, and LogQL.
- `oap.internal.comm.host`:`oap.internal.comm.port` is for OAP cluster internal communication via the gRPC/HTTP2 protocol.
The default host (`0.0.0.0`) is not suitable for cluster mode, except in a k8s deployment. Please
read the [Cluster Doc](backend-cluster.md) to understand how to set up the SkyWalking backend in cluster mode.
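
The two-column layout of the table above can be reproduced with a short Java sketch. This is not the OAP implementation: the `BootingTableSketch` class and its `render` method are hypothetical, and sorting the keys is an assumption made only because the sample log appears ordered by key.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not the OAP's own table class: renders key/value
// pairs as "key | value" rows with the key column padded to equal width.
public class BootingTableSketch {

    public static String render(Map<String, String> params) {
        // Assumption: rows are ordered by key (natural String order).
        TreeMap<String, String> sorted = new TreeMap<>(params);

        // The widest key decides the padding of the first column.
        int width = 1;
        for (String key : sorted.keySet()) {
            width = Math.max(width, key.length());
        }

        StringBuilder out = new StringBuilder();
        String rowFormat = "%-" + width + "s | %s";
        for (Map.Entry<String, String> row : sorted.entrySet()) {
            out.append(String.format(rowFormat, row.getKey(), row.getValue()))
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Values copied from the sample log above.
        Map<String, String> params = Map.of(
            "oap.internal.comm.host", "0.0.0.0",
            "oap.internal.comm.port", "11800",
            "TTL.metrics", "7",
            "TTL.record", "3");
        System.out.print(render(params));
    }
}
```

A fixed-width key column keeps the values aligned, which makes entries such as `oap.internal.comm.host` easy to spot when checking a node before joining it to a cluster.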

## application.yml
SkyWalking backend startup behaviours are driven by `config/application.yml`. Understanding the settings file will help you read this document.

@@ -107,14 +163,3 @@ For example, metrics time will be formatted like yyyyMMddHHmm in minute dimensio
By default, SkyWalking's OAP backend chooses the **OS default timezone**.
Please follow the Java and OS documents if you want to override the timezone.

#### How to query the storage directly from a 3rd party tool?
SkyWalking provides different options based on browser UI, CLI and GraphQL to support extensions. But some users may want to query data
directly from the storage. For example, in the case of ElasticSearch, Kibana is a great tool for doing this.

By default, SkyWalking saves based64-encoded ID(s) only in metrics entities to reduce memory, network and storage space usages.
But these tools usually don't support nested queries and are not convenient to work with. For these exceptional reasons,
SkyWalking provides a config to add all necessary name column(s) into the final metrics entities with ID as a trade-off.

Take a look at `core/default/activeExtraModelColumns` config in the `application.yaml`, and set it as `true` to enable this feature.

Note that this feature is simply for 3rd party integration and doesn't provide any new features to native SkyWalking use cases.
@@ -18,6 +18,9 @@

package org.apache.skywalking.oap.server.cluster.plugin.zookeeper;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import lombok.Getter;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.skywalking.oap.server.core.cluster.ClusterCoordinator;
@@ -45,10 +48,6 @@
import org.testcontainers.junit.jupiter.Testcontainers;
import org.testcontainers.utility.DockerImageName;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;
@@ -73,7 +72,7 @@ public class ClusterModuleZookeeperProviderFunctionalIT {
@BeforeEach
public void init() {
Mockito.when(telemetryProvider.getService(MetricsCreator.class))
.thenReturn(new MetricsCreatorNoop());
.thenReturn(new MetricsCreatorNoop());
TelemetryModule telemetryModule = Mockito.spy(TelemetryModule.class);
Whitebox.setInternalState(telemetryModule, "loadedProvider", telemetryProvider);
Mockito.when(moduleManager.find(TelemetryModule.NAME)).thenReturn(telemetryModule);
@@ -229,7 +228,7 @@ private ClusterModuleZookeeperProvider createProvider(String namespace) throws E
}

private ClusterModuleZookeeperProvider createProvider(String namespace, String internalComHost,
int internalComPort) throws Exception {
int internalComPort) throws Exception {
ClusterModuleZookeeperProvider provider = new ClusterModuleZookeeperProvider();
provider.setManager(moduleManager);
ClusterModuleZookeeperConfig moduleConfig = new ClusterModuleZookeeperConfig();
@@ -95,7 +95,7 @@ public void setUp() throws Exception {
final ApplicationConfiguration applicationConfiguration = new ApplicationConfiguration();
loadConfig(applicationConfiguration);

final ModuleManager moduleManager = new ModuleManager();
final ModuleManager moduleManager = new ModuleManager("Test");
moduleManager.init(applicationConfiguration);

provider = (ApolloConfigurationTestProvider) moduleManager.find(ApolloConfigurationTestModule.NAME).provider();
@@ -65,7 +65,7 @@ public void setUp() throws Exception {
final ApplicationConfiguration applicationConfiguration = new ApplicationConfiguration();
loadConfig(applicationConfiguration);

final ModuleManager moduleManager = new ModuleManager();
final ModuleManager moduleManager = new ModuleManager("Test");
moduleManager.init(applicationConfiguration);

provider = (ConsulConfigurationTestProvider) moduleManager.find(ConsulConfigurationTestModule.NAME).provider();
@@ -73,7 +73,7 @@ public void before() throws Exception {
final ApplicationConfiguration applicationConfiguration = new ApplicationConfiguration();
loadConfig(applicationConfiguration);

final ModuleManager moduleManager = new ModuleManager();
final ModuleManager moduleManager = new ModuleManager("Test");
moduleManager.init(applicationConfiguration);

provider = (EtcdConfigurationTestProvider) moduleManager.find(EtcdConfigurationTestModule.NAME).provider();
@@ -70,7 +70,7 @@ public void setUp() throws Exception {
final ApplicationConfiguration applicationConfiguration = new ApplicationConfiguration();
loadConfig(applicationConfiguration);

final ModuleManager moduleManager = new ModuleManager();
final ModuleManager moduleManager = new ModuleManager("Test");
moduleManager.init(applicationConfiguration);

provider = (NacosConfigurationTestProvider) moduleManager.find(NacosConfigurationTestModule.NAME).provider();
@@ -71,7 +71,7 @@ public void setUp() throws Exception {
final ApplicationConfiguration applicationConfiguration = new ApplicationConfiguration();
loadConfig(applicationConfiguration);

final ModuleManager moduleManager = new ModuleManager();
final ModuleManager moduleManager = new ModuleManager("Test");
moduleManager.init(applicationConfiguration);

provider = (MockZookeeperConfigurationProvider) moduleManager.find(MockZookeeperConfigurationModule.NAME)
@@ -219,6 +219,11 @@ public void prepare() throws ServiceNotProvidedException, ModuleStartException {
} else {
grpcServer = new GRPCServer(moduleConfig.getGRPCHost(), moduleConfig.getGRPCPort());
}
setBootingParameter("oap.internal.comm.host", moduleConfig.getGRPCHost());
setBootingParameter("oap.internal.comm.port", moduleConfig.getGRPCPort());
setBootingParameter("oap.external.grpc.host", moduleConfig.getGRPCHost());
setBootingParameter("oap.external.grpc.port", moduleConfig.getGRPCPort());

if (moduleConfig.getMaxConcurrentCallsPerConnection() > 0) {
grpcServer.setMaxConcurrentCallsPerConnection(moduleConfig.getMaxConcurrentCallsPerConnection());
}
@@ -244,6 +249,8 @@ public void prepare() throws ServiceNotProvidedException, ModuleStartException {
.maxRequestHeaderSize(
moduleConfig.getHttpMaxRequestHeaderSize())
.build();
setBootingParameter("oap.external.http.host", moduleConfig.getRestHost());
setBootingParameter("oap.external.http.port", moduleConfig.getRestPort());
httpServer = new HTTPServer(httpServerConfig);
httpServer.initialize();

@@ -335,10 +342,12 @@ public void prepare() throws ServiceNotProvidedException, ModuleStartException {
throw new ModuleStartException(
"Metric TTL should be at least 2 days, current value is " + moduleConfig.getMetricsDataTTL());
}
setBootingParameter("TTL.metrics", moduleConfig.getMetricsDataTTL());
if (moduleConfig.getRecordDataTTL() < 2) {
throw new ModuleStartException(
"Record TTL should be at least 2 days, current value is " + moduleConfig.getRecordDataTTL());
}
setBootingParameter("TTL.record", moduleConfig.getRecordDataTTL());

final MetricsStreamProcessor metricsStreamProcessor = MetricsStreamProcessor.getInstance();
metricsStreamProcessor.setL1FlushPeriod(moduleConfig.getL1FlushPeriod());
@@ -28,6 +28,7 @@ public abstract class MockModuleManager extends ModuleManager {
private final Map<String, ModuleProviderHolder> moduleProviderHolderMap = Maps.newHashMap();

public MockModuleManager() {
super("Mock");
init();
}

@@ -28,6 +28,7 @@ public abstract class MockModuleManager extends ModuleManager {
private final Map<String, ModuleProviderHolder> moduleProviderHolderMap = Maps.newHashMap();

public MockModuleManager() {
super("Test");
init();
}

@@ -22,7 +22,7 @@
import java.util.Properties;

/**
* Modulization configurations. The {@link ModuleManager} is going to start, lookup, start modules based on this.
* Modularization configurations. The {@link ModuleManager} is going to start, lookup, start modules based on this.
*/
public class ApplicationConfiguration {
private HashMap<String, ModuleConfiguration> modules = new HashMap<>();
@@ -60,8 +60,11 @@ public final String name() {
* @param configuration of this module
* @throws ProviderNotFoundException when even don't find a single one providers.
*/
void prepare(ModuleManager moduleManager, ApplicationConfiguration.ModuleConfiguration configuration,
ServiceLoader<ModuleProvider> moduleProviderLoader) throws ProviderNotFoundException, ServiceNotProvidedException, ModuleConfigException, ModuleStartException {
void prepare(ModuleManager moduleManager,
ApplicationConfiguration.ModuleConfiguration configuration,
ServiceLoader<ModuleProvider> moduleProviderLoader,
TerminalFriendlyTable bootingParameters)
throws ProviderNotFoundException, ServiceNotProvidedException, ModuleConfigException, ModuleStartException {
for (ModuleProvider provider : moduleProviderLoader) {
if (!configuration.has(provider.name())) {
continue;
@@ -72,6 +75,7 @@ void prepare(ModuleManager moduleManager, ApplicationConfiguration.ModuleConfigu
loadedProvider = provider;
loadedProvider.setManager(moduleManager);
loadedProvider.setModuleDefine(this);
loadedProvider.setBootingParameters(bootingParameters);
} else {
throw new DuplicateProviderException(
this.name() + " module has one " + loadedProvider.name() + "[" + loadedProvider