First and most importantly, SkyWalking backend startup behaviours are driven by config/application.yml.
Understanding this setting file will help you read this document.
The default startup scripts are /bin/oapService.sh (.bat).
Read the start up mode document to learn about other options for starting the backend.
The core concept behind this setting file is that the SkyWalking collector is based on a pure modularization design. End users can switch or assemble the collector features according to their own requirements.
So, in application.yml, there are three levels.
- Level 1, module name. Meaning this module is active in running mode.
- Level 2, provider option list and provider selector. Available providers are listed here with a selector to indicate which one will actually take effect. If there is only one provider listed, the selector is optional and can be omitted.
- Level 3, settings of the provider.
Example:
storage:
  selector: mysql # the mysql storage will actually be activated, while the h2 storage takes no effect
  h2:
    driver: ${SW_STORAGE_H2_DRIVER:org.h2.jdbcx.JdbcDataSource}
    url: ${SW_STORAGE_H2_URL:jdbc:h2:mem:skywalking-oap-db}
    user: ${SW_STORAGE_H2_USER:sa}
    metadataQueryMaxSize: ${SW_STORAGE_H2_QUERY_MAX_SIZE:5000}
  mysql:
    properties:
      jdbcUrl: ${SW_JDBC_URL:"jdbc:mysql://localhost:3306/swtest"}
      dataSource.user: ${SW_DATA_SOURCE_USER:root}
      dataSource.password: ${SW_DATA_SOURCE_PASSWORD:root@1234}
      dataSource.cachePrepStmts: ${SW_DATA_SOURCE_CACHE_PREP_STMTS:true}
      dataSource.prepStmtCacheSize: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_SIZE:250}
      dataSource.prepStmtCacheSqlLimit: ${SW_DATA_SOURCE_PREP_STMT_CACHE_SQL_LIMIT:2048}
      dataSource.useServerPrepStmts: ${SW_DATA_SOURCE_USE_SERVER_PREP_STMTS:true}
    metadataQueryMaxSize: ${SW_STORAGE_MYSQL_QUERY_MAX_SIZE:5000}
  # other configurations
- storage is the module.
- selector selects one of the providers listed below it; the unselected ones take no effect, as if they were deleted.
- h2 and mysql are the providers of the storage module. In the same way, default is the default implementor of the core module.
- driver, url, ..., metadataQueryMaxSize are all setting items of the implementor.
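The three levels can be illustrated with a small Python sketch. This is a simplified, hypothetical model and not SkyWalking's actual loader: the function names `resolve` and `active_provider` are invented for illustration, and reading `${NAME:default}` as "use the environment variable NAME if it is set, otherwise the default" is an assumption based on the example above.

```python
import os
import re

# Hypothetical, simplified model of the three levels; not SkyWalking's real loader.
# Assumption: "${NAME:default}" means "value of NAME if set, otherwise default".
PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_]+):(.*)\}")


def resolve(value, env):
    """Resolve `${NAME:default}` strings, recursing into nested setting maps."""
    if isinstance(value, dict):
        return {key: resolve(item, env) for key, item in value.items()}
    if isinstance(value, str):
        match = PLACEHOLDER.fullmatch(value)
        if match:
            name, default = match.groups()
            return env.get(name, default)
    return value


def active_provider(module, env=None):
    """Return (provider name, resolved settings) for one level-1 module."""
    env = os.environ if env is None else env
    # Level 2: every key except "selector" is a candidate provider.
    providers = {name: cfg for name, cfg in module.items() if name != "selector"}
    if "selector" in module:
        chosen = resolve(module["selector"], env)
    elif len(providers) == 1:
        chosen = next(iter(providers))  # selector may be omitted for a single provider
    else:
        raise ValueError("a selector is required when several providers are listed")
    if chosen not in providers:
        raise KeyError(f"selector names an unlisted provider: {chosen}")
    # Level 3: only the chosen provider's settings are resolved and returned.
    return chosen, resolve(providers[chosen] or {}, env)


# A trimmed copy of the storage example, expressed as a Python dict.
STORAGE = {
    "selector": "mysql",
    "h2": {"user": "${SW_STORAGE_H2_USER:sa}"},
    "mysql": {
        "properties": {"dataSource.user": "${SW_DATA_SOURCE_USER:root}"},
        "metadataQueryMaxSize": "${SW_STORAGE_MYSQL_QUERY_MAX_SIZE:5000}",
    },
}
```

Calling `active_provider(STORAGE, env={})` picks the mysql provider and ignores h2 entirely; passing `env={"SW_DATA_SOURCE_USER": "skywalking"}` replaces the default user in the resolved settings.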
At the same time, modules are either required or optional. The required modules provide the skeleton of the backend; even though modularization makes them pluggable, removing those modules is meaningless. As for optional modules, some of them have a provider implementation called none, meaning it only provides a shell with no actual logic; telemetry is a typical example.
Setting - as the selector means the whole module will be excluded at runtime.
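As a rough illustration of the difference between the none provider and the - selector, the following Python sketch filters a parsed configuration. It is a hypothetical model only: `enabled_modules` and the module name `example-optional` are invented for this example.

```python
def enabled_modules(config):
    """Return the module names that stay active: a '-' selector drops the whole module."""
    return [
        name
        for name, module in config.items()
        if (module or {}).get("selector") != "-"
    ]


# Hypothetical parsed configuration; "example-optional" is an invented module name.
CONFIG = {
    "core": {"selector": "default", "default": {}},
    "telemetry": {"selector": "none", "none": {}},  # still loaded, but only as a shell
    "example-optional": {"selector": "-"},          # excluded at runtime
}
```

Here `enabled_modules(CONFIG)` keeps core and telemetry: a module with the none provider is still present, while the module whose selector is - is left out entirely.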
We highly recommend that you don't change the APIs of those modules unless you understand the SkyWalking project and its code very well.
The required modules are listed here:
- Core. Provides the basic and major skeleton of all data analysis and stream dispatch.
- Cluster. Manages multiple backend instances in a cluster, which can provide high-throughput processing capabilities.
- Storage. Persists the analysis results.
- Query. Provides query interfaces to the UI.
Cluster and Storage have multiple implementors (providers); see the Cluster management and Choose storage documents in the link list.
Also, several receiver modules are provided.
A receiver is a module in charge of accepting incoming data requests to the backend. Most (if not all) provide their service over a network (RPC) protocol, such as gRPC or HTTP RESTful.
The receivers have many different module names; you can read the Set receivers document in the link list.
After you understand the setting file structure, you can choose the feature documents that interest you. We recommend reading the feature documents in the following order.
- Overriding settings in application.yml is supported
- IP and port setting. Introduces how the IP and port are set and used.
- Backend init mode startup. How to init the environment and exit gracefully. Read this before you try to initialize a new cluster.
- Cluster management. Guides you through setting up the backend server in cluster mode.
- Deploy in kubernetes. Guides you through building and using the SkyWalking image, and deploying in k8s.
- Choose storage. As we know, in the default quick start, the backend runs with the H2 DB. But clearly, it doesn't fit a production environment. Here you can find what other choices you have. Choose the one you like; we also welcome anyone to contribute a new storage implementor.
- Set receivers. You can choose receivers according to your requirements. Most receivers are harmless, at least our default receivers are. You can set and activate all the receivers provided.
- Token authentication. You can add token authentication mechanisms to prevent OAP from receiving untrusted data.
- Do trace sampling at backend. This sampling keeps the metrics accurate; it only skips saving some of the traces in storage, based on the rate.
- Follow the slow DB statement threshold config document to understand how to detect slow database statements (including SQL statements) in your system.
- Official OAL scripts. As you know from our OAL introduction, most of the backend analysis capabilities are based on these scripts. Here is the description of the official scripts, which helps you understand which metrics data are being processed and can also be used in alarms.
- Alarm. Alarm provides a time-series based check mechanism. You can set alarm rules targeting the metrics objects produced by OAL analysis.
- Advanced deployment options. If you want to deploy the backend at very large scale and support a high payload, you may need this.
- Metrics exporter. Use the metrics data exporter to forward metrics data to a 3rd party system.
- Time To Live (TTL). Metrics and traces are time series data; TTL settings affect when they expire.
- Dynamic Configuration. Allows the OAP configuration to be changed dynamically, from a remote service or a 3rd party configuration management system.
- Uninstrumented Gateways. Configure gateways/proxies that are not supported by SkyWalking agent plugins, to reflect the delegation in the topology graph.
Underneath, the OAP backend cluster is itself a distributed streaming process system. To help the Ops team, we provide telemetry for the OAP backend itself. Follow the document to use it.
IMPORTANT: Agent hot reboot requires both the OAP nodes and the agents to be version 6.3.0 or higher. The reboot procedure works through the heartbeat between the OAP nodes and the agents:
- The agent sends a heartbeat package to the OAP server;
- The OAP server has just restarted and finds no metadata for this agent, so it sends a reset command to that agent;
- The agent receives the reset command and re-registers itself to the OAP node.
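The three steps above can be sketched as a toy Python simulation. This is not the SkyWalking protocol or its API: the class names, the "OK"/"RESET" strings and the in-memory metadata map are all hypothetical stand-ins, and the cool-down wait is not modelled.

```python
class OapNode:
    """Toy stand-in for an OAP server that keeps agent metadata in memory."""

    def __init__(self):
        self.metadata = {}

    def register(self, agent_id, info):
        self.metadata[agent_id] = info

    def on_heartbeat(self, agent_id):
        # No metadata for this agent means the storage was wiped: ask for a reset.
        return "OK" if agent_id in self.metadata else "RESET"


class Agent:
    """Toy stand-in for an agent that re-registers when told to reset."""

    def __init__(self, agent_id, info):
        self.agent_id = agent_id
        self.info = info
        self.registrations = 0

    def register(self, oap):
        oap.register(self.agent_id, self.info)
        self.registrations += 1

    def heartbeat(self, oap):
        command = oap.on_heartbeat(self.agent_id)
        if command == "RESET":
            self.register(oap)  # step 3: re-register with the restarted node
        return command
```

In this model, replacing the `OapNode` with a fresh instance stands for a backend upgrade with all storage data erased: the agent's next heartbeat returns "RESET", and the one after that returns "OK" again.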
The agent reboot mechanism is not designed for every scenario where an agent needs to reboot, but only for the scenario where the backend servers are to be upgraded with all storage data deleted/erased. Therefore, there are some noteworthy limitations:
- Partially deleting the storage data may not work as expected; you MUST delete all the storage data.
- Set an appropriate threshold for the config agent.cool_down_threshold to wait before the agents re-register themselves to the backend, to avoid "dirty data"; see agent.cool_down_threshold for more detail.
SkyWalking provides downsampling of time series metrics. Query and storage at each time dimension (minute, hour, day, and month metrics indexes) depend on the timezone when formatting time.
For example, metrics time is formatted like YYYYMMDDHHmm for minute dimension metrics, and this formatting process is timezone-dependent.
By default, the SkyWalking OAP backend uses the OS default timezone. If you want to override it, please follow the Java and OS documents to do so.
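To see why the formatting is timezone-dependent, consider this minimal Python sketch. It is only an illustration, not SkyWalking code: the `time_bucket` helper is hypothetical, and the hour, day and month patterns are assumed by analogy with the YYYYMMDDHHmm minute format.

```python
from datetime import datetime, timedelta, timezone

# The minute pattern mirrors YYYYMMDDHHmm; the other three are assumptions by analogy.
PATTERNS = {
    "minute": "%Y%m%d%H%M",
    "hour": "%Y%m%d%H",
    "day": "%Y%m%d",
    "month": "%Y%m",
}


def time_bucket(instant, dimension, tz):
    """Format one instant as a bucket string in the given timezone."""
    return instant.astimezone(tz).strftime(PATTERNS[dimension])


# One fixed instant, viewed from three fixed-offset timezones.
INSTANT = datetime(2020, 1, 1, 0, 30, tzinfo=timezone.utc)
UTC = timezone.utc
UTC_PLUS_8 = timezone(timedelta(hours=8))
UTC_MINUS_5 = timezone(timedelta(hours=-5))
```

The same instant lands in minute bucket 202001010030 under UTC and 202001010830 under UTC+8; under UTC-5 it even falls into the previous day (20191231) and the previous month (201912). Data written under one timezone setting and queried under another would therefore be looked up in different buckets.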
SkyWalking provides browser UI, CLI and GraphQL ways to support extensions. But some users may want to query data directly from the storage. For example, in the ElasticSearch case, Kibana is a great tool to do this.
By default, to reduce memory, network and storage space usage, SkyWalking saves only ID(s) in the entity, and metadata is saved only in the *_inventory entities. But these tools usually don't support nested queries, or don't work conveniently with them. For this special case, SkyWalking provides a config to add all necessary name column(s) into the final metrics entities alongside the ID, as a trade-off.
Take a look at the core/default/activeExtraModelColumns config in application.yml, and set it to true to enable this feature.
This feature doesn't provide anything new for native SkyWalking scenarios; it is just for 3rd party integration.