Commit

chore(docs): optimize docs
CoderPoet committed Mar 5, 2024
1 parent ee19918 commit 11134bc
Showing 2 changed files with 110 additions and 89 deletions.
99 changes: 61 additions & 38 deletions README.md
@@ -1,33 +1,45 @@
# opentelemetry (This is a community driven project)

English | [中文](README_CN.md)

[OpenTelemetry](https://opentelemetry.io/) for [Kitex](https://github.com/cloudwego/kitex)

OpenTelemetry is an open source observability framework from the CNCF that consists of a set of tools, APIs, and SDKs. It enables IT teams to instrument, generate, collect, and export telemetry data for analyzing and understanding software performance and behavior.

The obs-opentelemetry extension is available in kitex-contrib and lets Kitex integrate OpenTelemetry with a simple setup.

## Feature

#### Provider

- [x] Out-of-the-box default opentelemetry provider
- [x] Support setting via environment variables

### Instrumentation

#### Tracing

- [x] Support server and client kitex rpc tracing
- [x] Support automatic transparent transmission of peer service through meta info

#### Metrics

- [x] Support kitex rpc metrics [R.E.D]
- [x] Support service topology map metrics [Service Topology Map]
- [x] Support go runtime metrics

#### Logging

- [x] Extend kitex logger based on logrus and zap
- [x] Automatically associate logs with the active trace

## Configuration via environment variables

- [Exporter](https://opentelemetry.io/docs/reference/specification/protocol/exporter/)
- [SDK](https://opentelemetry.io/docs/reference/specification/sdk-environment-variables/#general-sdk-configuration)

## Server usage

```go
import (
...
@@ -60,6 +72,7 @@ func main() {
```

## Client usage

```go
import (
...
@@ -94,6 +107,7 @@ func main(){
```

## Tracing associated Logs

#### Set the logger implementation

```go
import (
kitexlogrus "github.com/kitex-contrib/obs-opentelemetry/logging/logrus"
@@ -122,7 +136,6 @@ func (s *EchoImpl) Echo(ctx context.Context, req *api.Request) (resp *api.Respon
{"level":"debug","msg":"echo called: my request","span_id":"056e0cf9a8b2cec3","time":"2022-03-09T02:47:28+08:00","trace_flags":"01","trace_id":"33bdd3c81c9eb6cbc0fbb59c57ce088b"}
```


## Example

[Executable Example](https://github.com/cloudwego/kitex-examples/tree/main/opentelemetry)
@@ -135,87 +148,97 @@ func (s *EchoImpl) Echo(ctx context.Context, req *api.Request) (resp *api.Respon

Below is a table of RPC server metric instruments.

| Name | Instrument | Unit | Unit (UCUM) | Description | Status | Streaming |
|-----------------------|------------|--------------|-------------|----------------------------------|-------------|--------------------------------------------------------------------------------------------------------------------------|
| `rpc.server.duration` | Histogram | milliseconds | `ms` | measures duration of inbound RPC | Recommended | N/A. While streaming RPCs may record this metric as start-of-batch to end-of-batch, it's hard to interpret in practice. |

#### Kitex Client

Below is a table of RPC client metric instruments. These apply to traditional
RPC usage, not streaming RPCs.


| Name | Instrument | Unit | Unit (UCUM) | Description | Status | Streaming |
|-----------------------|------------|--------------|-------------|-----------------------------------|-------------|--------------------------------------------------------------------------------------------------------------------------|
| `rpc.client.duration` | Histogram | milliseconds | `ms` | measures duration of outbound RPC | Recommended | N/A. While streaming RPCs may record this metric as start-of-batch to end-of-batch, it's hard to interpret in practice. |

### R.E.D

The RED Method defines the three key metrics you should measure for every microservice in your architecture: rate,
errors, and duration. All three can be calculated from `rpc.server.duration`.

#### Rate

The number of requests per second that your services are serving.

e.g., QPS

```
sum(rate(rpc_server_duration_count{}[5m])) by (service_name, rpc_method)
```
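
To make the query above more concrete, here is a simplified sketch of what a per-second rate over a window means for a cumulative counter such as `rpc_server_duration_count`. The `sample` type and `ratePerSecond` function are illustrative names, and the calculation deliberately ignores the counter-reset handling and boundary extrapolation that PromQL's `rate()` performs.

```go
package main

import "fmt"

// sample is one scrape of a cumulative counter.
type sample struct {
	unixSeconds float64 // scrape time in seconds
	value       float64 // cumulative request count at that time
}

// ratePerSecond returns the average per-second increase between the first
// and last sample. It is a simplification of PromQL's rate(): counter resets
// are not detected and the result is not extrapolated to the window edges.
func ratePerSecond(samples []sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	first, last := samples[0], samples[len(samples)-1]
	elapsed := last.unixSeconds - first.unixSeconds
	if elapsed <= 0 {
		return 0
	}
	return (last.value - first.value) / elapsed
}

func main() {
	// 600 additional requests observed over a 300 second (5m) window.
	window := []sample{{0, 1000}, {150, 1300}, {300, 1600}}
	fmt.Println(ratePerSecond(window))
}
```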

#### Errors

The number of failed requests per second.

e.g., error ratio

```
sum(rate(rpc_server_duration_count{status_code="Error"}[5m])) by (service_name, rpc_method) / sum(rate(rpc_server_duration_count{}[5m])) by (service_name, rpc_method)
```
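
The division in the query above is evaluated per `(service_name, rpc_method)` group. The sketch below mirrors that grouping in Go under simplifying assumptions: the `observation` type and its field names are hypothetical, and the per-series rates are taken as already computed.

```go
package main

import "fmt"

// groupKey mirrors the "by (service_name, rpc_method)" grouping.
type groupKey struct {
	serviceName string
	rpcMethod   string
}

// observation is a hypothetical per-series request rate with its status label.
type observation struct {
	key        groupKey
	statusCode string  // for example "Error"
	ratePerSec float64 // per-second request rate of this series
}

// errorRatios divides the summed error rate by the summed total rate for
// each group. As with PromQL vector matching, groups that have no
// error series do not appear in the result.
func errorRatios(obs []observation) map[groupKey]float64 {
	total := map[groupKey]float64{}
	failed := map[groupKey]float64{}
	for _, o := range obs {
		total[o.key] += o.ratePerSec
		if o.statusCode == "Error" {
			failed[o.key] += o.ratePerSec
		}
	}
	ratios := make(map[groupKey]float64, len(failed))
	for k, f := range failed {
		if total[k] > 0 {
			ratios[k] = f / total[k]
		}
	}
	return ratios
}

func main() {
	echo := groupKey{"echo-server", "Echo"}
	obs := []observation{
		{echo, "Ok", 3},
		{echo, "Error", 1},
	}
	fmt.Println(errorRatios(obs)[echo])
}
```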

#### Duration

The distribution of the amount of time each request takes.

e.g., P99 latency

```
histogram_quantile(0.99, sum(rate(rpc_server_duration_bucket{}[5m])) by (le, service_name, rpc_method))
```

### Service Topology Map

`rpc.server.duration` records both the peer service and the current service as dimensions. Grouping by these two
dimensions yields the service topology map:

```
sum(rate(rpc_server_duration_count{}[5m])) by (service_name, peer_service)
```
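
As a sketch of how the result of the query above could be turned into graph edges: each `(service_name, peer_service)` pair becomes one edge weighted by its request rate. The `series` and `edge` types are hypothetical, and treating `peer_service` as the caller of `service_name` is an assumption based on the metric being recorded on the server side.

```go
package main

import "fmt"

// series is a hypothetical row of the aggregated query result.
type series struct {
	serviceName string  // service that recorded rpc.server.duration
	peerService string  // peer of that service (assumed here to be the caller)
	ratePerSec  float64 // per-second request rate
}

// edge is a directed call relationship in the topology map.
type edge struct {
	from string
	to   string
}

// topology sums request rates per caller-to-callee edge.
func topology(rows []series) map[edge]float64 {
	edges := make(map[edge]float64)
	for _, r := range rows {
		edges[edge{from: r.peerService, to: r.serviceName}] += r.ratePerSec
	}
	return edges
}

func main() {
	rows := []series{
		{"echo-server", "echo-client", 2},
		{"echo-server", "echo-client", 3},
		{"echo-server", "gateway", 1},
	}
	for e, rate := range topology(rows) {
		fmt.Printf("%s -> %s: %.1f req/s\n", e.from, e.to, rate)
	}
}
```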

### Runtime Metrics

| Name | Instrument | Unit | Unit (UCUM) | Description |
|----------------------------------------|------------|------------|--------------|-------------------------------------------------------------------------------|
| `process.runtime.go.cgo.calls` | Sum | - | - | Number of cgo calls made by the current process. |
| `process.runtime.go.gc.count` | Sum | - | - | Number of completed garbage collection cycles. |
| `process.runtime.go.gc.pause_ns` | Histogram | nanosecond | `ns` | Amount of nanoseconds in GC stop-the-world pauses. |
| `process.runtime.go.gc.pause_total_ns` | Histogram | nanosecond | `ns` | Cumulative nanoseconds in GC stop-the-world pauses since the program started. |
| `process.runtime.go.goroutines` | Gauge | - | - | Number of goroutines that currently exist. |
| `process.runtime.go.lookups` | Sum | - | - | Number of pointer lookups performed by the runtime. |
| `process.runtime.go.mem.heap_alloc` | Gauge | bytes | `bytes` | Bytes of allocated heap objects. |
| `process.runtime.go.mem.heap_idle` | Gauge | bytes | `bytes` | Bytes in idle (unused) spans. |
| `process.runtime.go.mem.heap_inuse` | Gauge | bytes | `bytes` | Bytes in in-use spans. |
| `process.runtime.go.mem.heap_objects` | Gauge | - | - | Number of allocated heap objects. |
| `process.runtime.go.mem.live_objects` | Gauge | - | - | Number of live objects is the number of cumulative Mallocs - Frees. |
| `process.runtime.go.mem.heap_released` | Gauge | bytes | `bytes` | Bytes of idle spans whose physical memory has been returned to the OS. |
| `process.runtime.go.mem.heap_sys` | Gauge | bytes | `bytes` | Bytes of heap memory obtained from the OS. |
| `runtime.uptime` | Sum | ms | `ms` | Milliseconds since application was initialized. |

## Compatibility

The OpenTelemetry SDK used by this extension is fully compatible with opentelemetry-go 1.x.
See the [compatibility notes](https://github.com/open-telemetry/opentelemetry-go#compatibility).

Maintained by: [CoderPoet](https://github.com/CoderPoet)


## Dependencies
| **Library/Framework** | **Versions** | **Notes** |
| --- | --- | --- |
| go.opentelemetry.io/otel | v1.7.0 | |
| go.opentelemetry.io/otel/trace | v1.7.0 | |
| go.opentelemetry.io/otel/metric | v0.30.0 | |
| go.opentelemetry.io/otel/semconv | v1.7.0 | |
| go.opentelemetry.io/contrib/instrumentation/runtime | v0.30.0 | |
| kitex | v0.3.1 | |

| **Library/Framework** | **Versions** | **Notes** |
|-----------------------------------------------------|--------------|-----------|
| go.opentelemetry.io/otel | v1.19.0 | |
| go.opentelemetry.io/otel/trace | v1.19.0 | |
| go.opentelemetry.io/otel/metric | v1.19.0 | |
| go.opentelemetry.io/contrib/instrumentation/runtime | v0.45.0 | |
| kitex | v0.7.3 | |

