feat(instrumentation): add OpenTelemetry tracing and metrics with basic configurations #5175

Merged
merged 90 commits into from
Oct 11, 2022
a5a7f42
feat(instrumentation): create basic tracer and meter with console exp…
Sep 15, 2022
9e1b2d0
style: fix overload and cli autocomplete
jina-bot Sep 15, 2022
514792a
feat(instrumentation): move the instrumentation package to the serve …
Sep 16, 2022
c3b0c37
feat(instrumentation): provide options to enable tracing and metrics …
Sep 16, 2022
2269b57
feat(instrumentation): add the correct grpc opentelmetery insturmenta…
Sep 19, 2022
14cb744
feat(serve): instrument grpc server and channel with interceptors
Sep 19, 2022
f53be22
style: fix overload and cli autocomplete
jina-bot Sep 19, 2022
a4a4621
feat(instrumentation): provide opentelemety context from the grpc cli…
Sep 20, 2022
78efb44
feat(instrumentation): check for opentelemetry environment variables …
Sep 20, 2022
7116e9f
feat(instrumentation): create InstrumentationMixin for server and cli…
Sep 20, 2022
92d3679
chore(instrumentation): use absolute module import
Sep 21, 2022
eb0ccd3
feat(instrumentation): trace http and websocket server and clients
Sep 21, 2022
38cae61
chore(instrumentation): update/add new opentelemetry arguments
Sep 21, 2022
45d1794
feat(instrumentation): globally disable tracing health check requests
Sep 21, 2022
b107f80
feat(instrumentation): add InstrumentationMixIn for Head and Worker r…
Sep 22, 2022
cd17588
feat(instrumentation): disable tracing of ServerReflection and endpoi…
Sep 26, 2022
2e44270
test(instrumentation): add basic tracing and metrics tests for HTTP G…
Sep 26, 2022
a083146
test(instrumentation): move test common code for tracing and metrics …
Sep 26, 2022
30ee9e3
feat(instrumentation): enable tracing of flow internal and start up r…
Sep 26, 2022
3998e2f
test(instrumentation): move test common code to new base class
Sep 26, 2022
30409c2
test(instrumentation): test grpc gateway opentelemety instrumentation
Sep 26, 2022
e2ee862
feat(instrumentation): add Jaeger export agent and required configura…
Sep 27, 2022
a0bfaf8
chore(instrumentation): remove print statement
Sep 27, 2022
60be044
test(instrumentation): document spans in the grpc and http gateway in…
Sep 27, 2022
9da9eaf
Merge branch 'master' into feat-instrumentation-5155
Sep 27, 2022
adb96ba
style: fix overload and cli autocomplete
jina-bot Sep 27, 2022
0af8ffb
chore: remove print statement
Sep 27, 2022
a241f62
test(instrumentation): add instrumentaiton tests for websocket gateway
Sep 27, 2022
528e38b
fix: import openetelmetry api globally and the other dependencies onl…
Sep 27, 2022
47ed0a8
fix: use class name as default name when creating Executor instrument…
Sep 27, 2022
3f436da
fix: provide argparse arguments to AlternativeGateway
Sep 27, 2022
578e882
style: fix overload and cli autocomplete
jina-bot Sep 27, 2022
aa5a34a
style: fix overload and cli autocomplete
Sep 28, 2022
87c15f5
Merge branch 'master' into feat-instrumentation-5155
Sep 28, 2022
f7b4af4
style: fix overload and cli autocomplete
jina-bot Sep 28, 2022
3a2e1de
style: fix overload and cli autocomplete
Sep 28, 2022
82dad9c
style: fix overload and cli autocomplete
jina-bot Sep 28, 2022
42d00e6
fix: revert changes for Gateway implementation
Sep 29, 2022
9ade3b6
Merge branch 'master' into feat-instrumentation-5155
Sep 29, 2022
4132396
feat(instrumentation): remove init method from InstrumentationMixin
Sep 29, 2022
4efbbd7
feat(instrumentation): create vendor neutral opentelemetry export arg…
Sep 29, 2022
8e9abcb
style: fix overload and cli autocomplete
Sep 29, 2022
8eed211
feat(instrumentation): inject tracing variables from AsyncLoopRuntime…
Sep 30, 2022
175a399
style: fix overload and cli autocomplete
jina-bot Sep 30, 2022
030b980
feat(instrumentation): configure a OTLP collector for exporting trace…
Sep 30, 2022
c686498
style: fix overload and cli autocomplete
jina-bot Sep 30, 2022
6d21a3a
feat(instrumentation): return None for aio server interceptors if tra…
Oct 4, 2022
00c6c12
test: fix handling of optional args
Oct 5, 2022
92c0e1f
Merge branch 'master' into feat-instrumentation-5155
Oct 5, 2022
6e27829
fix: remove print debug statement
Oct 5, 2022
366a20e
fix: fix gateway class loading
alaeddine-13 Oct 5, 2022
822b541
Merge branch 'feat-instrumentation-5155' of github.com:jina-ai/jina i…
alaeddine-13 Oct 5, 2022
963b82d
feat(instrumentation): fix BaseGateway telemetry dependency injection
Oct 5, 2022
6433930
fix: fix WebsocketGateway loading
alaeddine-13 Oct 5, 2022
ffadb73
fix(instrumentation): correctly handle default executor runtime_args
Oct 5, 2022
3f6eeff
test(instrumentation): add integration tests for grpc, http and webso…
Oct 5, 2022
6b35909
test(instrumentation): parameterize instrumentation tests
Oct 5, 2022
2906369
test(instrumentation): remove outdated tests replaced by parametrized…
Oct 6, 2022
f1ad7a2
fix(instrumentation): fix executor instrumentation setup
Oct 6, 2022
d7bb8d9
fix(instrumentation): force spawn process when running flows in param…
Oct 6, 2022
5e31dca
feat(instrumentation): omit opentelemetry from cli args
Oct 6, 2022
c23f30a
style: fix overload and cli autocomplete
jina-bot Oct 6, 2022
bcc39a8
test: small test refactoring
JoanFM Oct 6, 2022
c540628
Merge branch 'master' into feat-instrumentation-5155
Oct 6, 2022
2ce9c67
style: fix overload and cli autocomplete
Oct 6, 2022
adcb457
style: fix overload and cli autocomplete
jina-bot Oct 6, 2022
0ae5f99
Merge branch 'master' into feat-instrumentation-5155
Oct 6, 2022
b45de43
test: dont set multiprocessing start method to spawn
Oct 6, 2022
bbd2fb8
fix: hide opentelemetry imports
Oct 6, 2022
222cfb9
Merge branch 'master' into feat-instrumentation-5155
JoanFM Oct 6, 2022
dcf7296
fix(runtimes): shutdown instrumentation exporters during teardown
Oct 7, 2022
57be55e
test: spawn processes by default in tests
Oct 7, 2022
7266abc
Merge branch 'feat-instrumentation-5155' of github.com:jina-ai/jina i…
Oct 7, 2022
e9e78ae
fix: provide client and server interceptors only when tracing is ena…
Oct 7, 2022
3656afc
Merge branch 'master' into feat-instrumentation-5155
Oct 7, 2022
4f83c47
fix(serve): correctly handle default instrumentation runtime_args
Oct 7, 2022
a9d5b1b
chore: hide opentelemetry imports under TYPE_CHECKING
Oct 7, 2022
a706480
test: avoid using spawn
JoanFM Oct 7, 2022
ef4a232
fix: add explicit type info and hide imports
Oct 7, 2022
1c0aedd
fix(executors): handle optional runtime_args correctly
Oct 7, 2022
c292234
chore: rename otel_context to tracing_context
Oct 7, 2022
01d543b
feat: use None instead of NoOp tracer and meter implementations
Oct 10, 2022
4afc51b
fix: remove unused import
Oct 10, 2022
70146e4
feat: add default tracing span for DataRequestHandler handle invocation
Oct 10, 2022
7f20c06
test: add test case to verify exception recording in a span
Oct 10, 2022
550a975
fix: use continue_on_error instead of try-except-pass
Oct 10, 2022
b644004
Merge branch 'master' into feat-instrumentation-5155
girishc13 Oct 10, 2022
d55d86c
chore: rename method name to match returning a list
Oct 11, 2022
132a932
fix: rename span_exporter args to traces_exporter
Oct 11, 2022
bb0b003
style: fix overload and cli autocomplete
jina-bot Oct 11, 2022
32 changes: 32 additions & 0 deletions tests/integration/instrumentation/__init__.py
@@ -73,3 +73,35 @@ def get_services():
    response_json = response.json()
    services = response_json.get('data', []) or []
    return [service for service in services if service != 'jaeger-query']


class ExecutorFailureWithTracing(Executor):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.failure_counter = 0

    @requests(on='/index')
    def empty(
        self, docs: 'DocumentArray', tracing_context: Optional[Context], **kwargs
    ):
        if self.tracer:
            with self.tracer.start_span('dummy', context=tracing_context) as span:
                span.set_attribute('len_docs', len(docs))
                if not self.failure_counter:
                    self.failure_counter += 1
                    raise NotImplementedError
                else:
                    return docs
        else:
            return docs


def spans_with_error(spans):
    error_spans = []
    for span in spans:
        for tag in span['tags']:
            if 'otel.status_code' == tag.get('key', '') and 'ERROR' == tag.get(
                'value', ''
            ):
                error_spans.append(span)
    return error_spans
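
Note on the helper above: `spans_with_error` expects each span to be a dict carrying a `tags` list of key/value entries, presumably the shape returned by the Jaeger query service that the other helpers in this module talk to. A minimal sketch of the same check on hand-written sample data follows. The `find_error_spans` name and the sample span dicts are illustrative assumptions, not part of this PR.

```python
def find_error_spans(spans):
    """Return spans that carry an otel.status_code tag with the value ERROR."""
    return [
        span
        for span in spans
        if any(
            tag.get('key') == 'otel.status_code' and tag.get('value') == 'ERROR'
            for tag in span.get('tags', [])
        )
    ]


# Hand-written sample data (assumed shape): one failed span and one healthy span.
sample_spans = [
    {'operationName': 'dummy', 'tags': [{'key': 'otel.status_code', 'value': 'ERROR'}]},
    {'operationName': 'ok', 'tags': [{'key': 'span.kind', 'value': 'internal'}]},
]

error_spans = find_error_spans(sample_spans)
```

Unlike the nested loop in the diff, this variant lists each span at most once even if several of its tags match, which keeps `len(...)` assertions tied to the number of failed spans.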
38 changes: 38 additions & 0 deletions tests/integration/instrumentation/test_flow_instrumentation.py
@@ -4,11 +4,13 @@

from jina import Flow
from tests.integration.instrumentation import (
    ExecutorFailureWithTracing,
    ExecutorTestWithTracing,
    get_services,
    get_trace_ids,
    get_traces,
    partition_spans_by_kind,
    spans_with_error,
)


@@ -61,3 +63,39 @@ def test_gateway_instrumentation(

    trace_ids = get_trace_ids(client_traces)
    assert len(trace_ids) == 1


def test_executor_instrumentation(otlp_collector):
    f = Flow(
        tracing=True,
        span_exporter_host='localhost',
        span_exporter_port=4317,
    ).add(uses=ExecutorFailureWithTracing)

    with f:
        from jina import DocumentArray

        try:
            f.post(
                f'/index',
                DocumentArray.empty(2),
            )
        except:
            pass
        # give some time for the tracing and metrics exporters to finish exporting.
        # the client is slow to export the data
        time.sleep(8)

    client_type = 'GRPCClient'
    client_traces = get_traces(client_type)
    (server_spans, client_spans, internal_spans) = partition_spans_by_kind(
        client_traces
    )
    assert len(spans_with_error(server_spans)) == 0
    assert len(spans_with_error(client_spans)) == 0
    assert len(internal_spans) == 2
    # Errors reported by DataRequestHandler and request method level spans
    assert len(spans_with_error(internal_spans)) == 2

    trace_ids = get_trace_ids(client_traces)
    assert len(trace_ids) == 1
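
Note on the fixed `time.sleep(8)` above: the test waits a constant eight seconds for the exporters to flush. One possible alternative, sketched here as an assumption and not something this PR implements, is to poll the trace backend until the expected data shows up or a timeout expires. The `wait_until` helper and the `fake_get_traces` stand-in below are hypothetical names used only for illustration.

```python
import time


def wait_until(predicate, timeout=8.0, interval=0.25):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    Returns the last value produced by `predicate`.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result or time.monotonic() >= deadline:
            return result
        time.sleep(interval)


# Stand-in for a call such as get_traces('GRPCClient'): data appears on the third poll.
calls = {'count': 0}


def fake_get_traces():
    calls['count'] += 1
    return ['trace'] if calls['count'] >= 3 else []


traces = wait_until(fake_get_traces, timeout=2.0, interval=0.01)
```

Because the test asserts exact span counts, a real predicate would likely need to check for the expected number of spans and not just for a non-empty result, otherwise the assertions could run before the slower client export has finished.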