Commit 6a9685a: Duplicated context concept guide

cescoffier authored and danielsoro committed Jul 31, 2023
1 parent 101158d commit 6a9685a

Showing 4 changed files with 238 additions and 1 deletion.
235 changes: 235 additions & 0 deletions docs/src/main/asciidoc/duplicated-context.adoc
@@ -0,0 +1,235 @@
////
This document is maintained in the main Quarkus repository
and pull requests should be submitted there:
https://github.com/quarkusio/quarkus/tree/main/docs/src/main/asciidoc
////
[id="duplicated-context"]
= Duplicated context, context locals, asynchronous processing and propagation
include::_attributes.adoc[]
:diataxis-type: concept
:categories: core, architecture

When using a traditional, blocking, and synchronous framework, processing of each request is performed in a dedicated thread.
So, the same thread is used for the entire processing.
You know that this thread will not be used to execute anything else until the completion of the processing.
When you need to propagate data along the processing, such as the security principal or a trace id, you can use `ThreadLocals`.
The propagated data is cleared once the processing is complete.
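
For example, a minimal sketch of this pattern (the names are illustrative) looks like this:

[source, java]
----
public class TraceIdHolder {

    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    public static void handleRequest(String traceId) {
        TRACE_ID.set(traceId);      // store data for the current thread
        try {
            businessLogic();        // same thread, so the value is visible here
        } finally {
            TRACE_ID.remove();      // always clear it once the processing completes
        }
    }

    static void businessLogic() {
        String traceId = TRACE_ID.get();  // retrieved anywhere on the same thread
        // ... use the trace id ...
    }
}
----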

When using a reactive and asynchronous execution model, you cannot use the same mechanism.
To reduce the number of threads, and so the resource usage (and also to increase the concurrency of the application), the same thread is used to handle multiple concurrent processing units.
Thus, you cannot use `ThreadLocals`, as the values would leak between the various concurrent processing units.

_Duplicated Contexts_ are a construct that provides the same kind of propagation but for asynchronous processing.
It can also be used with synchronous code.

This document explains duplicated contexts, how to retrieve one, use it, and how it's propagated along the (asynchronous) processing.

== The reactive model

NOTE: This section is not a full explanation of the reactive model. Refer to xref:quarkus-reactive-architecture.adoc[the Quarkus Reactive Architecture] for more details.

Under the hood, Quarkus uses a reactive engine.
This engine provides the efficiency and concurrency to cope with modern, containerized, and cloud-native applications.

For example, when you use RESTEasy Reactive or gRPC, Quarkus can invoke your business logic on I/O threads.
These threads are named https://en.wikipedia.org/wiki/Event_loop[event loops] and implement a https://en.wikipedia.org/wiki/Reactor_pattern[multi-reactor pattern].

When using the imperative model, Quarkus associates a worker thread with each processing unit (such as an HTTP request or a gRPC invocation).
That thread is dedicated to this specific processing until the processing completes.
Thus, you can use _Thread Locals_ to propagate data along the processing, and no other processing unit will use that thread until the completion of the current one.

With the reactive model, the code runs on event loop threads.
These event loops execute multiple concurrent processing units.
For example, the same event loop can handle multiple concurrent HTTP requests.
The following picture illustrates this reactive execution model:

image::reactive-continuation.png[alt=Continuation in the reactive execution model, width=80%, align=center]

You must **NEVER** block these event loops.
If you do, the whole model collapses.
Thus, when the processing of an HTTP request needs to execute an I/O operation (such as calling an external service), it:

1. schedules the operation,
2. passes a continuation (the code to call when the I/O completes),
3. releases the thread.

That thread can then handle another concurrent request.
When the scheduled operation completes, it executes the passed continuation **on the same event loop**.
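
The following sketch illustrates this pattern with the plain Vert.x API (assuming a `WebClient` named `client` configured elsewhere); it is a simplified illustration, not Quarkus-specific code:

[source, java]
----
void handle(HttpServerRequest request) {
    client.get("/external-service")              // 1. schedule the I/O operation
          .send()                                // returns immediately with a Future
          .onSuccess(response ->                 // 2. the continuation runs when the I/O completes,
                                                 //    on the same event loop
              request.response().end(response.bodyAsString()));
    // 3. the method returns right away, releasing the event loop for other requests
}
----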

That model is particularly efficient (it uses a low number of threads) and performant (it avoids context switches).
However, it requires a different development model, and you cannot use _Thread Locals_, as the concurrent requests would see the same value.
Indeed, they are all processed by the same thread: the event loop.

The xref:context-propagation.adoc[MicroProfile Context Propagation] specification addresses this issue.
It saves and restores the values stored in thread locals whenever we switch to another processing unit.
However, that model is expensive.
Context locals (stored in a _duplicated context_) are another way to achieve this and require less machinery.

== Context and duplicated context

In Quarkus, when you execute reactive code, you run in a _Context_, representing the execution thread (event loop or worker thread).

[source, java]
----
@GET
@NonBlocking // Force the usage of the event loop
@Path("/hello1")
public String hello1() {
    Context context = Vertx.currentContext();
    return "Hello 1, you are running on context: %s and on thread %s".formatted(context, Thread.currentThread()); <1>
}

@GET
@Path("/hello2")
public String hello2() { // Called on a worker thread (because it has a blocking signature)
    Context context = Vertx.currentContext();
    return "Hello 2, you are running on context: %s and on thread %s".formatted(context, Thread.currentThread()); <2>
}
----
<1> Produces: `Hello 1, you are running on context: io.vertx.core.impl.DuplicatedContext@5dc42d4f and on thread Thread[vert.x-eventloop-thread-1,5,main]` - so invoked on an event loop.
<2> Produces: `Hello 2, you are running on context: io.vertx.core.impl.DuplicatedContext@41781ccb and on thread Thread[executor-thread-1,5,main]` - so invoked on a worker thread.

With this _Context_ object, you can schedule operations in the same context.
The context is handy for executing the continuation on the same event loop (as contexts are attached to event loops), as described in the picture above.
For example, when you need to call something asynchronous, you capture the current context, and when the response arrives, it invokes the continuation in that context:

[source, java]
----
public Uni<String> invoke() {
    Context context = Vertx.currentContext();
    return invokeRemoteService()
            .emitOn(runnable -> context.runOnContext(runnable)); <1>
}
----
<1> Emit the result in the same context.

IMPORTANT: Most Quarkus clients do this automatically, invoking the continuation in the proper context.

There are two levels of contexts:

- the root contexts, each representing an event loop; they cannot be used to propagate data, as it would leak between concurrent processing units
- the duplicated contexts, which are based on a root context but are NOT shared; each one represents a single processing unit

Thus, a duplicated context is associated with each processing unit.
A duplicated context is still backed by a root context, and operations scheduled through a duplicated context run on that root context (the same event loop).
But, unlike the root context, duplicated contexts are not shared between processing units.
Yet, all continuations of a given processing unit use the same duplicated context.
So, in the previous code snippet, the continuation is not only called on the same event loop but on the same duplicated context (supposing that the captured context is a duplicated context, more on that later).

image::duplicated-context-continuation.png[alt=Continuation with duplicated contexts, width=90%, align=center]
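
For example (a sketch reusing the `invokeRemoteService()` method from the previous snippet and assuming the client emits its result on the caller's context, as most Quarkus clients do), the continuation observes the very same duplicated context:

[source, java]
----
public Uni<String> invoke() {
    Context before = Vertx.currentContext();      // duplicated context of this processing unit
    return invokeRemoteService()
            .map(res -> {
                Context after = Vertx.currentContext();
                // The continuation runs on the same duplicated context
                return "%s (same context: %s)".formatted(res, before.equals(after));
            });
}
----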

== Context local data

When executed in a duplicated context, the code can store data without sharing it with the other concurrent processing units.
So, you can store, retrieve, and remove local data.
The continuation, invoked on the same duplicated context, has access to that data:

[source, java]
----
AtomicInteger counter = new AtomicInteger();

public Uni<String> invoke() {
    Context context = Vertx.currentContext();
    ContextLocals.put("message", "hello");
    ContextLocals.put("id", counter.incrementAndGet());
    return invokeRemoteService()
            .emitOn(runnable -> context.runOnContext(runnable))
            .map(res -> {
                // Can still access the context local data
                // `get(...)` returns an Optional
                String msg = ContextLocals.<String>get("message").orElseThrow();
                Integer id = ContextLocals.<Integer>get("id").orElseThrow();
                return "%s - %s - %d".formatted(res, msg, id);
            });
}
----

NOTE: The previous code snippet uses `io.smallrye.common.vertx.ContextLocals`, which eases access to the local data.
You can also access them using `Vertx.currentContext().getLocal("key")`.
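
For example, the equivalent operations using the plain Vert.x `Context` API look like this:

[source, java]
----
Context context = Vertx.currentContext();      // must be a duplicated context
context.putLocal("message", "hello");          // store
String message = context.getLocal("message");  // retrieve (null when absent)
context.removeLocal("message");                // remove
----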

Context local data provides an efficient way to propagate objects during a reactive execution.
Tracing metadata, metrics, and sessions can be stored and retrieved safely.

=== Context locals restrictions

However, this feature must only be used with duplicated contexts.
As mentioned above, this is transparent for the code: a duplicated context is a context, so both expose the same API.

In Quarkus, access to the local data is restricted to duplicated contexts.
If you try to access local data from a root context, it throws an `UnsupportedOperationException`.
This restriction prevents data from being accidentally shared among different processing units.

[source, text]
----
java.lang.UnsupportedOperationException: Access to Context.putLocal(), Context.getLocal() and Context.removeLocal() are forbidden from a 'root' context as it can leak data between unrelated processing. Make sure the method runs on a 'duplicated' (local) Context.
----
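
If your code may run on either kind of context, a defensive check (using the helper class described later in this guide) avoids the exception; a minimal sketch:

[source, java]
----
Context context = Vertx.currentContext();
if (context != null && VertxContext.isDuplicatedContext(context)) {
    // Safe: the data stays local to this processing unit
    ContextLocals.put("message", "hello");
} else {
    // Root context (or no Vert.x context at all): do not store local data here
}
----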

=== Safe context

You can mark a context _safe_.
This flag is meant for other extensions to integrate with: it helps identify contexts that are isolated and guaranteed to be accessed by a single thread.
Hibernate Reactive uses this feature to check whether the current context is safe for storing the currently opened session; this protects users from mistakenly interleaving multiple reactive operations that could unintentionally share the same session.

Vert.x Web creates a new duplicated context for each HTTP request; Quarkus RESTEasy Reactive marks such contexts as safe.
Other extensions should follow a similar pattern when they set up a new context that guarantees sequential use and non-concurrent access, scoped to the current reactive chain; this is a convenience so that users do not have to pass a "context" object along explicitly.

In other cases, it might be helpful to explicitly mark the current context as not safe; for example, when an existing context needs to be shared across multiple workers to process some operations in parallel.
By marking and un-marking the context appropriately, the same context can have spans in which it is safe, followed by spans in which it is not.

By using the `io.quarkus.vertx.core.runtime.context.VertxContextSafetyToggle` class, the current context can be explicitly marked as `safe` or explicitly marked as `unsafe`; there is a third state, `unmarked`, which is the default for any new context.
By default, any unmarked context is considered `unsafe`, unless the system property `io.quarkus.vertx.core.runtime.context.VertxContextSafetyToggle.UNRESTRICTED_BY_DEFAULT` is set to `true`.
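
As a sketch, an extension toggling the flag might do the following (`VertxContextSafetyToggle` is an internal class; the `setCurrentContextSafe` method name is assumed from the current API and may change between Quarkus versions):

[source, java]
----
// Mark the current duplicated context as safe before running code that relies on it,
// then mark it as unsafe before sharing it across multiple workers.
VertxContextSafetyToggle.setCurrentContextSafe(true);
// ... operations that require exclusive, sequential access to the context ...
VertxContextSafetyToggle.setCurrentContextSafe(false);
// ... operations that may use the context from several workers in parallel ...
----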

== Extensions supporting duplicated contexts

In general, Quarkus invokes reactive code on duplicated contexts, so it can safely access the local data.
This is the case with:

- RESTEasy Reactive
- gRPC
- Reactive Routes
- `@ConsumeEvent`
- Reactive REST Client
- Reactive Messaging (Kafka, AMQP)
- Funqy
- Quarkus scheduler and Quartz
- Redis Client (for the pub/sub commands)
- GraphQL

== Distinguish between root and duplicated contexts

You can distinguish between a root and a duplicated context using the following:

[source, java]
----
boolean isDuplicated = VertxContext.isDuplicatedContext(context);
----

This code uses the `io.smallrye.common.vertx.VertxContext` helper class.
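
If you only hold a root context, the same helper class can give you a duplicated context to run code in; a sketch (the `getOrCreateDuplicatedContext` method is assumed from the SmallRye Common API, so check the `VertxContext` javadoc of your version):

[source, java]
----
Context context = Vertx.currentContext();
if (!VertxContext.isDuplicatedContext(context)) {
    // Assumed API: derive a duplicated context from the root context
    context = VertxContext.getOrCreateDuplicatedContext(context);
}
context.runOnContext(x -> {
    // Now running on a duplicated context: context locals are safe to use
    ContextLocals.put("message", "hello");
});
----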

=== Duplicated contexts and mapped diagnostic context (MDC)

When using loggers, the MDC (contextual data added to the log messages) is stored in the duplicated context when available.
Check the xref:logging.adoc#mdc-propagation[logging reference guide] for more details.
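
For example (a sketch using the JBoss Logging MDC API; `invokeRemoteService()` is assumed as in earlier snippets):

[source, java]
----
public Uni<String> process(String requestId) {
    MDC.put("requestId", requestId);                       // stored in the current duplicated context
    return invokeRemoteService()
            .invoke(res -> Log.info("received response")); // the log record still carries the requestId value
}
----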

=== CDI request scope

In Quarkus, the CDI request scope is stored in a duplicated context, meaning it gets automatically propagated alongside the reactive processing of a request.
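
For instance, a `@RequestScoped` bean populated before an asynchronous step remains accessible in the continuation; a sketch (the bean and service names are illustrative):

[source, java]
----
@RequestScoped
public class RequestData {
    private String user;
    public String getUser() { return user; }
    public void setUser(String user) { this.user = user; }
}

@Inject
RequestData requestData;

public Uni<String> process() {
    requestData.setUser("alice");                               // set early in the request
    return invokeRemoteService()
            .map(res -> res + " for " + requestData.getUser()); // still the same request-scoped instance
}
----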

=== Reactive Messaging

The Kafka and AMQP connectors create a duplicated context for each message, implementing a _message context_.
This message context is used for the complete message processing and thus can be used to propagate data.

See the https://smallrye.io/smallrye-reactive-messaging/latest/concepts/message-context/[Message Context] documentation for further information.
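
For example, data stored in the message context during one processing stage remains visible in the later stages of the same message; a sketch (the channel names are illustrative):

[source, java]
----
@Incoming("orders")
@Outgoing("processed-orders")
Uni<String> process(String order) {
    ContextLocals.put("order", order);   // stored in this message's context
    return invokeRemoteService()
            .map(res -> "%s - %s".formatted(res, ContextLocals.<String>get("order").orElseThrow()));
}
----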

=== OpenTelemetry

The OpenTelemetry extension stores the traces in the duplicated context, ensuring propagation even when using reactive and asynchronous code.

== References

- xref:quarkus-reactive-architecture.adoc[Quarkus Reactive Architecture]
- https://vertx.io/docs/vertx-core/java/#_reactor_and_multi_reactor[Vert.x Reactor and Multi-Reactor documentation]
- xref:logging.adoc[Quarkus logging]
2 changed files (binary images) could not be displayed.
4 changes: 3 additions & 1 deletion docs/src/main/asciidoc/logging.adoc
@@ -3,12 +3,13 @@ This guide is maintained in the main Quarkus repository
and pull requests should be submitted there:
https://github.com/quarkusio/quarkus/tree/main/docs/src/main/asciidoc
////
[id="logging"]
= Logging configuration
include::_attributes.adoc[]
:categories: core,getting-started
:diataxis-type: reference

Read about the efficient use of logging API in Quarkus, configuring logging output according to your needs, and using logging adapters to unify the output from other logging APIs.
Read about the use of logging API in Quarkus, configuring logging output, and using logging adapters to unify the output from other logging APIs.

Quarkus uses the JBoss Log Manager logging backend for publishing application and framework logs.
Quarkus supports the JBoss Logging API as well as multiple other logging APIs, seamlessly integrated with JBoss Log Manager.
@@ -728,6 +729,7 @@ However, the APIs are very similar:
* log4j 2 - `org.apache.logging.log4j.ThreadContext.put(key, value)`
* slf4j - `org.slf4j.MDC.put(key, value)`

[[mdc-propagation]]
==== MDC propagation

Under the hood, Quarkus provides a specific implementation of the MDC provider handling the reactive context.
