Application hangs on startup after upgrading to Spring Boot 3.2.1 with spring-data-redis and micrometer-prometheus dependencies #32996
Comments
This appears to be a duplicate of spring-projects/spring-data-redis#2814, which the Micrometer team is trying to get to the bottom of. @Zernov-A can you take a thread dump when the application hangs and confirm that it looks similar to the one in that issue?
Thank you. It's really very similar.
Thanks @Zernov-A. The thread dumps are very similar, with the same libraries in the call stack. I'll close this issue in favor of the Spring Data Redis issue that is already being investigated. Please follow that issue for updates.
@scottfrederick My team is facing a similar issue in production. It seems that both this issue and spring-projects/spring-data-redis#2814 have been closed and labelled as for external-project. May I know if there's any team working on this bug? If so, can you point me to that team please? Thank you
@jhengy, please capture a thread dump from the hung application and we should then be able to point you in the right direction.
Hi @wilkinsona, thanks for the prompt reply. I've created a simple reproducer and also captured the thread dump. Note that in our case, unlike the above, we did not enable keyspace events for Redis on startup; what triggered this issue happens to be a bean definition for
Thanks, @jhengy. As you've seen, you can work around this issue with things like We'll transfer this to the Framework team so that they can investigate further. Tagging @mp911de as well as Lettuce is heavily involved here. I think what's happening is that creating the Redis message listener container on the
Given that we revised the core container towards lenient locking for 6.2 already (#23501), does this scenario work against 6.2.0-M3? I'm afraid there is nothing we can do in 6.1.x as a quick measure. I'd rather recommend working around this by not waiting for the result of any asynchronous steps in the bean initialization method itself. I recommend doing so wherever possible even when running against 6.2, actually.
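To make the recommendation above concrete, here is a minimal plain-Java sketch of the pattern, not Spring's actual code. It assumes a single `ReentrantLock` (the hypothetical `SINGLETON_LOCK`) as a stand-in for the container's singleton lock, and a hypothetical `asyncStepNeedingLock()` that plays the part of a callback on a driver thread needing that lock. The blocking variant waits for the async step while still holding the lock, so the step can never finish; the non-blocking variant only starts the step and lets it complete after the lock is released. The timeouts exist only so the sketch terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

public class InitLockSketch {

    // Hypothetical stand-in for the container's singleton lock (assumption for illustration).
    private static final ReentrantLock SINGLETON_LOCK = new ReentrantLock();

    // Simulates an asynchronous callback on another thread that needs the lock,
    // for example to create a bean on demand.
    private static CompletableFuture<String> asyncStepNeedingLock() {
        return CompletableFuture.supplyAsync(() -> {
            SINGLETON_LOCK.lock();
            try {
                return "created";
            } finally {
                SINGLETON_LOCK.unlock();
            }
        });
    }

    // Init method that waits for the async step while holding the lock.
    // Returns true if the step finished within the timeout.
    static boolean blockingInit() {
        SINGLETON_LOCK.lock();
        try {
            asyncStepNeedingLock().get(200, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // without the timeout, this is where startup would hang
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            SINGLETON_LOCK.unlock();
        }
    }

    // Init method that starts the async step but does not wait for it under the lock.
    static boolean nonBlockingInit() {
        CompletableFuture<String> pending;
        SINGLETON_LOCK.lock();
        try {
            pending = asyncStepNeedingLock(); // start only, do not wait here
        } finally {
            SINGLETON_LOCK.unlock();
        }
        try {
            // Safe to wait now: the lock is free, so the async step can proceed.
            return "created".equals(pending.get(2, TimeUnit.SECONDS));
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking init completed: " + blockingInit());
        System.out.println("non-blocking init completed: " + nonBlockingInit());
    }
}
```

In this toy model the blocking variant reports that the step did not complete, while the non-blocking variant does complete. Whether a real application hits the hang depends on timing and on which beans already exist, as noted elsewhere in this thread.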
I'm not sure. Upgrading the sample to 6.2.0-M3 results in the following failure:
@snicoll may be able to give us a feel for it with his branch of Spring Boot that contains the necessary changes for Framework 6.2 compatibility. I say "a feel for it", as whether or not the problem occurs is quite delicate. For example, it does not occur in the supplied sample when the seemingly unrelated
This would require a change in Spring Data Redis where creating the instance of
Alternatively, the arrangement may also enforce all of the required singleton beans to be initialized beforehand. Accessing an existing singleton bean does not require the singleton lock, that's just necessary for creating it on demand.
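The point about existing singletons can be illustrated with a toy registry. This is a sketch under stated assumptions, not the Framework's implementation: `ToyRegistry`, `getOrCreate`, and `asyncLookupSucceeds` are hypothetical names, and the only behaviour modelled is that a lookup of an already-created object skips the creation lock, while an on-demand creation needs it.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

public class EagerSingletonSketch {

    // Toy registry for illustration only; it is NOT how Spring is implemented.
    static final class ToyRegistry {
        private final Map<String, Object> singletons = new ConcurrentHashMap<>();
        private final ReentrantLock creationLock = new ReentrantLock();

        Object getOrCreate(String name, Supplier<Object> factory) {
            Object existing = singletons.get(name);
            if (existing != null) {
                return existing; // existing singleton: no lock needed
            }
            creationLock.lock(); // on-demand creation: lock required
            try {
                return singletons.computeIfAbsent(name, key -> factory.get());
            } finally {
                creationLock.unlock();
            }
        }
    }

    // Holds the creation lock (as if another bean were being initialized) and
    // checks whether a lookup from a different thread can finish meanwhile.
    static boolean asyncLookupSucceeds(boolean createEagerly) {
        ToyRegistry registry = new ToyRegistry();
        if (createEagerly) {
            registry.getOrCreate("tracer", Object::new); // initialized beforehand
        }
        registry.creationLock.lock();
        try {
            CompletableFuture<Object> lookup =
                    CompletableFuture.supplyAsync(() -> registry.getOrCreate("tracer", Object::new));
            lookup.get(500, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the lookup needed the lock and could not get it
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            registry.creationLock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("eager: " + asyncLookupSucceeds(true));
        System.out.println("lazy:  " + asyncLookupSucceeds(false));
    }
}
```

With the object created up front, the lookup from the other thread returns even though the lock is held. With lazy creation, it cannot proceed until the lock is released.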
That would require changes in Micrometer (Tracing). Boot currently defers the creation of the /cc @jonatan-ivanov
I can confirm the sample fails as Andy described with Spring Boot The problem is in
@jhoeller with
Thanks, @snicoll! From a Spring Framework perspective, we seem to have a strategic solution for these kinds of scenarios as of 6.2 then. Against 6.1.x, it will have to be one of the workarounds suggested above, either in Spring Data or in Micrometer. As mentioned, it is generally advisable to relax such asynchronous init method arrangements and not rely on the container's locking behavior there. Any such change made in Spring Data or Micrometer will make sense against 6.2 as well, even if not strictly necessary there.
@jonatan-ivanov @mp911de, can you please take a look at this? With spring-projects/spring-data-redis#2814 having previously been closed in favor of a Boot issue, we're in danger of going round in circles without identifying a fix.
Spring Data Redis isn't involved much. Tracing is configured on the Lettuce driver through Any asynchronous database activity can suffer from that arrangement; database activity during the container bootstrap happens on a regular basis and the number of drivers utilizing Netty or any other async approach is growing. From that perspective, I would welcome anything that removes the need for asynchronously fetching beans from a container that is being started.
Until a fix in Framework is available, I think we may have to make changes in several areas to get things working fully.
Is this suggestion from spring-projects/spring-data-redis#2814 not viable? Similarly, avoiding the connection retrieval in the constructor of
From a Framework perspective, I think this issue can be closed. There's a fix coming in 6.2 and nothing, AIUI, that can be done in 6.1.x. We've got spring-projects/spring-boot#40972 that's tracking the problem on the Boot side (although it may need to move to Micrometer (Tracing) as Boot's code is just trying to work around the
Thanks for the summary, Andy.
@jhengy, judging by your reproducer, you may be able to work around the problem by defining an eager bean:

    @Bean
    TracerSpanContextSupplier spanContextSupplier(Tracer tracer) {
        return new TracerSpanContextSupplier(tracer);
    }

    // Adapts the Micrometer Tracer to the Prometheus SpanContextSupplier contract.
    static class TracerSpanContextSupplier implements SpanContextSupplier {

        private final Tracer tracer;

        TracerSpanContextSupplier(Tracer tracer) {
            this.tracer = tracer;
        }

        @Override
        public String getTraceId() {
            Span currentSpan = currentSpan();
            return (currentSpan != null) ? currentSpan.context().traceId() : null;
        }

        @Override
        public String getSpanId() {
            Span currentSpan = currentSpan();
            return (currentSpan != null) ? currentSpan.context().spanId() : null;
        }

        @Override
        public boolean isSampled() {
            Span currentSpan = currentSpan();
            if (currentSpan == null) {
                return false;
            }
            // sampled() may return null when the decision is unknown; treat that as not sampled.
            Boolean sampled = currentSpan.context().sampled();
            return sampled != null && sampled;
        }

        private Span currentSpan() {
            return this.tracer.currentSpan();
        }
    }

It starts up successfully with this bean in place.
For Spring Boot 3.3, since the This seems to work for me:

    @Bean
    SpanContext spanContext(Tracer tracer) {
        return new TracingSpanContext(tracer);
    }

    static class TracingSpanContext implements SpanContext {

        private final Tracer tracer;

        TracingSpanContext(Tracer tracer) {
            this.tracer = tracer;
        }

        @Override
        public String getCurrentTraceId() {
            Span currentSpan = currentSpan();
            return (currentSpan != null) ? currentSpan.context().traceId() : null;
        }

        @Override
        public String getCurrentSpanId() {
            Span currentSpan = currentSpan();
            return (currentSpan != null) ? currentSpan.context().spanId() : null;
        }

        @Override
        public boolean isCurrentSpanSampled() {
            Span currentSpan = currentSpan();
            if (currentSpan == null) {
                return false;
            }
            Boolean sampled = currentSpan.context().sampled();
            return sampled != null && sampled;
        }

        @Override
        public void markCurrentSpanAsExemplar() {
        }

        private Span currentSpan() {
            return this.tracer.currentSpan();
        }
    }
Would it be possible to allow disabling exemplar support entirely @jonatan-ivanov? I see
In Micrometer, if you don't provide You can also create your own You can also create your own It's a hack but Micrometer and Boot also support some of the Prometheus Client Properties, if you set If you want to exclude dependencies, you need to exclude Micrometer Tracing otherwise auto-configuration will try to create a
After upgrading spring-boot to 3.2.1, I encountered an issue where the application hangs on startup. This occurs specifically when using spring-data-redis with keyspace events enabled on startup, in conjunction with the micrometer-prometheus dependencies.
Steps to Reproduce
I have prepared minimal code to reproduce this issue. Below is a simplified version of the build.gradle and DemoApp.java files:
build.gradle
DemoApp.java
Observations
Additional Information
In a typical scenario, the project includes entity and repository classes, along with other dependencies. However, the provided code is a minimal representation.
Removing enableKeyspaceEvents = RedisKeyValueAdapter.EnableKeyspaceEvents.ON_STARTUP, or changing its value to ON_DEMAND or OFF, allows the application to start correctly.
Removing any of the dependencies results in the application starting correctly.
Disabling Prometheus metrics export by setting management.prometheus.metrics.export.enabled=false in application.properties also allows the application to start correctly.