Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Request getSession() method throws IllegalStateException when Session exists #4888

Closed
cshannon opened this issue May 19, 2020 · 28 comments · Fixed by #5220
Closed

Request getSession() method throws IllegalStateException when Session exists #4888

cshannon opened this issue May 19, 2020 · 28 comments · Fixed by #5220
Assignees

Comments

@cshannon
Copy link

Jetty version: 9.4.27
Java version: Java 11
OS type/version: Fedora 31

Description

An existing issue, #4156, attempted to fix this problem but seems to only have fixed it in some cases. I am still seeing a very similar stack trace as #4156 when the server is trying to add a Session to the cache that is found to already exist. The same IllegalStateException is thrown and and a null session is returned leading to a NPE later in the chain.

The main difference is this is occurring when calling Request#getSession(true) directly from a custom Servlet filter to access a Session. I'e also seen it happen with other libraries that use filters like Spring Security. The exact cause is hard to pinpoint and reproduce because it appears to be a race condition when several requests come in at once. I have noticed it primarily on server restarts or Session expirations when a browser submits a bunch of requests back to the server with an existing id set and it has to be recreated.

Proposed Solution:

I think the best approach would be to change the cache add so that when Request#getSession() is called and an existing session is found it will be used instead of logging an exception and returning null. A new method should be added to the Session cache that can simply use putIfAbsent and return an existing Session if already there. (Note that in 9.4.20 a new Session was actually still returned which I think is wrong)

While it's somewhat hard to reproduce the issue because I believe there's a race condition, I have written a test that directly uses the SessionHandler to replicate the condition that I am seeing to validate my fix. I will be submitting a PR with my test and fix shortly.

@janbartel janbartel self-assigned this May 19, 2020
cshannon pushed a commit to cshannon/jetty.project that referenced this issue May 19, 2020
…ession

if found

Instead of logging and exception and returning null, if an existing
Session is found in the session cache when attempting to add a new
session then the existing session is used instead and returned
@janbartel
Copy link
Contributor

Please also post your log output that shows the stack traces.

@cshannon
Copy link
Author

I am unable to post the stack trace output as it's on a proprietary system so I'm not authorized to do so.

cshannon added a commit to cshannon/jetty.project that referenced this issue May 19, 2020
…ession if found

Instead of logging and exception and returning null, if an existing
Session is found in the session cache when attempting to add a new
session then the existing session is used instead and returned

Signed-off-by: Christopher L. Shannon <christopher.l.shannon@gmail.com>
@cshannon
Copy link
Author

PR submitted: #4889

@janbartel
Copy link
Contributor

@cshannon any way you can sanitize the log so you would be permitted to post at least parts of it? Ideally I'd like to see log output with org.eclipse.jetty.server.session log level set to DEBUG so it's possible to trace the lifecycle of session ids.

I'm very interested in getting a reproduction for this so we can thoroughly understand the race condition: I would rather that we made code changes only after we really understand the problem, otherwise we're patching in the dark and may cause a lot of unintended consequences (the servlet spec sessions are a real hornets nest!).

@janbartel
Copy link
Contributor

Also, in the application where you've seen this happen, can you describe any more about the handling of requests? You mention it happens when there is a flurry of requests to the server, IIUC these requests all have a cookie with a session id that used to exist but is now no longer existing (or valid?). Is there any forwarding/including or async handling in the path of the handling of these requests?

@cshannon
Copy link
Author

cshannon commented May 19, 2020

There may be some forwarding, I'm not sure as we aren't implementing it and are using Spring MVC and Spring Security so under the covers their filters and dispatcher servlets may be doing some async things to contribute to this. As I said, it seems to primarily happen on session expiration when the browser fires off a bunch of requests at the same time for the application but I haven't been able to reproduce it on demand yet. But it is happening pretty often in our logs.

Also, it does seem to happen sometimes on server restart so in that case it would be requests with the same cookie with a session id that no longer exists as they come from the same browser session that is opened. Usually it's the Spring Security filter that blows up when trying to access the session.

In terms of sanitizing the log, I was able to quickly reproduce the same stack trace that I am seeing from a unit test. This leaves out the spring/my custom filters but demonstrates what I'm seeing.

`java.lang.IllegalStateException: Session node05mh7i3r1hgeu1j44ne3yw04ox0 already in cache
at org.eclipse.jetty.server.session.AbstractSessionCache.add(AbstractSessionCache.java:461)
at org.eclipse.jetty.server.session.SessionHandler.newHttpSession(SessionHandler.java:769)
at org.eclipse.jetty.server.Request.getSession(Request.java:1607)
at org.eclipse.jetty.server.Request.getSession(Request.java:1581)
at org.eclipse.jetty.server.session.RequestDispatchedSessionTest$ForwardFilter.doFilter(RequestDispatchedSessionTest.java:164)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:500)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.base/java.lang.Thread.run(Thread.java:834)

2020-05-19 09:07:10.456:WARN:oejs.HttpChannel:qtp22805895-21: /login
java.lang.NullPointerException
at org.eclipse.jetty.server.session.SessionHandler.getExtendedId(SessionHandler.java:749)
at org.eclipse.jetty.server.session.SessionHandler.getSessionCookie(SessionHandler.java:661)
at org.eclipse.jetty.server.Request.getSession(Request.java:1608)
at org.eclipse.jetty.server.Request.getSession(Request.java:1581)
at org.eclipse.jetty.server.session.RequestDispatchedSessionTest$ForwardFilter.doFilter(RequestDispatchedSessionTest.java:164)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1604)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:500)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)
at java.base/java.lang.Thread.run(Thread.java:834)`

@janbartel
Copy link
Contributor

Can I see the source of the test please?

@cshannon
Copy link
Author

The source for the test isn't very useful as it was just created to get a sample stack trace and not to demonstrate the error happening. For the test I just took an existing unit test in Jetty (RequestDispatchedSessionTest) and added a filter to make a call to Request.getSession(). Since it's not easy to reproduce I temporarily changed the condition inside AbstractSessionCache in the add method to always throw that IllegalStateException so it would replicate the stack trace.

If really necessary I can try and get the logs with the real stack traces (sanitized of course) that include the Spring filters and post them here but I won't be back in the office again until next week when I can access that system again so it would be next week. However, the stack trace above is a good representation of it so I'm not sure the real stack traces would provide anymore value.

@cshannon
Copy link
Author

@janbartel - From what I can tell, I think the race condition is happening for the requests between the doScope() call of SessionHandler and the call to Request#getSession().

If multiple requests come in for an existing sessionId (same cookie) at the same time (server restart, invalidation etc) then when they hit the SessionHandler the doScope() method is supposed to set the existing Session on the request. However at the time the requests hit doScope() the session doesn't exist so the Request has a null _session variable.

However, one of the requests finishes first and creates the session so when another request gets to the getSession() method inside Request it sees that the local variable _session is null and tries to create a new one and gets the error as it is using the same id from the cookie.

I could be wrong but that seems to be the behavior I'm seeing. I am still trying to figure out a way to replicate the issue in a test for real. My PR has a test but as I said it simulates the issue to verify the fix works but doesn't demonstrate the race condition that I think is there.

@cshannon
Copy link
Author

cshannon commented May 19, 2020

Also, if you don't like returning the existing session as my patch does, it could be simplified to go back to the behavior of 9.4.20 and always return a new session but just not add it to the cache. If you run my test that is part of #4889 against 9.4.20 you will see it will fail but it fails because the two sessions returned are not the same object but they share the same Id. This behavior has worked for our application for many years and only after updating when the cache implementation change did this issue show up.

@janbartel
Copy link
Contributor

Well, that behaviour was wrong, which is why we fixed it. Let's see if we can't come up with a solution that works also for your application.

I would really like to see those (sanitized) logs.

I'd also like a general description of how you're using spring security, and any config params that specifically deal with sessions.

@cshannon
Copy link
Author

Right, I assumed the behavior of returning a new Session when one exists is wrong which is why my patch will return the existing Session instead of null. It makes sense to me to return the existing Session if we have it vs logging an error and returning null.

I can see what I can do next week with the logs, but still may not be able to provide a version easily. The issue is most noticeable with Spring Security but I was also able to get it to happen on server restarts by making the call to getSession() in my own filter last week. (still using Spring MVC but not Spring Security) That stack trace without Spring Security more or less replicates what I already pasted in this issue minus the Spring MVC parts. So I may try and see if if I can provide a stack trace with/without Spring Security to see the difference. As I said, seems to happen sometimes on session expiration and sometimes on server restart when the application is still open and hitting the server.

For your other question, general description is Spring security is set up for filtering on URLs and sessions are enabled. So every request gets authenticated and Spring security also calls off to the session (I assume to cache data in there). Concurrent sessions are enabled so part of the stack trace shows the ConcurrentSessionControlStrategy in use. (not sure if something with this has anything to do with it in terms of making it more likely or not to happen)

Obviously the best thing might be to figure out a way to write a test to reproduce the issue but that may not be easy if it's truly a race condition unfortunately.

@janbartel
Copy link
Contributor

@cshannon ideally it would be good to get a simple webapp that can at least sometimes reproduce the issue and then we can work with that.

Going back to your app, I'd also like to know what the settings are for spring security, can you please post the config selections matching the ones mentioned on this page: https://www.baeldung.com/spring-security-session. I am sceptical about this so called "concurrent sessions" approach - I will need to look into the code of spring security to know what they actually mean by that.

@cshannon
Copy link
Author

@janbartel - I agree, next week when I'm back in the office I will do some more digging and see if i can narrow down the root cause to create a reproducible test case. One thing I completely forgot to mention is there is also a WebSocket connection. Not sure if that is part of it or not (nothing from Websockets has shown up in the stack trace) but sessions could be involved there too.

In terms of Spring Security, the session configuration is set to "always" and concurrent sessions are set to 5. There is a session timeout configured through the web.xml and is pretty short (i think a few minutes). Ideally I'd like to come up with a test case without Spring and Spring Security to make it easier to debug.

In the meantime it looks like Jetty allows plugging in a custom SessionHandler and SessionCache pretty easily so I can probably override the defaults with something that works for my use case to get past the issue until figured out.

@cshannon
Copy link
Author

Quick update, I haven't been able to come up with a webapp yet to reliably reproduce the issue. I have to move on to some other things but I'll try and get back to it when I have time.

In terms of logs, it's still going to be a bit tough to move the logs but as I said, the stack trace I already posted it exactly what I'm seeing so the most useful thing would be a reliable test case.

In the meantime I have come up with a workaround for my use case to get around the issue.

@gregw
Copy link
Contributor

gregw commented Sep 1, 2020

I'm just tuning in to this issue to see if I can help.
It should not be possible to race on Request._session value as there should only ever be a single thread calling getSession for a request... at least with normal dispatching.
@cshannon do you know if your app has async threads that are passed the request object and may call getSession on them? Also we are coming into a release cycle, so if you can give this issue some more time this week then it is more likely that a fix will make it into the next release.

@janbartel the Request.getSession(boolean) method is definitely not thread safe and has never been so.... but I can't convince myself that apps are not allowed to call this from other threads. It may be that we need a lock on this method??? But ultimately, I totally agree that we need to understand this issue before attempting fixes. The solution to putIfAbsent looks like treating a symptom rather than a cause.

@gregw
Copy link
Contributor

gregw commented Sep 1, 2020

Of course the problem with allowing an async thread to call getSession is that if a request is not dispatched then you don't know what context to get the session from.

@gregw
Copy link
Contributor

gregw commented Sep 1, 2020

@janbartel and I have walked through this and there is something about your usage that we just don't understand:

If multiple requests come in for an existing sessionId (same cookie) at the same time (server restart, invalidation etc) then when they hit the SessionHandler the doScope() method is supposed to set the existing Session on the request. However at the time the requests hit doScope() the session doesn't exist so the Request has a null _session variable.

If a session does not exist, then a new one must not be made with the same session id - as that could result in all sorts of security issues. We have only very limited cases in which we will reuse a session ID and they basically are involved only with cross context dispatch. We will reuse a request session ID IFF the session is in use or this request has already created a session with the ID in another context.

However, one of the requests finishes first and creates the session so when another request gets to the getSession() method inside Request it sees that the local variable _session is null and tries to create a new one and gets the error as it is using the same id from the cookie.

Once a request passes doScope with a null _session, then a session for that ID should never EVER be created. So we have to work out how your setup is creating a new session for the old session ID.

What SessionCache and SessionDataStore are you using?

How is it that a session ID that didn't exist suddenly exists again???

@manovotn
Copy link

manovotn commented Sep 1, 2020

@gregw I think I might have bumped into a similar error just today with Weld servlet test using Jetty 9.4.31.

The test in question is this one.
Here is what the console is spitting out for me https://pastebin.com/NTfu8Y3q.
It should be using embedded jetty and here is how to reproduce that:

FTR the test used to work with older Jetty (tests were using 9.4.14 before).
Maybe it can help you somehow, although I haven't really had time to delve into it.

@janbartel
Copy link
Contributor

@manovotn thanks for that repro test case and stack trace. I think I can see what is happening, but you will have to confirm my theory with your greater knowledge of weld.

I believe the request comes in either without a session id or at least an unknown session id. The app calls Request.getSession(true) and we are in the process of creating the new session [A]. We have to call any HttpSessionListeners after we've made the session. One of these listeners has a reference to the Request, and again calls Request.getSession(true). As we have not yet returned from creating the session at [A] the Request does not know about this new session, so will attempt to create another one [B]. When we made the first session at [A], we set the new id we are using onto the Request as the __NEW_SESSION_ID attribute so that it can be retrieved by any later cross-context dispatches that want to create a session. So now at [B], we check the __NEW_SESSION_ID attribute and voila, it is set so we try to reuse it!

This all hinges on the HttpSessionListener having a path to referencing the Request - is this possible in weld?

janbartel added a commit that referenced this issue Sep 1, 2020
Request.getSession()

Signed-off-by: Jan Bartel <janb@webtide.com>
@janbartel
Copy link
Contributor

@manovotn if you have time, you could checkout my PR and build against that and see if that fixes your test. See PR #5220

@manovotn
Copy link

manovotn commented Sep 1, 2020

I believe the request comes in either without a session id or at least an unknown session id. The app calls Request.getSession(true) and we are in the process of creating the new session [A]. We have to call any HttpSessionListeners after we've made the session. One of these listeners has a reference to the Request, and again calls Request.getSession(true). As we have not yet returned from creating the session at [A] the Request does not know about this new session, so will attempt to create another one [B]. When we made the first session at [A], we set the new id we are using onto the Request as the __NEW_SESSION_ID attribute so that it can be retrieved by any later cross-context dispatches that want to create a session. So now at [B], we check the __NEW_SESSION_ID attribute and voila, it is set so we try to reuse it!

This all hinges on the HttpSessionListener having a path to referencing the Request - is this possible in weld?

I'll have to debug it to be able to say more. The app (the test archive itself) is pretty dumb and doesn't itself create session. What's going to create session is Weld trying to initialize session context to store session bean in (Pinger class is a session scoped bean). And the test is somehow triggering that repeatedly via additional listener and CDI observer which eagerly uses that bean. But like I said, without debug, I can't be sure.

@manovotn if you have time, you could checkout my PR and build against that and see if that fixes your test. See PR #5220

I'll try tomorrow (have to leave now, it's getting a bit late), anything special I need to know/do to build jetty snapshot?

@joakime
Copy link
Contributor

joakime commented Sep 1, 2020

anything special I need to know/do to build jetty snapshot?

$ git clone https://github.com/eclipse/jetty.project.git
$ cd jetty.project
# switch to the branch that the PR is based off of
$ git checkout jetty-9.4.x-4888-NPE-request-get-session
$ mvn clean install -DskipTests

At this point you have it compiled against the branch that PR #5220 is based on.

You can find the artifacts, including jetty-home (and the older jetty-distribution) in your $HOME/.m2/repository

Use version 9.4.32-SNAPSHOT when testing it.

janbartel added a commit that referenced this issue Sep 2, 2020
Issue #4888 Ensure HttpSessionListener can call Request.getSession

Signed-off-by: Jan Bartel <janb@webtide.com>
@manovotn
Copy link

manovotn commented Sep 2, 2020

@janbartel only just got to it. I've run Weld test with your fix and can confirm that it works.

@janbartel
Copy link
Contributor

@cshannon and @manovotn where are we at with this issue? AFAIK, the commit I did in PR #4889 fixes both @manovotn and @cshannon's issues. Can I close this issue?

@manovotn
Copy link

manovotn commented Mar 5, 2021

Yeah, I think this issue should be closed.

@cshannon
Copy link
Author

cshannon commented Mar 5, 2021

You can close the issue, as I believe it's solved now. I implemented a workaround before so I would need to go back and remove that to specifically test the fix but based on looking at it plus with Weld being fixed I think this should be closed. If something else pops up again I'll open a new issue as that would mean a different cause anyways.

@gregw
Copy link
Contributor

gregw commented Mar 8, 2021

thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants