adminui: named cookie not present #41918

Closed
DuskEagle opened this issue Oct 25, 2019 · 4 comments
Labels
A-webui Triage label for DB Console (fka admin UI) issues. Add this if nothing else is clear.

Comments

@DuskEagle
Member

Describe the problem

When attempting to log in to the Admin UI, we sometimes observe a "Service Unavailable" error that prevents login. Existing Admin UI sessions remain valid, but no metrics appear in any of the graphs for those sessions. While this is happening, the node logs contain a large number of errors of the form server/authentication.go:373 Web session error: http: named cookie not present. The rest of the database continues to work as expected.

The issue is fixed by restarting the CockroachDB node.
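
For context on the log message above: the text after "Web session error:" matches the message of Go's net/http ErrNoCookie, which Request.Cookie returns when a request carries no cookie with the requested name. The following is a minimal sketch of that behavior only. The cookie name "session" and the helpers lookupSession, isMissingCookie and missingCookieDemo are hypothetical and are not taken from CockroachDB's authentication.go.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sessionCookieName is a hypothetical cookie name used only in this sketch.
const sessionCookieName = "session"

// lookupSession returns the value of the session cookie, or the error
// reported by net/http when the request has no cookie with that name.
func lookupSession(r *http.Request) (string, error) {
	c, err := r.Cookie(sessionCookieName)
	if err != nil {
		return "", err
	}
	return c.Value, nil
}

// isMissingCookie reports whether err means the named cookie was absent.
func isMissingCookie(err error) bool {
	return errors.Is(err, http.ErrNoCookie)
}

// missingCookieDemo builds a request without cookies and returns the
// resulting error text plus whether it is recognized as ErrNoCookie.
func missingCookieDemo() (string, bool) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	_, err := lookupSession(req)
	if err == nil {
		return "", false
	}
	return err.Error(), isMissingCookie(err)
}

func main() {
	msg, missing := missingCookieDemo()
	fmt.Println(msg, missing)

	// With the cookie attached, the lookup succeeds.
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.AddCookie(&http.Cookie{Name: sessionCookieName, Value: "abc"})
	v, err := lookupSession(req)
	fmt.Println(v, err)
}
```

If this reading is right, the log line by itself only says that a request arrived without a session cookie; it does not by itself explain the "Service Unavailable" response.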

To Reproduce

Right before this happened, thousands of new SQL connections were opened and closed in rapid succession, as can be seen in this graph of active SQL connections:

[Screenshot: graph of active SQL connections, taken 2019-10-24 at 8:40:48 PM]

However, the Admin UI remained in a broken state for several hours after this flood of connections ended. The only fix was to restart CockroachDB on each node in the cluster.

Environment:

  • CockroachDB v19.1.4
  • Server OS: Ubuntu
DuskEagle added the A-webui label on Oct 25, 2019
@knz
Contributor

knz commented Nov 19, 2019

related to #42161

@DuskEagle
Member Author

The issue appears to be that we create a gRPC connection at server startup, but never redial it if the connection is closed.
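
To illustrate the kind of change this would imply, here is a minimal sketch of a holder that redials when its cached connection is found closed. It uses only the standard library: the conn interface is a hypothetical stand-in for a real client connection, and redialer, fakeConn and demoRedial are names invented for this example. It is not CockroachDB's code and does not model how the connection gets closed in the first place.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a client connection.
type conn interface {
	Closed() bool
	Call(method string) error
}

// redialer caches one connection and replaces it once it is closed,
// instead of dialing a single time at startup and keeping it forever.
type redialer struct {
	mu   sync.Mutex
	dial func() (conn, error)
	cur  conn
}

func newRedialer(dial func() (conn, error)) *redialer {
	return &redialer{dial: dial}
}

// get returns a usable connection, dialing again if there is no cached
// connection or the cached one reports that it is closed.
func (r *redialer) get() (conn, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cur == nil || r.cur.Closed() {
		c, err := r.dial()
		if err != nil {
			return nil, fmt.Errorf("redial: %w", err)
		}
		r.cur = c
	}
	return r.cur, nil
}

// fakeConn is a test double whose closed flag can be flipped by hand.
type fakeConn struct{ closed bool }

func (f *fakeConn) Closed() bool { return f.closed }

func (f *fakeConn) Call(method string) error {
	if f.closed {
		return errors.New("connection closed")
	}
	return nil
}

// demoRedial closes the first connection and checks that the next get
// dials a fresh one. It returns the number of dials performed.
func demoRedial() (int, error) {
	dials := 0
	var last *fakeConn
	r := newRedialer(func() (conn, error) {
		dials++
		last = &fakeConn{}
		return last, nil
	})

	c, err := r.get()
	if err != nil {
		return dials, err
	}
	if err := c.Call("Ping"); err != nil {
		return dials, err
	}

	last.closed = true // simulate the connection being closed

	c, err = r.get()
	if err != nil {
		return dials, err
	}
	return dials, c.Call("Ping")
}

func main() {
	dials, err := demoRedial()
	fmt.Println("dials:", dials, "err:", err)
}
```

Without the Closed check in get, the second call in demoRedial would reuse the closed connection and fail, which is the behavior described above.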

@knz
Contributor

knz commented Nov 27, 2019

Can you explain how this particular connection can be closed at all? It seems to me it's an intra-process thing and I don't see a Close() call.

@DuskEagle
Member Author

This seems like a duplicate of #42828, which goes into a bit more detail. We don't know exactly what causes this connection to close. In the case I observed here, the cluster was under extremely heavy load due to a bad application deployment. In the case of #42828, a large number of ranges were transiently unavailable due to CRDB node crashes that occurred shortly after an upgrade from v19.1.5 to v19.2.1.
