Kibana authentication troubleshooting guide #83914

azasypkin · 2020-11-20T13:48:48Z

The Kibana authentication subsystem includes quite a bit of functionality these days, and it's not always easy to troubleshoot problems or misconfiguration in this area. This issue is intended to gather the most common issues our users experience with our authentication layer, along with ideas on how we can help them troubleshoot these issues.

We can tackle this from two different angles:

- Provide a proper troubleshooting guide, similar to the Common SAML issues guide the Elasticsearch team created.
- Try to detect possible configuration issues in the code and log appropriate warnings.

Most frequent issues

Inconsistent (autogenerated) `xpack.security.encryptionKey` in Kibana HA setup

This is by far the most common source of confusion. If one instance of Kibana cannot decrypt a cookie that was created by another instance, the cookie will be cleared.

What we already do:

- We log a warning on Kibana startup if the key is autogenerated: `Generating a random key for xpack.security.encryptionKey. To prevent sessions from being invalidated on restart, please set xpack.security.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command.`
- Core logs the following message when it cannot decrypt a cookie: `[debug][server][Kibana][cookie-session-storage][http] Error: Unauthorized`
- We mention in our docs that the key should be the same across all instances in an HA setup.

What we can do:

- Additionally explain this issue in the troubleshooting guide
- Core can log a more helpful message if it cannot decrypt a cookie
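As a rough illustration of the kind of check an admin could run by hand, the sketch below compares the `xpack.security.encryptionKey` value across instances. The instance names and the `find_encryption_key_problems` helper are hypothetical, and the settings are assumed to be already loaded into flat dictionaries (for example from each instance's `kibana.yml`); this is not something Kibana does today.

```python
from typing import Dict, List, Optional

KEY_SETTING = "xpack.security.encryptionKey"


def find_encryption_key_problems(
    instances: Dict[str, Dict[str, Optional[str]]]
) -> List[str]:
    """Return human-readable problems with the encryption key across instances.

    `instances` maps a (hypothetical) instance name to its flat settings.
    """
    problems: List[str] = []
    keys = {}
    for name, settings in sorted(instances.items()):
        key = settings.get(KEY_SETTING)
        if not key:
            # Without an explicit key, each instance generates its own random key.
            problems.append(
                f"{name}: {KEY_SETTING} is not set, a random key is generated on startup"
            )
        keys[name] = key
    if len(set(keys.values())) > 1:
        problems.append(
            f"{KEY_SETTING} differs between instances: " + ", ".join(sorted(keys))
        )
    return problems


if __name__ == "__main__":
    for problem in find_encryption_key_problems(
        {
            "kibana-1": {KEY_SETTING: "a" * 32},
            "kibana-2": {},  # key omitted, so it is autogenerated
        }
    ):
        print(problem)
```

Any reported problem means that cookies created by one instance may not be decryptable by another.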

Inconsistent session and authentication settings in Kibana HA setup

Every instance of Kibana schedules a regular session cleanup job to remove sessions that weren't explicitly invalidated. There are a number of criteria we use to determine whether a session can be safely removed, but the most notable are:

- If Kibana is configured with a non-0/null session lifespan or idle timeout, it will remove all existing sessions that were created without one.
- Kibana will also remove all sessions that were created using providers that aren't configured anymore (based on | tuple).

That means that if multiple Kibana instances that rely on the same `.kibana-x` index have different session or provider settings, a cleanup job scheduled by one Kibana instance may invalidate sessions created by another instance. By default, the cleanup job runs on startup and every hour after that, so users may experience sporadic logouts that are hard to debug.
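The interaction described above can be modelled in a few lines. This is a simplified sketch that covers only the two criteria listed in this section (it ignores ordinary expiry), and the `Session`, `InstanceConfig` and `sessions_removed_by` names are hypothetical, not Kibana's actual types.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple

Provider = Tuple[str, str]  # (provider type, provider name)


@dataclass(frozen=True)
class Session:
    sid: str
    provider: Provider
    has_idle_timeout: bool  # session was created with an idle timeout
    has_lifespan: bool      # session was created with a lifespan


@dataclass(frozen=True)
class InstanceConfig:
    idle_timeout_ms: Optional[int]  # None or 0 means "disabled"
    lifespan_ms: Optional[int]      # None or 0 means "disabled"
    providers: FrozenSet[Provider]


def sessions_removed_by(config: InstanceConfig, sessions: List[Session]) -> List[str]:
    """Return the ids of sessions this instance's cleanup job would remove."""
    removed = []
    for session in sessions:
        if session.provider not in config.providers:
            # Created by a provider this instance does not know about.
            removed.append(session.sid)
        elif config.idle_timeout_ms and not session.has_idle_timeout:
            removed.append(session.sid)
        elif config.lifespan_ms and not session.has_lifespan:
            removed.append(session.sid)
    return removed


if __name__ == "__main__":
    instance_a = InstanceConfig(None, None, frozenset({("basic", "basic")}))
    instance_b = InstanceConfig(
        15 * 60 * 1000, None, frozenset({("basic", "basic"), ("saml", "saml1")})
    )
    sessions = [
        Session("created-by-a", ("basic", "basic"), False, False),
        Session("created-by-b", ("saml", "saml1"), True, False),
    ]
    print("A removes:", sessions_removed_by(instance_a, sessions))
    print("B removes:", sessions_removed_by(instance_b, sessions))
```

In this model each instance removes the session created by the other one, which is the sporadic logout pattern users report.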

What we already do:

When a cleanup job removes sessions, it logs a message similar to `Cleaned up 5 invalid or expired session values.` that we can potentially correlate with user logouts. But the message is too vague and may not indicate any problem.

What we can do:

- Additionally explain this issue in the troubleshooting guide
- We could make the cleanup a bit more complex and slower so that it specifically logs sessions that were removed because of a config mismatch. I'd wait until this becomes a problem before doing anything here.

Multi-tenancy using the same host name, but different ports

Per RFC6265 cookies for a given host are shared across all the ports on that host, even though the usual "same-origin policy" used by web browsers isolates content retrieved via different ports. That means that if you have multiple Kibana tenants (Kibana instances that use different .kibana-x indices) that are using the same host name, but different ports then the session cookies will be shared between them.

This will lead to sporadic logouts if both tenants are opened in the same browsing context (same browser window): if one tenant receives a session cookie that references a session that lives in another tenant, the cookie will be treated as invalid and Kibana will clear it.

The most correct solution is to never host different applications on the same hostname, because cookies leak between them. If that's not possible, the workaround is to configure a different session cookie name for every tenant with the `xpack.security.cookieName` setting.
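To make the cookie collision concrete, here is a toy model of a browser cookie store that, per RFC 6265, ignores the port when scoping cookies. The `BrowserCookieJar` class and the host, cookie names and values are made up for illustration; a real browser also takes path, domain and other attributes into account.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit


class BrowserCookieJar:
    """Toy cookie store keyed by (host, cookie name); the port is ignored."""

    def __init__(self) -> None:
        self._cookies: Dict[Tuple[str, str], str] = {}

    def set_cookie(self, url: str, name: str, value: str) -> None:
        host = urlsplit(url).hostname or ""
        self._cookies[(host, name)] = value

    def get_cookie(self, url: str, name: str) -> Optional[str]:
        host = urlsplit(url).hostname or ""
        return self._cookies.get((host, name))


if __name__ == "__main__":
    tenant_a = "https://kibana.example.com:5601/"
    tenant_b = "https://kibana.example.com:5602/"

    # Both tenants use the same cookie name: the second login overwrites the first.
    jar = BrowserCookieJar()
    jar.set_cookie(tenant_a, "sid", "session-of-tenant-a")
    jar.set_cookie(tenant_b, "sid", "session-of-tenant-b")
    print(jar.get_cookie(tenant_a, "sid"))  # tenant A now receives tenant B's cookie

    # Distinct names (the xpack.security.cookieName workaround) avoid the clash.
    jar = BrowserCookieJar()
    jar.set_cookie(tenant_a, "sid-tenant-a", "session-of-tenant-a")
    jar.set_cookie(tenant_b, "sid-tenant-b", "session-of-tenant-b")
    print(jar.get_cookie(tenant_a, "sid-tenant-a"))
```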

What we already do:

If the tenants don't share an encryption key, this case is indistinguishable from the "Inconsistent (autogenerated) `xpack.security.encryptionKey` in Kibana HA setup" case above. Otherwise, we detect this case and log the following debug message: `Session value is not available in the index, session cookie will be invalidated.`

What we can do:

Additionally explain this issue in the troubleshooting guide

Multiple authentication providers without Login Selector

It's still possible to use multiple authentication providers even if Login Selector is disabled. The support is very limited though, and we generally discourage our users from using that setup. The main reason why we still support it is backward compatibility (BWC). There is nothing we can do here, so I'll just outline a few notable things about this setup:

- When a user opens Kibana, only the provider with the lowest order will try to authenticate them.
- If a provider that uses the Kibana native login form (e.g. basic or token right now) is configured in addition to other providers (e.g. saml or oidc), but its order is higher than that of another provider, it's still possible to use it to log in even though it's not used automatically. To do this, one should log out and go to the `/login` route directly.
- In this setup one can also log in to Kibana using multiple SAML/OIDC providers even if their order isn't the lowest one, but only through IdP- or OP-initiated login. The caveat here is that when the user logs out from Kibana, they may be automatically logged in again using the provider with the lowest order instead.

What we can do:

Discourage, discourage, discourage and eventually deprecate

Kibana session settings vs access/refresh token expiration

Many of the Kibana authentication providers use Elasticsearch access/refresh tokens under the hood: SAML, OpenID Connect, PKI, Kerberos and Token. These tokens have their own expiration settings that are separate from Kibana's own session expiration settings:

- The default expiration time of the access token is 20 minutes, with a maximum of one hour
- The expiration time of the refresh token is 24 hours, and it's hard-coded

If Kibana's session idle timeout is longer than the expiration time of the underlying access token, Kibana will automatically refresh the access token once the user becomes active again. But if the admin disables the Kibana session idle timeout or lifespan, or sets it higher than 24 hours, and the user isn't active during this period, the underlying refresh token expires and the access token cannot be refreshed anymore. Such a setup effectively limits Kibana session timeouts to 24 hours.

For example, if Kibana is configured to work with one of the token-based authentication providers and the admin wants to disable the idle timeout, they would do something like this:

xpack.security.session.idleTimeout: 0

But in reality, because of the hard-coded 24-hour lifetime of the refresh token, the effective idle timeout will be only about 24 hours.

It's even more problematic for PKI authentication: Elasticsearch doesn't provide a refresh token in this case at all, which effectively limits the idle timeout for the PKI authentication provider to the lifetime of the access token (max 1 hour).
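The arithmetic above can be summarised in a small helper. This is only a sketch of the reasoning in this section, not Kibana code: the `effective_idle_timeout` function is hypothetical, the provider type strings are assumptions, and the result is approximate, as noted above.

```python
from datetime import timedelta
from typing import Optional

# Values taken from the description above.
REFRESH_TOKEN_LIFETIME = timedelta(hours=24)         # hard-coded in Elasticsearch
DEFAULT_ACCESS_TOKEN_LIFETIME = timedelta(minutes=20)

# Token-based providers that receive a refresh token (PKI does not).
PROVIDERS_WITH_REFRESH_TOKEN = {"saml", "oidc", "kerberos", "token"}


def effective_idle_timeout(
    provider_type: str,
    configured_idle_timeout: Optional[timedelta],
    access_token_lifetime: timedelta = DEFAULT_ACCESS_TOKEN_LIFETIME,
) -> Optional[timedelta]:
    """Approximate idle timeout a user actually experiences.

    `None` or a zero timedelta for `configured_idle_timeout` means "disabled".
    A return value of `None` means there is no idle timeout at all.
    """
    if provider_type == "pki":
        cap = access_token_lifetime
    elif provider_type in PROVIDERS_WITH_REFRESH_TOKEN:
        cap = REFRESH_TOKEN_LIFETIME
    else:
        # Not token based: only Kibana's own setting applies.
        return configured_idle_timeout or None

    if not configured_idle_timeout:
        return cap
    return min(configured_idle_timeout, cap)


if __name__ == "__main__":
    print(effective_idle_timeout("saml", None))                # capped by the refresh token
    print(effective_idle_timeout("pki", timedelta(hours=8)))   # capped by the access token
    print(effective_idle_timeout("oidc", timedelta(minutes=15)))
```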

What we already do:

We briefly mention this limitation in our docs and only for SAML and OIDC.

What we can do:

- Additionally explain this issue in the troubleshooting guide
- Mention this for every affected provider in the main documentation
- Fine-tune the session config schema:
  - Don't allow disabling the provider-specific idle timeout for any token-based authentication provider
  - Limit the max provider-specific idle timeout to 24 hours for SAML/OIDC/Kerberos/Token and to 1 hour for PKI, with a custom error message
  - Detect if either of the two conditions is violated through the global session settings and log a clear warning if so.

Misconfigured role mappings

It's more of an Elasticsearch issue, but it's usually Kibana where the user ends up stuck, so we can try to help debug this.

What we already do:

We display the "You do not have permission to access the requested page" screen, which is already a good enough solution.

What we can do:

- We can and will eventually disable login if the user doesn't have enough privileges to access anything in Kibana.
- We might want to document the `/internal/security/me` endpoint in a troubleshooting guide so that admins can see exactly which roles were applied to a particular user.

Misconfigured refresh interval for the security-related indices

Security-related indices (and many other system indices) are very sensitive to refresh intervals higher than 1s, as most update operations are issued with a `wait_for_refresh` in order to handle concurrent edits safely.

Changing the default refresh intervals for the security-related indices is highly discouraged. Typical causes are match-all index templates that apply common settings or mappings to all indices, or a user mistakenly setting a common refresh interval on ALL indices.

Note: This should happen less frequently once #134900 merges (target 8.4.0).

This can lead to significant delays and failures during request authentication. To make sure the security-related indices have proper refresh intervals, you can check the settings file in the Elastic support diagnostics bundle:
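A minimal sketch of such a check, assuming the settings file has the same shape as the Elasticsearch get-index-settings response (`{index: {"settings": {"index": {...}}}}`). The `slow_security_indices` helper, the index name prefixes and the sample data are assumptions made for illustration; adjust them to the indices present in your cluster.

```python
import re
from typing import Dict, List, Tuple

# Elasticsearch time units expressed in seconds.
_UNITS_IN_SECONDS = {
    "nanos": 1e-9, "micros": 1e-6, "ms": 1e-3,
    "s": 1, "m": 60, "h": 3600, "d": 86400,
}

# Assumed prefixes of the security-related indices.
SECURITY_INDEX_PREFIXES: Tuple[str, ...] = (".security", ".kibana_security_session")


def parse_time_value(value: str) -> float:
    """Convert an Elasticsearch time value such as "30s" to seconds."""
    if value == "-1":
        return float("inf")  # refresh is disabled entirely
    match = re.fullmatch(r"(\d+)(nanos|micros|ms|s|m|h|d)", value)
    if not match:
        raise ValueError(f"Unsupported time value: {value!r}")
    return int(match.group(1)) * _UNITS_IN_SECONDS[match.group(2)]


def slow_security_indices(settings: Dict[str, dict], max_seconds: float = 1.0) -> List[str]:
    """List security-related indices whose refresh interval exceeds `max_seconds`."""
    flagged = []
    for index_name, body in sorted(settings.items()):
        if not index_name.startswith(SECURITY_INDEX_PREFIXES):
            continue
        interval = body.get("settings", {}).get("index", {}).get("refresh_interval")
        if interval is None:
            continue  # not overridden, so the default applies
        if parse_time_value(interval) > max_seconds:
            flagged.append(f"{index_name}: refresh_interval={interval}")
    return flagged


if __name__ == "__main__":
    sample = {
        ".security-7": {"settings": {"index": {"refresh_interval": "30s"}}},
        ".security-tokens-7": {"settings": {"index": {"number_of_shards": "1"}}},
        "logs-2024": {"settings": {"index": {"refresh_interval": "60s"}}},
    }
    for line in slow_security_indices(sample):
        print(line)
```

To run it against a real bundle, load the settings file with `json.load` and pass the resulting dictionary to `slow_security_indices`.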

@elastic/kibana-security I'll be gradually filling this issue with info I remember, but please feel free to comment here or edit issue description to include issues you know about that I missed.


azasypkin · 2020-11-25T10:42:46Z

Okay, I described all the cases I could remember so far. I'll get back to this issue in a few weeks so that everyone has time to share any other ideas/issues.

legrego · 2020-12-07T13:48:04Z

Thanks for putting this together! I agree with a lot of what you said here, and I don't see any glaring omissions.

> Multiple authentication providers without Login Selector

Would it be possible to use the new auth_provider_hint query parameter to attempt authentication?

> Discourage, discourage, discourage and eventually deprecate

As much as I'd love to deprecate this, I worry that we will end up having to support this in some capacity.

azasypkin · 2020-12-07T15:02:10Z

> Would it be possible to use the new auth_provider_hint query parameter to attempt authentication?

Yeah, it should allow you to pick any provider.

> As much as I'd love to deprecate this, I worry that we will end up having to support this in some capacity.

Right, my suspicion is that many users upgrade Kibana and just keep their legacy authc config and hence don't leverage Login Selector by default. And right now our Telemetry cannot tell us whether it's the case or users explicitly disabled Login Selector. In 8.0.0 when we drop legacy config completely we'll be able to see how many users explicitly disable it.

aniketpant1 · 2023-08-07T05:58:13Z

We are currently encountering this issue. We recently integrated Azure AD OIDC realms for authentication in Elasticsearch and Kibana. Our end users are frequently logged out of Kibana, anywhere from 5 minutes to 1 hour after logging in, and have to log in again with their email ID, password and passcode. Even if we disable or comment out the session settings in the kibana.yml file, users are still logged out. We have escalated this issue to Elastic engineers.
We haven't been able to find what is causing these frequent logouts. Even with `xpack.security.session.idleTimeout: "15m"` and `xpack.security.session.lifespan: "24h"` set, users are still logged out.

azasypkin · 2023-08-07T08:10:39Z

> We have escalated this issue to elastic engineers.

If you've escalated this issue to our support team, we'll look into it soon. If you don't have access to our support, then please post this question on our Discuss forum. There are many more users like you there who can help and who have probably already solved the problem you have. A GitHub issue isn't the right place to debug issues like that.

Having said that, I'm almost sure that you have multiple Kibana instances connected to the same cluster that have different security configurations or something along these lines: https://www.elastic.co/guide/en/kibana/current/production.html#load-balancing-kibana

azasypkin added discuss Team:Security Team focused on: Auth, Users, Roles, Spaces, Audit Logging, and more! Feature:Security/Authentication Platform Security - Authentication docs labels Nov 20, 2020

azasypkin mentioned this issue Mar 29, 2021

Update SAML Troubleshooting documentation to include /api/security/v1/me API #68447

Closed

legrego added EnableJiraSync and removed EnableJiraSync labels Aug 4, 2021

exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Aug 5, 2021

jportner mentioned this issue Mar 3, 2022

Create outline for restructuring Security docs for 8.0+ #118283

Open

legrego removed EnableJiraSync loe:small Small Level of Effort impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. labels Aug 18, 2022

azasypkin mentioned this issue Aug 15, 2024

[Security] logging into one Kibana instance logs me out of all the others #189311

Closed
