Log the initial line of authentication and login errors at `warn` level #1378

micahlee · 2020-02-28T16:29:29Z

Pending

Security Review Sign-off

What does this PR do?

The generic 401 error response from authenticators (specifically authn-k8s) can have a variety of root causes, some of them not directly related to the authentication credentials themselves, but rather issues with the authenticator configuration.

This change improves the ability for a Conjur operator to quickly pinpoint the cause for an authentication failure, without changing log levels and restarting the Conjur process/container.

Any background context you want to provide?

This is a follow-up issue to #1219.

What ticket does this PR close?

Connected to #1377

Has the Version and Changelog been updated?

Yes

The generic 401 error response from authenticators (specifically authn-k8s) can have a variety of root causes, some of them not directly related to the authentication credentials themselves, but rather issues with the authenticator configuration. This change improves the ability for a Conjur operator to quickly pinpoint the cause for an authentication failure, without changing log levels and restarting the Conjur process/container.

micahlee · 2020-02-28T16:34:14Z

Hey @andytinkham and @shaharglazner,

I've created this draft PR to continue the discussion of raising the log level for the top line message from authentication failures (401) from debug to warn. This was initially included in a previous PR, but removed from that PR because we decided it was worth thinking more about any security implications of this change (See this comment thread for more context).

When you're able, would you please advise on the best way to continue for a security review, and what information you would like to continue progressing this forward?

Thanks so much!

CC: @orenbm @jtuttle @sgnn7 @alexkalish @jvanderhoof

micahlee · 2020-02-28T16:37:17Z

My initial observations on the subject:

The particular content for log messages comes from the Authenticator implementations.
The destination for these log messages is the normal Conjur server log (or DAP node log), typically accessed with docker logs <container_name>.

codeclimate · 2020-02-28T16:47:58Z

Code Climate has analyzed commit b15d2c3 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 88.5% (0.0% change).

View more on Code Climate.

izgeri · 2020-02-28T22:43:03Z

@micahlee can you provide some examples of the kinds of log messages you might see once this change is merged in? for example, in the authenticator troubleshooting we have identified a few really common problems - I'm curious whether there would be additional log output visible when people encounter these problems, which might make troubleshooting them (without changing the default log level to debug) easier

orenbm · 2020-03-03T19:41:42Z

The particular content for log messages comes from the Authenticator implementations.

@micahlee I'm afraid that's not entirely correct. In the JWT decoder used in authn-oidc and authn-azure we have the following errors:

        TokenDecodeFailed = ::Util::TrackableErrorClass.new(
          msg: "Token decode failed (3rdPartyError ='{0}')",
          code: "CONJ00035E"
        )

        TokenVerificationFailed = ::Util::TrackableErrorClass.new(
          msg: "Token verification failed (3rdPartyError ='{0}')",
          code: "CONJ00015E"
        )

So we also log error messages that come from outside of our code.

orenbm · 2020-03-10T12:50:46Z

@micahlee I'm afraid that's not entirely correct. In the JWT decoder used in authn-oidc and authn-azure we have the following errors:

    TokenDecodeFailed = ::Util::TrackableErrorClass.new(
      msg: "Token decode failed (3rdPartyError ='{0}')",
      code: "CONJ00035E"
    )

    TokenVerificationFailed = ::Util::TrackableErrorClass.new(
      msg: "Token verification failed (3rdPartyError ='{0}')",
      code: "CONJ00015E"
    )

So we also log error messages that come from outside of our code.

However, I looked into the errors that are raised by the 3rd party and it doesn't include any sensitive data.

@micahlee @shaharglazner ^

andytinkham · 2020-03-31T16:58:27Z

From a security standpoint, here's what I'd like to see us do for this change. We can make a list of the errors we know could be thrown. It's likely we won't get all of them, but we can get many, including all the ones we control. Review those errors individually and determine the sensitivity of the information in the error. This could be helped by making tests that address our handling of each error (which would be good to have anyhow), particularly if we made all our sensitive information conform to a pattern and then grepped the logs for that pattern after all our test runs.

More broadly, it would be really helpful generally to build out a tiered risk model of all our sensitive info - what do we have that is sensitive? Given a set of bad actor personas (e.g., disgruntled internal attacker who already has some access, external attacker, etc.), what could they do with each piece of sensitive data if they had access to it. I suspect we'd come up with something like secret values are the highest risk, then a tier that has less severe consequences and a tier with even less severe consquences and so on. This would give us a useful risk model that would be reusable any time we were touching data.

orenbm · 2020-03-31T17:42:57Z

thanks @andytinkham !
I think this effort is a good opportunity to get all of the logs and errors that we write into the centralized files for errors and log messages. this will make it easier for us to know what messages, and data, we write to our logs.

micahlee · 2020-03-31T18:14:08Z

Thanks @andytinkham, I'm closing this PR for now until we can address your suggestions under a new issue: #1464

micahlee self-assigned this Feb 28, 2020

jvanderhoof added the needs-review/security label Mar 31, 2020

jvanderhoof assigned andytinkham Mar 31, 2020

micahlee closed this Mar 31, 2020

micahlee deleted the 1377-authn-log-level branch March 11, 2021 18:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Log the initial line of authentication and login errors at `warn` level #1378

Log the initial line of authentication and login errors at `warn` level #1378

micahlee commented Feb 28, 2020 •

edited

Loading

micahlee commented Feb 28, 2020 •

edited

Loading

micahlee commented Feb 28, 2020

codeclimate bot commented Feb 28, 2020

izgeri commented Feb 28, 2020

orenbm commented Mar 3, 2020 •

edited

Loading

orenbm commented Mar 10, 2020

andytinkham commented Mar 31, 2020

orenbm commented Mar 31, 2020

micahlee commented Mar 31, 2020

Log the initial line of authentication and login errors at warn level #1378

Log the initial line of authentication and login errors at warn level #1378

Conversation

micahlee commented Feb 28, 2020 • edited Loading

Pending

What does this PR do?

Any background context you want to provide?

What ticket does this PR close?

Has the Version and Changelog been updated?

micahlee commented Feb 28, 2020 • edited Loading

micahlee commented Feb 28, 2020

codeclimate bot commented Feb 28, 2020

izgeri commented Feb 28, 2020

orenbm commented Mar 3, 2020 • edited Loading

orenbm commented Mar 10, 2020

andytinkham commented Mar 31, 2020

orenbm commented Mar 31, 2020

micahlee commented Mar 31, 2020

Log the initial line of authentication and login errors at `warn` level #1378

Log the initial line of authentication and login errors at `warn` level #1378

micahlee commented Feb 28, 2020 •

edited

Loading

micahlee commented Feb 28, 2020 •

edited

Loading

orenbm commented Mar 3, 2020 •

edited

Loading