Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log the initial line of authentication and login errors at warn level #1378

Closed
wants to merge 1 commit into from

Conversation

micahlee
Copy link
Contributor

@micahlee micahlee commented Feb 28, 2020

Pending

  • Security Review Sign-off

What does this PR do?

The generic 401 error response from authenticators (specifically authn-k8s) can have a variety of root causes, some of them not directly related to the authentication credentials themselves, but rather issues with the authenticator configuration.

This change improves the ability for a Conjur operator to quickly pinpoint the cause for an authentication failure, without changing log levels and restarting the Conjur process/container.

Any background context you want to provide?

This is a follow-up issue to #1219.

What ticket does this PR close?

Connected to #1377

Has the Version and Changelog been updated?

Yes

The generic 401 error response from authenticators (specifically authn-k8s) can have a variety of root causes, some of them not directly related to the authentication credentials themselves, but rather issues with the authenticator configuration.

This change improves the ability for a Conjur operator to quickly pinpoint the cause for an authentication failure, without changing log levels and restarting the Conjur process/container.
@micahlee
Copy link
Contributor Author

micahlee commented Feb 28, 2020

Hey @andytinkham and @shaharglazner,

I've created this draft PR to continue the discussion of raising the log level for the top line message from authentication failures (401) from debug to warn. This was initially included in a previous PR, but removed from that PR because we decided it was worth thinking more about any security implications of this change (See this comment thread for more context).

When you're able, would you please advise on the best way to continue for a security review, and what information you would like to continue progressing this forward?

Thanks so much!

CC: @orenbm @jtuttle @sgnn7 @alexkalish @jvanderhoof

@micahlee
Copy link
Contributor Author

My initial observations on the subject:

  • The particular content for log messages comes from the Authenticator implementations.

  • The destination for these log messages is the normal Conjur server log (or DAP node log), typically accessed with docker logs <container_name>.

@codeclimate
Copy link

codeclimate bot commented Feb 28, 2020

Code Climate has analyzed commit b15d2c3 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 100.0% (50% is the threshold).

This pull request will bring the total coverage in the repository to 88.5% (0.0% change).

View more on Code Climate.

@micahlee micahlee self-assigned this Feb 28, 2020
@izgeri
Copy link
Contributor

izgeri commented Feb 28, 2020

@micahlee can you provide some examples of the kinds of log messages you might see once this change is merged in? for example, in the authenticator troubleshooting we have identified a few really common problems - I'm curious whether there would be additional log output visible when people encounter these problems, which might make troubleshooting them (without changing the default log level to debug) easier

@orenbm
Copy link
Member

orenbm commented Mar 3, 2020

The particular content for log messages comes from the Authenticator implementations.

@micahlee I'm afraid that's not entirely correct. In the JWT decoder used in authn-oidc and authn-azure we have the following errors:

        TokenDecodeFailed = ::Util::TrackableErrorClass.new(
          msg: "Token decode failed (3rdPartyError ='{0}')",
          code: "CONJ00035E"
        )

        TokenVerificationFailed = ::Util::TrackableErrorClass.new(
          msg: "Token verification failed (3rdPartyError ='{0}')",
          code: "CONJ00015E"
        )

So we also log error messages that come from outside of our code.

@orenbm
Copy link
Member

orenbm commented Mar 10, 2020

@micahlee I'm afraid that's not entirely correct. In the JWT decoder used in authn-oidc and authn-azure we have the following errors:

    TokenDecodeFailed = ::Util::TrackableErrorClass.new(
      msg: "Token decode failed (3rdPartyError ='{0}')",
      code: "CONJ00035E"
    )

    TokenVerificationFailed = ::Util::TrackableErrorClass.new(
      msg: "Token verification failed (3rdPartyError ='{0}')",
      code: "CONJ00015E"
    )

So we also log error messages that come from outside of our code.

However, I looked into the errors that are raised by the 3rd party and it doesn't include any sensitive data.

@micahlee @shaharglazner ^

@andytinkham
Copy link
Contributor

From a security standpoint, here's what I'd like to see us do for this change. We can make a list of the errors we know could be thrown. It's likely we won't get all of them, but we can get many, including all the ones we control. Review those errors individually and determine the sensitivity of the information in the error. This could be helped by making tests that address our handling of each error (which would be good to have anyhow), particularly if we made all our sensitive information conform to a pattern and then grepped the logs for that pattern after all our test runs.

More broadly, it would be really helpful generally to build out a tiered risk model of all our sensitive info - what do we have that is sensitive? Given a set of bad actor personas (e.g., disgruntled internal attacker who already has some access, external attacker, etc.), what could they do with each piece of sensitive data if they had access to it. I suspect we'd come up with something like secret values are the highest risk, then a tier that has less severe consequences and a tier with even less severe consquences and so on. This would give us a useful risk model that would be reusable any time we were touching data.

@orenbm
Copy link
Member

orenbm commented Mar 31, 2020

thanks @andytinkham !
I think this effort is a good opportunity to get all of the logs and errors that we write into the centralized files for errors and log messages. this will make it easier for us to know what messages, and data, we write to our logs.

@micahlee
Copy link
Contributor Author

Thanks @andytinkham, I'm closing this PR for now until we can address your suggestions under a new issue: #1464

@micahlee micahlee closed this Mar 31, 2020
@micahlee micahlee deleted the 1377-authn-log-level branch March 11, 2021 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants