Digest tokens and cache #344

Open
wants to merge 21 commits into
base: master

Conversation

philayres

This PR relates to the discussion in issue #300 regarding the storage of plain text authentication tokens in the database.

The default operation of the gem has not been altered. New configuration lets the developer choose to persist authentication tokens as digests rather than as plain text. The digests are generated with the default Devise password hasher (BCrypt by default).

The challenge was not the generation or comparison of the digest tokens, but preventing a horrible slow-down. BCrypt is computationally expensive by design, to prevent somebody being able to brute-force the digests. For a stateless API this is no good, since there is no session and instead there is a reauthentication on each request. For this I implemented a cache mechanism, allowing an existing in-memory cache to be used to hold the previous authentication by a user.

I think that the update I have made to the README explains things best, so I'll quote here:

Persisting Tokens

The configuration allows for tokens to be stored as either plain text or as a
digest generated by the default Devise password hashing function (typically BCrypt). This configuration is set with the item config.persist_token_as.

Plain Text

If :plain is set, the authentication_token field will hold the generated
authentication token in plain text. This is the default, and was in fact the only
option before version 1.16.0.

In plain text mode tokens are checked for uniqueness when generated, and if a token
is found not to be unique it is regenerated.

The record attribute authentication_token returns the stored value, which
continues to be plain text.
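
The regenerate-until-unique behaviour can be sketched roughly as follows. This is a minimal illustration rather than the gem's actual code: `EXISTING_TOKENS` and `generate_unique_token` are hypothetical names, and an in-memory `Set` stands in for the uniqueness query against the database.

```ruby
require "securerandom"
require "set"

# Hypothetical stand-in for a uniqueness lookup against the users table.
EXISTING_TOKENS = Set.new

# Generate tokens until one is found that is not already in use.
def generate_unique_token(existing = EXISTING_TOKENS)
  loop do
    token = SecureRandom.urlsafe_base64(24)
    # Set#add? returns nil when the value is already present, so a
    # collision simply triggers another iteration.
    return token if existing.add?(token)
  end
end
```

This check is only possible because the stored value is the token itself and can be looked up directly.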

Digest

If :digest is set, the authentication_token field will hold the digest of the
generated authentication token, along with a randomly generated salt. This has the
benefit of preventing tokens being exposed if the database or a backup is
compromised, or a DB role views the users table.

In digest mode, authentication tokens can not be realistically checked for
uniqueness, so the generation of unique tokens is not guaranteed,
even if it is highly likely.

The record attribute authentication_token returns the stored value, the digest.
In order to access the plain text token when it is initially
generated, instead read the attribute plain_authentication_token. This plain
text version is only retained in the instance after authentication_token is set,
so it should be communicated to the user for future use immediately. Plain text
tokens cannot be recreated from the digest and are not persisted in the database.
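
The digest flow can be sketched as below, under some loudly stated assumptions. PBKDF2 from Ruby's standard library is swapped in for the Devise hasher (BCrypt) purely so the example has no gem dependencies, and `TokenRecord`, `issue_token` and `token_matches?` are hypothetical names, not part of the gem.

```ruby
require "openssl"
require "securerandom"

# Assumption: PBKDF2 stands in for BCrypt here; the iteration count is
# illustrative and plays the role of the hashing cost.
ITERATIONS = 20_000

def slow_digest(token, salt)
  OpenSSL::KDF.pbkdf2_hmac(token, salt: salt, iterations: ITERATIONS,
                           length: 32, hash: "sha256")
end

# Hypothetical record: only the salt and the digest would be persisted.
TokenRecord = Struct.new(:salt, :digest)

# Returns the record to store plus the plain token, which is only
# available at this point and must be handed to the user immediately.
def issue_token
  plain = SecureRandom.urlsafe_base64(24)
  salt = SecureRandom.random_bytes(16)
  [TokenRecord.new(salt, slow_digest(plain, salt)), plain]
end

# Re-hash the presented token with the stored salt, then compare the
# two digests without short-circuiting on the first differing byte.
def token_matches?(record, presented)
  candidate = slow_digest(presented, record.salt)
  return false unless candidate.bytesize == record.digest.bytesize

  diff = 0
  candidate.bytes.zip(record.digest.bytes) { |a, b| diff |= a ^ b }
  diff.zero?
end
```

Because each record has its own random salt, two identical tokens would produce different digests, which is why a uniqueness lookup is not realistic in this mode.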

Caching Authentications with Stored Digest Tokens

BCrypt hashing is computationally expensive by design. If the configuration uses
config.sign_in_token = false then the initial sign in is performed once per
session and there will be a delay only on the initial authentication. If instead
the configuration uses config.sign_in_token = true then the email and
authentication token will be required for every request. This will lead to a slow
response on every request, since the token must be hashed every time.
For API use this is likely to lead to poor performance.

To avoid the penalty of rehashing on every request, cache_provider and
cache_connection options enable caching using an existing in-memory cache
(such as memcached). The approach is to cache the user id, the authentication token
(as an SHA2 digest), and the authentication status. On a
subsequent request, the cache is checked to see if the authentication has already
happened successfully. If the token is regenerated, the cached value is
invalidated. Comments in the file lib/simple_token_authentication/cache.rb provide
additional detail.
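
As a rough sketch of the approach described above (not the code in cache.rb), the cache could look something like this. `AuthenticationCache`, `authenticate` and the key format are hypothetical, and a plain Hash stands in for memcached or another in-memory store.

```ruby
require "digest"

# Hypothetical cache wrapper; a Hash stands in for the real cache store.
class AuthenticationCache
  def initialize(store = {})
    @store = store
  end

  # Remember that this user authenticated successfully with this token.
  # Only a SHA-256 digest of the token is held, never the token itself.
  def remember(user_id, token)
    @store[key(user_id)] = { token_sha: Digest::SHA256.hexdigest(token),
                             authenticated: true }
  end

  # True only if the same token was previously verified for this user.
  def authenticated?(user_id, token)
    entry = @store[key(user_id)]
    !entry.nil? && entry[:authenticated] &&
      entry[:token_sha] == Digest::SHA256.hexdigest(token)
  end

  # Call when the token is regenerated so the stale entry is dropped.
  def invalidate(user_id)
    @store.delete(key(user_id))
  end

  private

  def key(user_id)
    "sta-auth-#{user_id}"
  end
end

# Check the cache first and fall back to the expensive comparison,
# which the caller supplies as a block.
def authenticate(cache, user_id, token)
  return true if cache.authenticated?(user_id, token)

  verified = yield
  cache.remember(user_id, token) if verified
  verified
end
```

A caller would wrap the slow digest comparison in the block, for example `authenticate(cache, user.id, token) { slow_check(user, token) }`, where `slow_check` is again a placeholder. The fast SHA-256 comparison is acceptable here because the cache only ever holds entries for tokens that have already passed the slow check.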

The rspec example in spec/lib/simple_token_authentication/test_caching_spec.rb
compares the speed of cached and uncached authentication and shows the speed-up.
With a BCrypt hashing cost of 13 (set by Devise.stretches), authentication using
the ActiveSupport MemoryStore cache is more than 2000 times faster than
authentication without caching.

It should be noted that hashing uses the same Devise defaults as for entity
passwords (including hashing cost and the Devise secret). Currently there is no
way to configure this differently for passwords and authentication tokens.

Overall it looks like there are a lot of changes, but the reality is that most of this new stuff is around allowing the configuration of the cache in a clean way, plus some tests to show that it doesn't break things. To be honest, it was the specs that I struggled with most, trying to see if there was anywhere that the authentication is actually tested as a black box that I could extend. I may well have not followed your existing patterns perfectly here, but the tests do pass, and seem to have good coverage.

The other thing that I had to work around was the configuration. In the Rails app I used for testing, the configuration initializer kept skipping the configuration of the ORM adapters, which is the model I had tried to follow for the cache provider. I have added a small callback that attaches the cache provider after the configuration has been evaluated. Since I don't see how the Adapters get configured in the current setup, this may be an area worth discussing.

Anyway, I believe that this is a decent set of changes that meets the requirements of the discussion in the original issue. I considered separating out the changes into two PRs, one for digest persistence, and one for cache, but since they are so closely related in terms of needs it didn't make sense.

BTW I took the liberty of bumping the version (used in the gemspec) to 1.16.0, to ensure any builds reflect these changes. You should feel free to own the actual numbering. Maybe this is worth a 2.0?

I'm having to spend some time getting this incorporated into my production systems this week (with tons of extra app testing) so I'm hoping that we can keep any changes to this PR well focused. I appreciate any feedback though in making it better and more secure!

@philayres
Author

It looks like the Travis checks are failing due to an incompatibility between the test gems and the very old version of Ruby being specified. It is probably worth bringing the Travis Ruby version up to something current and supported.

@philayres
Author

@gonzalo-bulnes I have updated this PR to merge your recent changes. Is there any chance you will consider incorporating this? I believe this would be helpful for other security-conscious users of this gem. Equally, I understand if you don't have time for doing a lot with this gem. Is there anything I can do to help?

@connorshea

I would definitely appreciate this functionality as well.

@davidalee

@gonzalo-bulnes Any chance we can get this reviewed and merged? This PR addresses a pretty serious security vulnerability.

@davidalee

@philayres Have you been using your branch (digests/hashing tokens) in production? I'm considering using what you've written here and am wondering if there were any gotchas or any yellow/red flags to consider prior to using this branch.

Thanks for the work in putting together this PR!

@philayres
Author

@davidalee sorry for the delay in responding. I have only used this in one production project (and not very widely). It doesn't appear to have caused any issues with regular logins, and the tokens seem to work fine. The caveat on this is that I only have a very limited number of API users, and have not tested with thousands of users. It is possible that there is a performance issue in doing so that I haven't tested.

Feedback would be much appreciated!

@philayres
Author

@gonzalo-bulnes - I see you have a lot of outstanding PRs in this project. I understand you are busy (aren't we all!) Any chance you could let us know if you plan to incorporate any of these contributions, or whether we should all just keep on forking and making our own changes?

@gonzalo-bulnes
Owner

gonzalo-bulnes commented Apr 21, 2020

Hi @philayres, first of all, thank you for your persistence and patience.

I've been thinking quite a bit about what to do here and reconsidering the discussion here about whether to store authentication tokens in plain text. The conclusion I've reached is that at this time I'm not ready to accept a feature addition of this size and complexity.

There are a few things in play here:

  • User pool size: I think your use case is valid and I trust the implementation supports it. That comes, of course, at the cost of additional complexity for users and maintainers (as of today, me). On the user side, while the question of hashing tokens has come up a few times, I have the feeling most projects using Simple Token Authentication don't require it.

  • Maintenance & support: I am not familiar enough with any caching strategy to evaluate the choices that you make and even less to maintain or support this additional code effectively (support is a bigger thing than it may look like).

  • Scope: I have the feeling that this feature goes beyond what I understand by simple token authentication. (This may be related to the user pool size considerations above.)

Putting that aside for a moment, the ideal scenario in my mind is that as many people as possible find support in their use cases, whatever those can be, and I am convinced that the free software model provides means to achieve that if we consider it collaboratively.

I sometimes hear forking described in demeaning terms; I do not subscribe to that thinking. This code is published under a free software license with the intention of allowing it to be forked.

Now, I also believe that there is more to collaboration than throwing code over the fence. That's why, if you are interested, I am open to thinking of ways of promoting your fork to people who are looking for hashed tokens. That could:

  • give more people an opportunity to find software that fits their use case
  • allow you to get more feedback from different use cases
  • give me a chance to see all of the above (and maybe validate or review the trade-off I laid out as well) 🙂

Getting into solution-space, I think that could take the shape of a "Similar projects" section in the README, or a pinned issue that people could add their forks / other gems to. (There are a few other projects addressing different uses of token authentication these days.)

What do you think?

Owner

@gonzalo-bulnes gonzalo-bulnes left a comment


Little additional note, that I had been sitting on for some time now 🙂

Comment on lines +320 to +324
BCrypt hashing is computationally expensive by design. If the configuration uses
`config.sign_in_token = false` then the initial sign in is performed once per
session and there will be a delay only on the initial authentication. If instead
the configuration uses `config.sign_in_token = true` then the email and
authentication token will be required for every request. This will lead to a slow
Owner


If the configuration uses config.sign_in_token = false then the initial sign in is performed once [...]

Isn't it the other way around? I don't see other mentions of sign_in_token in this change set, so I assume it may be a typo? It may only be a matter of flipping the condition around.

Naming is hard, and that's true for config options as well, but the intent in the current implementation is to be read as "if the token is used as a sign in token" (config.sign_in_token = true) then sign in happens, which translates into a session being persisted. And if not (config.sign_in_token = false), then there is no sign in and credentials must be provided with every request. (corresponding tests)

@gonzalo-bulnes added the "new feature" label (This pull request adds a new feature.) Jan 2, 2021