
Full user/login management #260

Closed · begriffs opened this issue Aug 10, 2015 · 74 comments
@begriffs (Member)

Also we should probably think through all the things we want to be able to do for auth. Get a nice thorough list up front and then make issues for each item. The auth implemented thus far was just a proof of concept and needs to be improved for real use.

Here's what I'm imagining so far:

Please reply with your ideas.

@ireverent

We actually moved forward with auth. We solved some of the problems using a proxy and special authentication rules on /postgrest/ URLs. We currently have an admin user on Postgres and a trigger on insert into postgrest.auth that prevents that admin user from being a role in the auth table (this is to prevent escalation by granting a postgrest.auth user admin). Instead we create roles in the database and then manually grant them the admin role. We then do checks against this to limit access and control triggers when performing backup and restore functionality. We are currently trying to figure out a good change-password solution; for now I had to add PUT on /postgrest/users/. I am going to work on it some more to allow better updates before trying to merge it back into this project.

@diogob (Contributor) commented Sep 1, 2015

I'm using the auth system to authenticate users from an already existing system. The users are in a table used by my application and I just sync PostgREST users with triggers (including password changes).

But I like the proposed feature set; it would give a good solution for people starting user management from scratch.

@ireverent

On the password reset and email, I think this is something that can be handled by a table dedicated to email plus some kind of trigger that uses SMTP to send the messages (we currently do this with plperl for notification emails). As for a token to compute for a password reset, do you think we might be able to do some kind of hash of the user information? Then whatever consumes it just sends in the hash and an updated password anonymously. The only problem with this is there would be no time limit on the password reset. The other option would be to make another postgres schema table for password manipulation, with a flag in the startup script to handle the timeout. This could also let you make a one-time hash that can be checked against for password reset.

@diogob (Contributor) commented Sep 4, 2015

The problem with email handling is that the lines between an API server and an application server start to blur, and we open the door to an entire new universe of bugs and QOS issues. Maybe this is better left to an application layer, keeping only the password-reset functionality in PostgREST.

I could say the same about "enforce password reset on next login", but in that case it is not a clear-cut decision.

@ireverent

Makes sense. Should we just create an API endpoint under the postgrest schema that lets one initiate a password-reset request, stores the request plus some kind of token in a table, and then returns the token? The application then handles sending the email and, on return, hits the endpoint to update the password using the token we generated.

@ssoriche commented Sep 4, 2015

I like this idea; it's almost what I'm doing now to verify users' email addresses when they POST to /postgrest/auth.

@diogob (Contributor) commented Sep 4, 2015

👍

@jargv commented Sep 13, 2015

I understand that user management is still a work in progress, but I was surprised to learn that I can't do the following:

POST /postgrest/users
{
  "id": "jdoe",
  "pass": "supersecret"
}

In other words I left rolname out of the data. My plan was to grant the anonymous user insert privileges on postgrest.auth on the id and pass columns only. The default for the rolname column would then be the regular 'user' role and I'd have open enrollment. The 'admin' role for my app would then be granted update on the rolname, of course, to further manage roles. This seems much more straightforward than using triggers to do the same.

The problem is that PostgREST appears to be trying to insert into all three columns even though the JSON document does not include the rolname column. This fails because the anonymous user doesn't have insert privileges on the rolname column.

@jargv commented Sep 13, 2015

I'd like to present a different approach that I think would be much simpler.

Instead of having a separate API for auth, I would like to be able to tell PostgREST (via a command-line flag) which table it should use as the auth table: postgrest --auth-table users ... Other than PostgREST using that table to authenticate and map users to roles, it would be no different from any other table to PostgREST. Most of the features listed above could be implemented like all other permissions.

For example, if you want open enrollment, you grant insert on your auth table to the anonymous role. Of course you don't want users to be able to sign up as admins, so you don't grant insert on the rolname column. In this case PostgreSQL will use the default for that column, so all enrollments begin as your default 'user'. Admins could then be granted update privileges on the rolname column, to change users' roles as needed.

The clear downside is that if you don't secure your auth table correctly, you've got a REST API straight to your users' passwords! So PostgREST should refuse to serve the 'pass' column of whichever table is being used as the auth table.
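For illustration, the column-level grants described in this proposal might look like the following sketch. The table layout postgrest.auth(id, pass, rolname) and the role names anonymous and admin are assumptions, not part of any shipped PostgREST schema:

```sql
-- Hypothetical open-enrollment grants on a postgrest.auth(id, pass, rolname) table.

-- New signups fall back to the low-privilege default role:
alter table postgrest.auth alter column rolname set default 'user';

-- Anonymous visitors may enroll but cannot choose a role:
grant insert (id, pass) on postgrest.auth to anonymous;

-- Admins may promote users by updating rolname:
grant update (rolname) on postgrest.auth to admin;
```

An insert that omits rolname then succeeds for anonymous, while an insert that names a role is rejected by the column-level grant.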

@diogob (Contributor) commented Sep 13, 2015

👍 for using standard PostgREST calls to manipulate the users table.
I like the method mentioned in PR #281; it is very close to the one above but without the need for a command-line switch.

@ireverent

OK. So then does escalation of permissions have to happen on the Postgres side? For example, if I wanted to make an admin login to do certain tasks, would I just create a normal user (or one of many predefined ones), escalate it myself through psql, and then use triggers on the auth table to prevent anyone from inserting that role anonymously?

@jargv commented Sep 13, 2015

@ireverent That's a good point: you'd have to bootstrap the system using a mechanism other than PostgREST. You have to have a "first" admin user to escalate the others. I've seen it done in migration scripts: if there are zero users in the users table, insert a default first admin with a known username and password. I'd love to hear of other good approaches, if you know of any.

@diogob I like that method better, as well. I hadn't seen it. Either way, I think treating the auth table the same as any other gets lots of auth features for free.
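The "zero users, insert a default first admin" migration mentioned above might be sketched like this. It assumes the pgcrypto extension and the postgrest.auth table discussed earlier; the identifiers and the placeholder password are illustrative only:

```sql
-- Hypothetical migration step: seed a first admin only when no users exist yet.
insert into postgrest.auth (id, rolname, pass)
select 'admin', 'admin_role', crypt('change-me-now', gen_salt('bf'))
where not exists (select 1 from postgrest.auth);
```

The seeded password should of course be rotated immediately after the first login.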

@ireverent

We could just allow the bootstrapping of the first user on startup of the postgrest binary. Basically, have auth-admin and auth-password flags on startup, so that we avoid having to deal with a default password or having to insert a hashed password string through psql.

@begriffs (Member, Author)

These ideas about bootstrapping a user and selectively exposing the auth table have given me an idea. Say we make a view like this:

create or replace view "1".auth as
select pgrst_auth.id as id, pgrst_auth.rolname as rolname, '***'::text as pass
from postgrest.auth as pgrst_auth,
     (SELECT rolname
        FROM pg_authid
       WHERE pg_has_role(current_user, oid, 'member')
     ) as member_of
where pgrst_auth.rolname = member_of.rolname;

Then we remove external access to postgrest.auth and expose this view instead. A GET request to this view shows only those roles that the current user is able to set role into. A basic low privilege user would see only themselves. An admin with sufficient privileges would see everybody.

Next define INSTEAD OF triggers for insert and update. Set the pass field to crypt(new.pass, gen_salt('bf')), which uses bcrypt. Also check that the chosen role is one that the current user is able to SET ROLE into, so that someone cannot create new accounts to escalate their privileges.

This, combined with a command-line option to set credentials for a bootstrapped first user, may give us all the capabilities we need for basic user creation and updating without special-case PostgREST auth endpoint code.

One drawback is that if an attacker compromises the db server, they could enable SQL logging and see passwords in plain text as people log in. I'm not sure there is a way around this, since the initial INSERT that PostgREST sends would contain the plaintext password rather than an already-bcrypted string as it does now.
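A hedged sketch of the INSTEAD OF INSERT trigger described above might look like this. It assumes pgcrypto is installed; the function and trigger names are made up for illustration:

```sql
create or replace function "1".auth_insert() returns trigger as $$
begin
  -- refuse roles the current user cannot already assume,
  -- so accounts cannot be created to escalate privileges
  if not pg_has_role(current_user, new.rolname::name, 'member') then
    raise exception 'cannot create account with role %', new.rolname;
  end if;
  insert into postgrest.auth (id, rolname, pass)
  values (new.id, new.rolname, crypt(new.pass, gen_salt('bf')));  -- bcrypt hash
  return new;
end;
$$ language plpgsql security definer;

create trigger auth_insert instead of insert on "1".auth
  for each row execute procedure "1".auth_insert();
```

An analogous UPDATE trigger would re-hash the password only when it changes and perform the same role-membership check.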

@ireverent

@begriffs Isn't the same true for the PostgREST server too? For example, if I compromise the server and set up some kind of network-sniffing log, don't I get all the passwords unencrypted, since PostgREST doesn't support SSL directly?

@diogob (Contributor) commented Sep 14, 2015

@begriffs this drawback does not seem to be a big deal. If an attacker can enable command logging, they have already compromised a superuser. It just further focuses security around the PostgreSQL mechanisms (for people using a managed database server this should not be a problem).

@ireverent PostgREST supports SSL since it uses libpq (through hasql). I've used it to connect directly to RDS instances. Also, if the server runs on the same physical machine, people can use Unix sockets, which have a very simple, file-based security model. But in the case of an unencrypted TCP connection this is a real vulnerability.

@ireverent

@diogob Does it support HTTPS traffic at the endpoints? I wasn't referring to the Postgres connections, since PostgREST does support SSL there, but to the TCP traffic coming into the PostgREST server.

@diogob (Contributor) commented Sep 14, 2015

@ireverent it accepts HTTPS on the endpoints as well

@ireverent

Hmm how the heck did I miss that. OK

@diogob (Contributor) commented Sep 14, 2015

My bad, @ireverent. I'm using the secure connection behind Heroku's nginx, so a direct HTTPS connection would not work. This does open another vulnerability.

@ireverent

@diogob Np. We are doing the same thing with our web servers. I think it still raises the issue that HTTP connections between the proxy and PostgREST are sniffable. As long as that is true, I don't think escalation to enable statement logging on the Postgres server is our biggest worry.

@begriffs (Member, Author)

PostgREST used to support SSL, actually: 2f584be

If I recall correctly, it was troublesome to deploy to Heroku because I would have had to purchase a certificate for use on a Heroku subdomain. So I added an option to disable SSL (to let Heroku's proxy do it). Then I finally removed the SSL code entirely because of the steps needed to accept the cert locally for local testing. http://www.robpeck.com/2010/10/google-chrome-mac-os-x-and-self-signed-ssl-certificates/

Maybe the best option would be to make a --secure flag do real SSL, but make that flag optional. Also, we could move this conversation to a new GitHub issue.

@ireverent

Awesome. I would like to see PostgREST with native ssl support. I do also agree that this should be moved to a different issue as it has strayed from the original user management topic.

@ruslantalpa (Contributor)

I am not sure I fully understood @begriffs's post above, but here are my ideas (related to my specific case, where you have companies and users/employees in them).
If PostgREST could use any table for auth (table name and field names provided in configs), then I would imagine something like this:

(table) all_users (id, username, pass, company_id, role) used for auth only; no one has access to it besides PostgREST
(table) all_companies (id, name) no access
(view) companies (select * from all_companies where id = ) (select only)
(view) users (select * from all_users where id = ) (select/insert/update/delete for admin)
(view) signup (select * from all_users) only insert privilege, and the view has a trigger to first insert the company and then use that id to set the company_id field

PS1: this is probably possible right now by having triggers that sync the all_users table and the pg_auth table
PS2: @begriffs maybe what's missing in your case is a "signup view" that would eliminate the need for the "bootstrapped first user"
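The signup view's trigger in that sketch might be written roughly like this. It is a guess at the intent, and it assumes the view additionally exposes a company_name column for the company being created:

```sql
create or replace function signup_insert() returns trigger as $$
declare
  new_company_id int;
begin
  -- first create the company, then the admin user pointing at it
  insert into all_companies (name)
  values (new.company_name)
  returning id into new_company_id;

  insert into all_users (username, pass, company_id, role)
  values (new.username, crypt(new.pass, gen_salt('bf')), new_company_id, 'admin');
  return new;
end;
$$ language plpgsql security definer;

create trigger signup_insert instead of insert on signup
  for each row execute procedure signup_insert();
```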

@ruslantalpa (Contributor)

My 2 cents on SSL.
SSL (and gzip) can be done by the reverse proxy (nginx). In a live setup I would place nginx and PostgREST on the same box, so there is no sniffing (the whole server has to be compromised).
Even if PostgREST and the proxy are not on the same box, they would still be colocated, and the only one with the ability to sniff the traffic would be the provider (amazon ... nsa :)).
So I would not worry if the traffic between the proxy and PostgREST is unencrypted (but again, the best way is to have them on the same server).
The downside of having SSL support is that it pulls in a large codebase (even if it's only a few lines to integrate) that makes PostgREST bigger and slower.
I would vote for implementing features that can't be implemented in front of PostgREST (nginx) or behind it (PostgreSQL).

Maybe I am not seeing all the angles, so if SSL is implemented some day, it should be possible to disable it at compile time so no SSL code/lib is linked.

@ireverent

So let's say that we wanted to spin up multiple PostgREST instances to handle more traffic and put them behind a load balancer. Would nginx have to be spun up on each instance of PostgREST?

@ruslantalpa (Contributor)

Yes.
If you have LB -> nginx -> PostgREST, each stage adds network overhead/traffic. Having them on the same box eliminates that, and nginx CPU usage would not be much. I would go one step further and add on the same box a replicated SQL node from which you can only read, but that would be once PostgREST can split its queries into reads and writes and send them to different machines, and of course when the database in question is not huge and can fit on one node. The possible downside is that new machines could not be spun up in seconds but rather minutes, because of the SQL sync.

@ruslantalpa (Contributor)

But the upside is that the database is not such a big bottleneck now :)

@diogob (Contributor) commented Oct 5, 2015

👍 @ruslantalpa. If one has data sensitive enough that sniffing by the provider may be a problem, setting up an nginx + PostgREST box should not be a problem either.
Regarding the feature of distributing read queries, this could also be implemented with external software (like pgpool).

@ruslantalpa (Contributor)

@diogob I didn't know about pgpool; that is very cool.
If you get to a level where Amazon sniffing your data is a valid concern, then I guess you either use something built in-house or a modified version of PostgreSQL (with SSL and everything).
Don't get me wrong, I am not against supporting SSL. I am just suggesting that it should be possible to remove it at compile time and, more importantly, that we use the time/resources to develop features that can't be implemented anywhere else in the stack besides code in PostgREST.

@begriffs (Member, Author) commented Oct 8, 2015

That's right. The dba could create a general user role that sets up the right table-level security. New users inherit from this role, but each gets their own randomly named role for use in row-level security.

Specifying the inherits key when creating a user would have that effect. A client can also leave inherits blank and specify the rolname key directly.

@ireverent

In the GitHub example, would there have to be some storage to link a GitHub login to a role?

@ruslantalpa (Contributor)

So here is the scenario I want to model (think simplified Basecamp); please check whether the described user management/auth holds.
There are the following tables:

companies (id, name)
users (id, email, type, company_id, password) --type can be admin/employee
project (id, name, company_id)
users_projects(project_id, user_id)

The system needs to provide this functionality

  • signup - creates a new company and a user in that company as an admin (anonymous role)
  • each user, when logged in, can see all the projects within his company if he is an admin
  • if the user is an employee, he sees only the projects in his company that he is assigned to
  • an admin can edit a project; a regular user cannot
  • an admin can add users to the system, but only with the right company_id (employees cannot)
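For what it's worth, this scenario maps fairly directly onto the row-level security policies arriving in Postgres 9.5 (mentioned further down in this thread). A sketch under the assumption of one db role per user, with users.email equal to current_user:

```sql
alter table project enable row level security;

-- admins see (and may edit) every project in their own company
create policy admin_projects on project for all using (
  company_id = (select company_id from users
                 where email = current_user and type = 'admin')
);

-- employees see only the projects in their company they are assigned to
create policy employee_projects on project for select using (
  exists (select 1
            from users u
            join users_projects up on up.user_id = u.id
           where u.email = current_user
             and up.project_id = project.id)
);
```

Policies combine with OR, so a row is visible if either policy matches; the "admins can add users with the right company_id" rule would be a similar WITH CHECK policy on the users table.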

@ruslantalpa (Contributor)

A relevant conversation about having authorisation done by the db (i.e. each user in your system has a db role)
http://dba.stackexchange.com/questions/25357/choice-of-authentication-approach-for-financial-app-on-postgresql

@ruslantalpa (Contributor)

I think I still vote for having nothing to do with authentication in PostgREST. It's true that it provides an install-and-execute fully functioning system, but I think it will be useful only in the early stages of a project, and most of the time it will be abandoned. This is even more obvious for people trying to migrate their systems; they all have a "users" table.
A JWT containing a role + custom keys seems to provide the flexibility necessary for most situations.
The fact that you can't just download and execute PostgREST can be mitigated by providing sample authenticators done at the proxy level, or at the db level with /rpc.

@calebmer (Contributor) commented Oct 9, 2015

@ruslantalpa pretty much every developer will be implementing some form of authentication; for the majority of developers that implementation is just uid-password. In my mind we could fragment the community into using x different approaches to solve the same problem, or we could provide production-ready authentication and solve a lot of problems in the first place. Also remember the PostgREST audience: they use our project for its opinions. They don't want to do research on the best method; they trust that to us, and we should not disappoint.

Going back to my auth idea. It can live transparently alongside the flexible proposed JWT auth. All the tables in my proposal do is act as a boolean comparison for any client asking for a JWT. And if a developer doesn't want PostgREST auth, they don't provide an auth schema. I'll wait a bit for #311 to mature, then implement my ideas on top of it, as code is always the best discussion point.

@ruslantalpa (Contributor)

@begriffs if you plan to implement auth in PostgREST, then "run postgrest with a flag to initialize the auth tables" should not be done in PostgREST itself but with a bash script + SQL file.

@begriffs (Member, Author)

Ready for the next installment of a begriffs-wall-of-bulletpoints™? 😄

  • Cool that the Stack Exchange discussion uses the same technique as PostgREST for separating authentication and authorization
  • As @ireverent pointed out, even external auth (e.g. Github) needs a place to store its mapping of user identifiers to db roles, so the auth table is useful there too
    • This favors including it in addition to the open-ended possibilities of JWT
    • In addition to @calebmer's point about providing a reference implementation to make people's lives easier
  • Rather than bootstrapping the auth table in the binary, we can distribute a bash/sql script, as suggested.
    • Pro: easier to inspect/modify what it does
    • Con: the binary does rely on specific settings, like the schema and name of the auth table for basic auth, which can more easily drift if the auth scaffolding exists as separate docs
    • Con: more files needed in the postgrest distribution beyond the binary...maybe?
  • Remove the inherits shortcut from the auth table
    • It leaves a residue in the table of what was essentially an action
    • It provides only partial control of role membership, we need more
  • Move control of db roles and memberships to new endpoints
    • One endpoint to create/delete roles, and another to set/update/remove memberships between them

    • That way it is done more thoroughly and is not tied to the auth table

    • The auth table has exactly one function: associating some kind of user id (and optional password) with a real db role

    • The role is the primary key, and also a foreign key to the real role table

    • These endpoints can be supported with no changes to the Haskell code.

    • Roles:

      -- list roles to which the current user has access
      create or replace view "1".roles as
      select rolname as role
        from pg_roles
      where pg_has_role(current_user, oid, 'member');
    • Role memberships:

      -- list which roles are members of which others
      -- but only show roles to which the current user has access
      create or replace view "1".role_memberships as
      select a.rolname as role, b.rolname as has_role
        from pg_roles as a, pg_roles as b
      where pg_has_role(a.rolname,    b.oid, 'member')
       and pg_has_role(current_user, a.oid, 'member')
       and a.rolname <> b.rolname;

      (Although this may need to become a recursive query to show transitive membership, hmm)

    • To update those roles we make an insert/update trigger on "1".role_memberships

      Pseudocode
      
      fail if new row references roles of which the current_user is not a member
      if updating an existing row
        revoke previous membership
      grant membership between NEW roles
      
    • Also create a trigger for deletion

  • We could establish a convention that NULL password means the auth table user is not allowed to log in with the built in basic auth. Useful for when the auth table is used for external auth bookkeeping
  • A disadvantage of splitting the endpoints into auth, roles and role_memberships is that creating a new login requires three http calls
    • POST /roles to get a new actual db role
    • POST /role_membership to make it inherit from a more general role like web_user or whatever one you use
    • POST /auth with the role and desired username+pw
  • Earlier (and as @sthzg mentioned in chat) we thought perhaps having a real db role per user would tie up the db connection pool and stop the server from scaling.
    • Actually I don't think using a real db role per user makes things any worse.
    • Right now each request to the server grabs a db pool connection. Pool connections aren't shared between requests even when the requests are operating with the same role permissions.
    • I'm not sure what it would mean to share a connection; would it mean interleaving statements? Sounds hard. As it is, each request is quick and stateless.
  • @ruslantalpa's auth scenario is superb
    • Can we think of a way to implement it using only a tree of roles and row-level security?
    • Postgres 9.5 beta 1 was released yesterday! We can try row level security for real now.
    • We would use table policies
    • We may want an employees table to store which user belongs to which company (referencing actual db user roles) to avoid the user_vars discussed in chat.
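The membership-trigger pseudocode above could be fleshed out roughly like this. A sketch only; the function and trigger names are made up:

```sql
create or replace function "1".role_membership_write() returns trigger as $$
begin
  -- fail if the new row references roles of which current_user is not a member
  if not pg_has_role(current_user, new.role, 'member')
     or not pg_has_role(current_user, new.has_role, 'member') then
    raise exception 'current user lacks membership in % or %',
      new.role, new.has_role;
  end if;
  -- if updating an existing row, revoke the previous membership
  if tg_op = 'UPDATE' then
    execute format('revoke %I from %I', old.has_role, old.role);
  end if;
  -- grant membership between the NEW roles
  execute format('grant %I to %I', new.has_role, new.role);
  return new;
end;
$$ language plpgsql;

create trigger role_memberships_write
  instead of insert or update on "1".role_memberships
  for each row execute procedure "1".role_membership_write();
```

A deletion trigger would mirror this with a single revoke of OLD.has_role from OLD.role.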

@ireverent

I think that we can reduce one of the REST calls for a new user by having any insert of a new userid create a role at that time via triggers. Then you can use membership to escalate privileges if needed. You keep the endpoints for role and role membership for admin purposes.

@begriffs (Member, Author)

Good idea.

@ruslantalpa (Contributor)

Exposing role creation and management over REST seems like a very bad idea; there are lots of ways this can go wrong. Triggers should hardcode user role creation and membership.

@ruslantalpa (Contributor)

Remember that roles are a global entity in PostgreSQL, and an admin of one company should not see/manage roles for another company; I don't think this restriction can be accomplished.

@calebmer (Contributor)

-1 to one role per user.

You thought that begriffs-wall-of-bulletpoints™ was exciting? Well, prepare for calebmers-long-reply-of-doom (the jury is out on the ™ for that one 😉)

I haven't been convinced that one role per user is a good idea, and if it were implemented I might stop using PostgREST (that's extreme; maybe I'd just do my own auth implementation). Here's why: as @ruslantalpa mentioned, (1) roles are a global construct. This gets especially weird on a local machine when you want to replicate multiple production environments. That's a huge footprint for an app which otherwise has no footprint at all; I've chosen PostgREST in the past because it can be invisible. (2) I highly doubt the role system was ever designed to support 10,000+ users. Let's say PostgREST is powering Facebook (picked Facebook at random, not because of #310 ☺️): would Postgres even be able to handle that many roles? Imagine all the touch points of roles/grants in a single SQL SELECT query: check for usage rights, check for select rights, check for row-level powers, and check the entire inheritance tree for these things too. (3) The user (in theory?) can log into Postgres and run queries. This fear might be unsubstantiated (and with NOLOGIN maybe irrelevant), but I feel it should at least be mentioned.

Here is what I need to be convinced one role per user is a good idea:

  • Show that roles don't have to be global (preferably even local to a schema)
  • Provide a production example where this strategy operates successfully

We should be able to get all the benefits of a role tree without creating a new role for every user, which has all sorts of unforeseen consequences.

@ireverent

@calebmer Great, well-laid-out post. I have some counterpoints, though, in the order the points were raised. On the global nature of roles: I am not sure it should be up to PostgREST to be built with isolation of databases from each other. In a production environment where multiple DBs are found, you run into similar issues namespacing roles. PostgREST's concern should be with models of production environments; isolation should be a concern of the dev/prod environment/stack (insert Docker FTW). Next, as for the role system's capacity: since roles and related constructs in Postgres are handled through table references, I do not share the concern that 10,000+ roles in pg_authid would be an issue, any more than 10,000 records in any other table with a number of foreign-key constraints would be. I think the main performance bottleneck would be role creation by the thousands (actual Facebook scale), but that would require quite a different solution anyway. The third point you answered yourself: NOLOGIN should be used with all the roles we create, whether we map multiple users to one role or use a role per user.

As for your requirements for accepting roles as a good idea:
we can't show you that roles don't have global reach, because they do; that is very much a function of roles in Postgres. I can tell you it can be mitigated two ways: one, GRANTs; two, isolation through a VM or Docker.

I agree that we should mock this up. I think that will allow for the performance testing we want, so that there are few or no assumptions on this front.

I think that since roles are well documented and expressive of what is actually happening, it should be easy to reason about the consequences. Thanks again, and please let me know where/if I step in it.

@ruslantalpa (Contributor)

This whole discussion about role-per-user or not becomes moot if PostgREST chooses not to do auth at all.
The main job of PostgREST is to turn /endpoint into SELECT * FROM ... and to provide flexibility in that area; everything else is inessential (personal opinion). At most, auth should be a compile-time plugin/addon (though I would favour it being a sample Vagrant-based project to showcase how nginx/postgrest/postgresql can be put together to provide a fully featured solution with auth and all).

@begriffs (Member, Author)

I have to admit it sounds attractive to leave the complexity out. What I don't yet see is the full plan for how to set up basic auth through nginx and have it create the JWT for PostgREST.

My long comments above are my best guess about how to do things with the built-in table method; can you give a more detailed outline of your nginx idea?

@ruslantalpa (Contributor)

There isn't even much to it; it's very basic.
Nginx will always be in front of PostgREST, so it can be used to hardcode a special endpoint like /login, handled by a 10-line php/lua/whatever script that receives user/pass, checks it against a table (specific to that example deployment), decides on a db role (be that one per user or not), encodes it as a JWT, and responds with that. PostgREST handles all other requests (which are proxied through) and executes queries under the db role inside the JWT; it does not care about the users table, and it does not care whether that role is per user or not.

Another sample implementation could be one where login is done on the db side, i.e. the db structure exposes an /rpc/login endpoint that can be accessed with the anonymous role. You send user/pass there and, just like the script in nginx, it checks them, decides on a role, encodes it, and sends it back.

The main thing here is that these are just sample deployments/suggestions that are not baked into the binary itself; you can have a bunch of them without committing to one specific path.

You could also have an example where login is done through a third party like fb/ggl/gh.

@calebmer (Contributor)

@ruslantalpa I think I've come around to agreeing with you that adding auth as a compile step (with pretty much only #311 in core for auth) is a good idea. However, we still should provide some auth.

So what does everyone think about distributing two binaries: one with a full auth solution, and another with just JWT support?

Doing this also allows us to build a super awesome, opinionated, best-practice-based auth system, as we don't have to worry about extensibility.

@begriffs (Member, Author)

Good old issue 260...

We've reached the conclusion to stick with JWT and drop the rest. Closing.

@jargv commented Nov 6, 2015

@begriffs, does that mean PostgREST will only consume JWT tokens and not produce them?

@begriffs (Member, Author) commented Nov 6, 2015

It does produce them, actually. If a SQL stored procedure is declared to return a datatype of jwt_claims, then PostgREST will turn the result into a JWT. Here is an example in a pull request I'm working on right now: https://github.com/begriffs/postgrest/pull/347/files#diff-83e3865a7f0b381a48506db2102cad4bR187
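For illustration only (the exact shape lives in the linked PR; the type's fields and the postgrest.auth table here are assumptions), such a procedure might look like:

```sql
-- a composite return type named jwt_claims signals PostgREST to sign the result
create type jwt_claims as (role text, user_id text);

create or replace function login(id text, pass text) returns jwt_claims as $$
  select a.rolname::text, a.id
    from postgrest.auth a
   where a.id = login.id
     and a.pass = crypt(login.pass, a.pass);  -- bcrypt check via pgcrypto
$$ language sql;
```

A POST to /rpc/login with {"id": ..., "pass": ...} would then come back as a signed token rather than a plain row.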
