-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: reduce (eliminate?) support for sessions without a current database set #23145
Comments
Personally coming from a mysql background (where having no current database is perfectly normal), I find the postgres behavior annoying. In postgres, creating a database is a non-sql command, so there's no question of how do you connect to the database to create your first database. We have |
Of the various options listed above, none include the situation "refuse the user a connection because the db in the conn string doesn't exist yet". I'd personally like that the user always finds a database already available for them if they don't really know what they're doing (yet), this will lower the onramp. |
Understood, but the fact that we allow this is something that makes CRDB unusual. Other DBs only allow you to connect to databases that already exist. I happen to like this, but "connect to a non-existent database and create it as your first action" feels like a non-intuitive clever trick rather than something we have to ask everyone to do when they're first starting out. |
Ok, so don't you also support the idea to make a default db exist automatically? |
Of the proposals here, I think that a |
I'd say that no users should have DROP permission on it. If root wants to disallow CREATE after other databases have been set up, that would be kosher with me. |
24735: sql: sanitize the "no current db" story r=knz a=knz Fixes #23893. Fixes #23145. Updates #24598. Informs #24056. Informs #23958. Prior to this patch, CockroachdB made it excessively simple for client connections to not have any current database set. Also by design CockroachDB allows a connection to be opened with a non-existent database set as current. This in turn causes various problems related to name resolution and the `pg_catalog` schema. To understand these problems we can consider separately two groups of users, which have separate use cases: - newcomers to CockroachDB that don't have prior experience with SQL or PostgreSQL. These will not know about "databases" in any sense, and will use CockroachDB with default options. This includes using `cockroach sql` with no current database selected. These users have to be taught upfront that they need to "create a database" and "set the current database", which they may or may not (more likely not) be interested in. When these users then transition into writing apps, they will copy-paste the conn URL printed by `cockroach start` (which sets no current db) into some ORM or framework, and then will regretfully observe that their app fail in mysterious ways because the current db is not set (also see the next point). - users of existing applications or SQL drivers that have been developed to target PostgreSQL. These apps commonly contain code that: - connect to a database named "`postgres`" by default, and make this default hard(er) to customize. For example Ecto (Elixir ORM) doesn't make this configurable. (From the PostgreSQL docs: "After initialization, a database cluster will contain a database named `postgres`, which is meant as a default database for use by utilities, users and third party applications. The database server itself does not require the postgres database to exist, but many external utility programs assume it exists.") - (regardless of which db is selected as current), will issue some introspection queries using `pg_catalog` before any additional initialization is made by the higher-level app code, in particular before the app can issue `CREATE DATABASE`. Currently `pg_catalog` queries (and several other things) fail if the current database is unset or set to a non-existing database. To address this relatively large class of problems, this patch modifies CockroachDB as follows: - all clusters (existing or new) now get two empty databases on startup (created by a migration) called `defaultdb` and `postgres`. - when a client doesn't specify a current db in their conn string (i.e. in the pgwire options), the `defaultdb` database is picked up instead. This resolves all the known problems around the situations identified above. The two new databases behave like any regular database and are provided for convenience to newcomers and compatibility. (Administrators are free to drop them afterwards to restore the previous situation, if that is actively desired.) In addition, to make it slightly harder for newcomers to shoot themselves in the foot, the `database` session variable cannot be set to the empty string any more if `sql_safe_updates` is otherwise set. Three alternatives to this approach were considered, discussed with @jordanlewis and @nvanbenschoten in their quality of pg compatibility experts, and with @bdarnell in his quality of carer for newcomers, and ultimately rejected: A) generate the current db upon connection if it does not exist, make connections using the empty string as curdb use `system` instead. Rejected because we don't like to auto-create upon connection. Too easy to flood the system with bogus database names by typos in the connection string. B) Pre-populate clusters with a special anonymous database (db name = empty string). Rejected because there are too many areas in CockroachDB with a risk to fail silently or with weird error messages as a result. Also does not solve the problem of clients that want a db called `postgres`. C) Auto-create a database named after the user upon user creation (with an optional flag to skip db creation). Rejected because it does not solve the problem of clients that want a db called `postgres`. @bdarnell: creating a database per user also implicitly encourages bad habits (like sharing certs for the user whose name corresponds to the database instead of creating multiple users). D) Implement some magic in the handling of `pg_catalog`, name resolution and all other pieces that currently don't like non-existent `database` values to create the appearance of something that works. Rejected because too complex to implement successfully (there are many moving parts that need to be touched to make this work) and the resulting magic would be hard to understand by CockroachDB contributors and hard to maintain over time. E) Auto-create only one database called `defaultdb`. Rejected because it does not solve the problem of clients that want a db called `postgres`. F) Auto-create only one database called `postgres` and use that as default when no current db is specified. Rejected because not good for branding. G) Name the `defaultdb` either `def`, `default` or `default_database`. The word "`default`" was rejected because it is a SQL keyword and would cause queries using it to fail in obscure ways. The word "`def`" (similar to what MySQL uses in `information_schema`) was rejected because it refers too strongly to other uses of this word in programming languages ("define"). "`default_database`" was rejected because too verbose. Release note (sql change): new and existing clusters will now contain two empty databases called `defaultdb` and `postgres` by default. The database `defaultdb` is automatically used for clients that connect without a current database set (e.g. without a database component in the connection URL). The database `postgres` is provided for compatibility with PostgreSQL client frameworks that require it to exist when the database server has ben freshly installed. Both new databases behave like any other regular database. Release note (general change): existing clusters upgraded to this version of CockroachDB will automatically see two new empty databases named `defaultdb` and `postgres`. These were added for compatibility with PostgreSQL and to ease adoption of CockroachDB. They can be manually deleted by an administrator if the client applications are determined not to require them. Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Discussed with @nvanbenschoten and @jordanlewis while working on #21456 and #23045.
pg does not support client sessions without a current database set. The virtual schemas (especially pg_catalog) are not well-defined if there is no current database.
Also in their current implementation they can generate a large amount data if queried by a client without a current database set, putting pressure on the SQL allocation pool.
The discussion on #21456 suggests evolving CockroachDB so that client sessions more often find themselves with a current database set by default. There's a spectrum of approaches:
system
if not specified in the conn string / pgwire optionsdefault
) and encourage the user viacockroach sql
to create the database if it doesn't exist yetcc @bdarnell
The text was updated successfully, but these errors were encountered: