Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support for "meta" postgresql tables? #5194

Closed
alex opened this issue Mar 12, 2016 · 19 comments
Closed

sql: support for "meta" postgresql tables? #5194

alex opened this issue Mar 12, 2016 · 19 comments
Assignees
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL
Milestone

Comments

@alex
Copy link
Contributor

alex commented Mar 12, 2016

Right now cockroach supports the PostgreSQL wire protocol (which is pretty sweet). On a lark I decided to see what happened if I pointed the django test suite against it. Right now it fails pretty early with:

  File "/Users/alex_gaynor/projects/django/django/db/migrations/recorder.py", line 52, in ensure_schema
    if self.Migration._meta.db_table in self.connection.introspection.table_names(self.connection.cursor()):
  File "/Users/alex_gaynor/projects/django/django/db/backends/base/introspection.py", line 58, in table_names
    return get_names(cursor)
  File "/Users/alex_gaynor/projects/django/django/db/backends/base/introspection.py", line 53, in get_names
    return sorted(ti.name for ti in self.get_table_list(cursor)
  File "/Users/alex_gaynor/projects/django/django/db/backends/postgresql/introspection.py", line 66, in get_table_list
    AND pg_catalog.pg_table_is_visible(c.oid)""")
  File "/Users/alex_gaynor/projects/django/django/db/backends/utils.py", line 64, in execute
    return self.cursor.execute(sql, params)
  File "/Users/alex_gaynor/projects/django/django/db/utils.py", line 94, in __exit__
    six.reraise(dj_exc_type, dj_exc_value, traceback)
  File "/Users/alex_gaynor/projects/django/django/db/backends/utils.py", line 62, in execute
    return self.cursor.execute(sql)
django.db.utils.InternalError: sql/select.go:324: TODO(pmattis): unsupported FROM:  FROM pg_catalog.pg_class AS c LEFT JOIN pg_catalog.pg_namespace AS n ON n.oid = c.relnamespace

I believe the problem is a lack of support for pg_catalog.pg_class. Is the intent to only support PostgreSQL at the wire protocol level, or also the full API surface (notably these tables).

The exact query being run is: https://github.com/django/django/blob/master/django/db/backends/postgresql/introspection.py#L56-L69

@alex
Copy link
Contributor Author

alex commented Mar 12, 2016

Actually, slightly closer inspection of sql/select.go indicates that maybe this is a far simpler parsing issue on the FROM statement.

@bdarnell
Copy link
Contributor

The immediate issue in that stack trace is not that the pg_catalog.pg_class table is missing, but that we don't yet support JOIN.

We are not planning to implement the pg_catalog tables; frameworks that rely on introspection will need adapters to use our own introspection commands. I have some work towards doing this for django in cockroachdb/sqlalchemy-cockroachdb#14, although it's not working yet (blocked on our lack of support for changing the default value of a column with ALTER TABLE, which django's migration system does when setting up a new database).

@alex
Copy link
Contributor Author

alex commented Mar 12, 2016

👍 I suppose this should be closed in favor of #2970 then.

@JackKrupansky
Copy link

Some sort of virtual table mechanism would be nice, so that all of these PG system tables could be implemented virtually via a plugin that in turn used native introspection.

That said, the real trick or fly in the ointment might be achieving near-100% compatibility with the actual values in the tables, like for entity types that simply don't exist in CockroachDB.

@bdarnell
Copy link
Contributor

I think it would be difficult to get close enough to postgresql's system tables to be useful. In my work so far with django replacing the introspection code was only a small part of what was required.

@tbg
Copy link
Member

tbg commented Mar 25, 2016

Also ruins using rust-postgres, which queries a bunch of the meta tables https://github.com/sfackler/rust-postgres/blob/master/src/lib.rs#L466

at connect time:
https://sfackler.github.io/rust-postgres/doc/v0.11.4/src/postgres/src/lib.rs.html#459

Kind of a bummer to have to fork all of these.

@petermattis petermattis changed the title Support for "meta" postgresql tables? sql/pgwire: support for "meta" postgresql tables? Mar 31, 2016
@petermattis petermattis changed the title sql/pgwire: support for "meta" postgresql tables? sql: support for "meta" postgresql tables? Mar 31, 2016
@bdarnell
Copy link
Contributor

bdarnell commented Apr 8, 2016

The SQL standard defines an information_schema database which contains this kind of information in a vendor-neutral format. Implementing that would be more palatable than the postgres-specific tables. However, I'm not sure how beneficial it would be in practice - every time this has come up the library is using the pg-specific tables and not information_schema. I wonder if that's because they need some information that is not available in information_schema or just because they predated widespread support for information_schema (or the authors weren't aware of it).

@petermattis
Copy link
Collaborator

Looks like django uses information_schema for the MySQL backend. Is the backend tied to the client protocol for these ORMs?

@bdarnell
Copy link
Contributor

bdarnell commented Apr 8, 2016

It varies. Sometimes you can swap out the network protocol independently of the rest, sometimes it's configured as one big unit (and even if the protocol were separate, using the mysql introspection code would typically pull in a lot of other mysqlisms). In the specific case of django, you can't make this change without a custom plugin (which I have in a branch in cockroach-python), and once we're going that far it's easier to just implement the necessary introspection commands with SHOW TABLES and the like instead of sharing information_schema with the mysql backend.

@alex
Copy link
Contributor Author

alex commented Apr 8, 2016

I strongly suspect the answer is that folks are unaware of
infromation_schema, I'm sure if you sent a PR to djang to use it that'd be
accepted.

On Fri, Apr 8, 2016 at 3:48 PM, Ben Darnell notifications@github.com
wrote:

It varies. Sometimes you can swap out the network protocol independently
of the rest, sometimes it's configured as one big unit (and even if the
protocol were separate, using the mysql introspection code would typically
pull in a lot of other mysqlisms). In the specific case of django, you
can't make this change without a custom plugin (which I have in a branch in
cockroach-python), and once we're going that far it's easier to just
implement the necessary introspection commands with SHOW TABLES and the
like instead of sharing information_schema with the mysql backend.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#5194 (comment)

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: D1B3 ADC0 E023 8CA6

@bdarnell
Copy link
Contributor

bdarnell commented Apr 8, 2016

It's not just a lack of awareness (although it's possible that that's the root cause and no one has had any reason to update the old code): django uses both information_schema and pg_catalog in adjacent methods. Anyway, we don't support information_schema yet either, so there's not much point in trying to promote its use at this point.

@dmedri
Copy link

dmedri commented Apr 9, 2016

Outside the technical reasons for meta-fields (and the cited standards above), from an end-user point of view there are some valid points to have queryable metadata on table.

  1. Statisticians/Data Scientists: it could be very useful have info stored about the table/dataset, info about every columns/rows, a place for notes. Is what some statistical software did (eg. SPSS) in their native end format, and these are basically all TEXT fields.
  2. Data Warehouse Admin: useful meta-fields per-row could be the timestamps about latest insert/update. Obviously, in the traditional way all these fields are a good practice, but could be an useful addon as meta-fields, and native/internally triggered.

[RFC] meta-fields on tables for end-users

per-table meta-fields:

  • meta_table_id (UUID)
  • meta_table_name
  • meta_table_title
  • meta_table_subtitle
  • meta_table_description
  • meta_table_notes

per-column meta-fields:

  • meta_column_id (UUID)
  • meta_column_name
  • meta_column_title
  • meta_column_subtitle
  • meta_column_description
  • meta_column_notes

With this setting, visible rowname could link the meta_column_rowname value, be easy to change/modify. easy to move in the columns order.

per-row meta-fields:

  • meta_last_insert
  • meta_last_update

Values should be native/internally triggered on insert/update, as option, not by default.

Every cited meta-field should be queryable (select).
Almost all of these meta-fields should be ready for insert/update.

Comments are welcome.

HTH

@petermattis
Copy link
Collaborator

@dmedri Mind filing a separate issue for your feature request? It feels sufficiently different than the original issue which is about exposing existing table metadata in a way that is compatible with ORMs.

Tables and columns have associated IDs, but those IDs are not UUIDs. The table ID and column IDs are encoded in the keys used to store the rows for table data so space efficiency is a concern. Small integer IDs are much more space efficient than UUIDs.

@JackKrupansky
Copy link

It would be helpful if people could identify some specific ORM usage examples that attention could then focus on.

If I Google "ORM example", SQLAlchemy comes up on top:
http://docs.sqlalchemy.org/en/latest/orm/examples.html

@JackKrupansky
Copy link

One question about ORMs - is the interest primarily in simply having an object-oriented interface to traditional SQL data models, or... is the primary interest in how to map non-relational, object-oriented data structures to the relational data model. Looking at the SQLAlchemy example list, it seems like the latter. If that's the true goal of ORM support, fine, but the goal should be made explicit.

As a side comment, with CockroachDB and its flexible underlying KV store, focusing too heavily on ORM for supporting graphs and other object-oriented structures may be a distraction from eventually focusing on a native object-oriented keystore data model and direct, non-SQL data access. That said, ORM support seems like a relatively easy way to get a lot of traction in the near-term.

@alex
Copy link
Contributor Author

alex commented Oct 10, 2016

Looks like pg_catalog exists, and so do JOINs, so I suspect this can be closed.

@nvanbenschoten
Copy link
Member

We're still working on adding more tables to pg_catalog. Once we support all meta tables needed by major ORMs, then I think this can be safely closed.

@alex
Copy link
Contributor Author

alex commented Oct 10, 2016

@nvanbenschoten FWIW, the next place the django tests bail is the missing function pg_catalog.pg_table_is_visible

@knz knz added the A-sql-pgcompat Semantic compatibility with PostgreSQL label Oct 27, 2016
@jordanlewis
Copy link
Member

I'm going to close this issue in favor of more targeted issues for missing meta tables and functions, as I think we've gotten the bulk of them at this point. I've opened #12538 to track the missing pg_table_is_visible function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-pgcompat Semantic compatibility with PostgreSQL
Projects
None yet
Development

No branches or pull requests

9 participants