Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbtoyaml - analyze view columns data types #184

Closed
excavador opened this issue Feb 21, 2018 · 14 comments
Closed

dbtoyaml - analyze view columns data types #184

excavador opened this issue Feb 21, 2018 · 14 comments
Labels
Milestone

Comments

@excavador
Copy link

excavador commented Feb 21, 2018

Would be very nice to see the view columns data types.

Example:

create table foo(
    a int unique
);

create table bar(
    b int primary key references foo(a)
);

create or replace view view_foo_as_bar as
select (row(foo.*)::bar).*, foo, 5 as const from foo;

actual output:

  view view_foo_as_bar:
    definition: " SELECT (ROW(foo.a)::bar).b AS b,\n    foo.*::foo AS foo,\n    5\
      \ AS const\n   FROM foo;"
    depends_on:
    - table foo
    owner: postgres

desired output:

  view view_foo_as_bar:
     columns:
       - b:
            type: integer
            references:
              schema: public
              table: foo
              column: a
        - foo: 
             type: public.foo
             references:
               schema: public
               table: foo
         - const:
             type: integer
    definition: " SELECT (ROW(foo.a)::bar).b AS b,\n    foo.*::foo AS foo,\n    5\
      \ AS const\n   FROM foo;"
    depends_on:
    - table foo
    owner: postgres

or something like that

@excavador excavador changed the title dbtoyaml - analyze view column data types dbtoyaml - analyze view columns data types Feb 21, 2018
@excavador
Copy link
Author

I suppose my team can prepare patch for that.
I need to know, it's possible to merge it to upstream (in theory) or we should consider forking.
Thank you.

@jmafc jmafc added the dbtoyaml label Feb 21, 2018
@jmafc
Copy link
Member

jmafc commented Feb 22, 2018

My response to this is more or less the same as for #183, except that it should be easier to fetch the view column information from pg_attribute at the same time we retrieve columns for regular/partitioned/foreign tables, so there's less concern with respect to performance degradation. Of course, if we do it for views, it should also apply to materialized views, perhaps mostly by inheritance. And the depends_on is already present on views.

@jmafc
Copy link
Member

jmafc commented Feb 22, 2018

BTW, on my system (PG 9.6.7), the third column of the view is named ?column? when you use "\d view_foo_as_bar" in psql. Did you redefine it as AS const before running dbtoyaml?

@excavador
Copy link
Author

excavador commented Feb 22, 2018

@jmafc yes, sorry, copy-pasted wrong sql file.

@tbussmann
Copy link

I don't see where it would be possible to extract the information for the references sections in the desired output. The columns could be constructed of arbitrary complex expressions. The type information can be found in pg_attribute as @jmafc explained.
What imho in the current implementation is missing is the dependency on the type bar in the depends_on clause. This dependency can be found in pg_depend. Quite handy queries to examine this data can be found in the PostgreSQL wiki.
I tied something like

SELECT --*,
    pg_describe_object(classid, objid, objsubid) AS "depender object identity",
    pg_describe_object(refclassid, refobjid, refobjsubid) AS "referenced object identity",
    CASE deptype
        WHEN 'p' THEN 'pinned'
        WHEN 'x' THEN 'auto extension'
        WHEN 'e' THEN 'extension'
        WHEN 'i' THEN 'internal'
        WHEN 'a' THEN 'automatic'
        WHEN 'n' THEN 'normal'
    END AS "dependency type"
FROM pg_catalog.pg_depend 
WHERE objid >= 16384 AND refobjid >= 16384
ORDER BY objid, objsubid, refobjid, refobjsubid
;

@excavador
Copy link
Author

pyrseas_bug=# \d view_foo_as_bar;
View "public.view_foo_as_bar"
 Column |  Type   | Modifiers
--------+---------+-----------
 b      | integer |
 foo    | foo     |
 const  | integer |

Seems like psql in some way detect view datatypes, and it's possible to investigate it from pyrseas without analyzing the columns expessions

@excavador
Copy link
Author

excavador commented Feb 22, 2018

So, I understand your position described in #183 and in general can agree.
But in this particular case adding extra information change nothing from dbtoyaml/yamltodb/other tools point of view, but very usefull for some another purposes.

We can easy extend this output, but in result we need to fork pyrseas or make some extra queries to postgresql after dbtoyaml run.

Would be very great extract this information and include it to yaml file.
It brokes nothing, but in our particular case helps a lot.

@jmafc
Copy link
Member

jmafc commented Feb 22, 2018

Oleg, psql gets its type info from pg_attribute (you can see the query if you use psql -E). We already query pg_attribute in pyrseas/dbobject/column.py so, as I mentioned above, it would be easy to get that for views and materialized views in the same query.

@excavador
Copy link
Author

@jmafc Oh, now got it. You will include this information to dbtoyaml output.
Thank you!

@jmafc
Copy link
Member

jmafc commented Mar 15, 2018

I have started to create a ViewToMapTestCase based on your example (except using table/view names t1, t2 and v1 and column names c1 and c2. The current definition, with Py3, is output as follows:

view v1:
  definition: |2-
     SELECT (ROW(t1.c1)::public.t2).c2 AS c2,
        t1.*::public.t1 AS t1,
        5 AS const
       FROM public.t1;
  depends_on:
  - table t1

Looking at your "desired output" however, I think we should only output the data types for the columns, i.e.,

  columns:
  - c2:
      type: integer
  - t1:
      type: t1
  - const:
      type: integer

That is, we will not include those references sections (references is not a good term because in SQL it FOREIGN KEY definitions and views don't abide by any referential integrity rules).

@excavador
Copy link
Author

@jmafc yes, that's good!
Thank you!

@jmafc
Copy link
Member

jmafc commented Mar 16, 2018

@tbussmann @rhunwicks @dvarrazzo
As expected, I was able to satisfy this request by adding relkind = 'v' to the Column.query

WHERE relkind in ('c', 'r', 'f', 'p')
and changing View.to_map
view = super(View, self).to_map(db, opts.no_owner, opts.no_privs)
to properly map the view columns (call Column.to_map in a list comprehension). The ViewToMapTestCase mentioned in my previous comment now passes.
However, although the view column datatype info may be, well, informative, it seems to me that it serves no purpose in the SQL generation side of things. IOW, if we read in 'view': 'columns' in yamltodb we'll either discard it (or carry it along but it's deadweight). If we decide to continue with the implementation of this feature, we'd have to extend it to materialized views (relkind = 'm') (Roger, that's why I added you--at some point, you had a special interest in matviews), and unless its output is made optional, we'd have to change ALL view and matview tests (25), and functional tests, for very little gain IMHO.
Therefore, I'm asking for second opinions on the desirability of implementing this.

@excavador
Copy link
Author

@jmafc sometime happens, when some good solution planned for one thing unexpectedly usable for another scenario.

Actual world of web programming widely covered by ORM which generated SQL code from general-purpose language, but about zero projects with opposite direction - generate general-purpose language code from SQL.

Single exception is https://github.com/begriffs/postgrest but it's too radical - direct connection from client/browser.

dbtoyaml covers this lack, and provide type information about columns for view is very usefull.
Based on dbtoyaml output we able to generate api definition, generate server and client code, automatically generate binding to storage functions, etcetera.

@jmafc
Copy link
Member

jmafc commented Mar 17, 2018

Oleg, my point in questioning the usefulness of this feature is that although, as you say, "dbtoyaml covers this lack", that's not what dbtoyaml was designed to do. It was designed to feed the YAML definition back into yamltodb so that it could (a) verify that another database (or perhaps the same database at a later point in time) still agreed with the definition and (b) failing the former, generate SQL to alter the database to match the YAML.
These view column datatype info didn't seem to serve a purpose compatible with the (a) and (b) goals. However, reviewing open issues I (re-)discovered #90 and the view column datatype info would assist in resolving that issue. So, as far as I'm concerned the second opinions are no longer needed (but I would welcome them anyway). I will proceed with implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants