Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print result type information inspired by DuckDB DESCRIBE #190

Closed
apstndb opened this issue Sep 23, 2024 · 11 comments · Fixed by #191
Closed

Print result type information inspired by DuckDB DESCRIBE #190

apstndb opened this issue Sep 23, 2024 · 11 comments · Fixed by #191

Comments

@apstndb
Copy link
Collaborator

apstndb commented Sep 23, 2024

DuckDB has nice features about result set type.

Type names in table header.

D SELECT (1, 'foo')::STRUCT(i INTEGER, c VARCHAR) AS str LIMIT 0;
┌──────────────────────────────┐
│             str              │
│ struct(i integer, c varchar) │
├──────────────────────────────┤
│            0 rows            │
└──────────────────────────────┘

DESCRIBE query prints result set shape.

https://duckdb.org/docs/guides/meta/describe.html#describing-a-query

D DESCRIBE SELECT (1, 'foo')::STRUCT(i INTEGER, c VARCHAR) AS str LIMIT 0;
┌─────────────┬──────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │         column_type          │  null   │   key   │ default │  extra  │
│   varchar   │           varchar            │ varchar │ varchar │ varchar │ varchar │
├─────────────┼──────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ str         │ STRUCT(i INTEGER, c VARCHAR) │ YES     │         │         │         │
└─────────────┴──────────────────────────────┴─────────┴─────────┴─────────┴─────────┘

Are they useful and compatible in spanner-cli?

Type names in table header

I think it is very useful because it can achieve good balance of printing type informations and keep output to simple.

I had often tried to print correct type information in spanner-cli, but it was too verbose.

#104
#174

https://github.com/olekukonko/tablewriter supports multi-line header, so it can be implemented using it.

spanner> SELECT 1 AS i, "foo" AS s;
+-------+--------+
| i     | s      |
| INT64 | STRING |
+-------+--------+
| 1     | foo    |
+-------+--------+
1 rows in set (2.35 msecs)

DESCRIBE command

ResultSet.metadata.rowType will be returned even if QueryMode=PLAN.
It will be useful to check result set shape of queries without execution, especially query with query parameters.

DESCRIBE (and DESC) in spanner-cli is not documented synonym of EXPLAIN, so I think users will use EXPLAIN if meaning of DESCRIBE will be changed.

@apstndb
Copy link
Collaborator Author

apstndb commented Sep 27, 2024

I have implemented the prototype in #191

@yfuruyama
Copy link
Collaborator

Thank you for filing this. I basically agree with your proposition and have some comments.

  • Showing the type in each column by default looks too verbose. Maybe it's better to show it optionally, for example when --verbose option is used.
  • I think changing the semantic of DESCRIBE command would be safe. I guess most of users are using EXPLAIN to get the query execution plan.

@apstndb
Copy link
Collaborator Author

apstndb commented Sep 30, 2024

Thank you for response.

I think there are options to display column names and column types.

  • No column name
    • Current behavior of empty result.
  • Only column name
    • Current behavior of non-empty result.
  • Column name with simple type name like INT64, ARRAY<STRUCT>, ARRAY<PROTO>, ARRAY<ENUM>
  • Column name with verbose type name like INT64, ARRAY<STRUCT<A INT64, B STRUCT<C INT64>>>, ARRAY<google.spanner.v1.PlanNode.ChildLink>, ARRAY<google.spanner.v1.PlanNode.Kind>

I feel column name with verbose type can be too long, so it may be better to choose "column name with simple type name" even if --verbose.

  • There is no changes in no --verbose mode.
  • Column name with simple type name is printed in header when spanner-cli is --verbose mode even if empty result.
  • "Column name with verbose type name" is only displayed in DESCRIBE.

What do you think?

@yfuruyama
Copy link
Collaborator

  • There is no changes in no --verbose mode.
  • Column name with simple type name is printed in header when spanner-cli is --verbose mode even if empty result.
  • "Column name with verbose type name" is only displayed in DESCRIBE.

All sounds good. By the way, how does duckdb show the complicated column type like ARRAY<STRUCT<...>> in the result? Does it abbreviate at some level?

We also need to think about how to show the column type with vertical result mode (\G). If you have a chance, could you check how it works in duckdb?

@apstndb
Copy link
Collaborator Author

apstndb commented Sep 30, 2024

It seems that duckdb shows full type name in both of header and DESCRIBE, no matter how complex.

D SELECT str, str AS str_copy, str AS str_copy2 FROM (SELECT {foo: 'bar', list: [{key: 'hoge', value: 'piyo'}], arr:array_value({x: 123, y: 234, z: 345})} AS str) LIMIT 1;
┌──────────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│         str          │                                              str_copy                                              │                                                     str_copy2                                                      │
│ struct(foo varchar…  │ struct(foo varchar, list struct("key" varchar, "value" varchar)[], arr struct(x integer, y integ…  │ struct(foo varchar, list struct("key" varchar, "value" varchar)[], arr struct(x integer, y integer, z integer)[1]) │
├──────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ {'foo': bar, 'list…  │ {'foo': bar, 'list': [{'key': hoge, 'value': piyo}], 'arr': [{'x': 123, 'y': 234, 'z': 345}]}      │ {'foo': bar, 'list': [{'key': hoge, 'value': piyo}], 'arr': [{'x': 123, 'y': 234, 'z': 345}]}                      │
└──────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

D DESCRIBE SELECT str, str AS str_copy, str AS str_copy2 FROM (SELECT {foo: 'bar', list: [{key: 'hoge', value: 'piyo'}], arr:array_value({x: 123, y: 234, z: 345})} AS str) LIMIT 1;
┌─────────────┬────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─────────┬─────────┬─────────┬─────────┐
│ column_name │                                                    column_type                                                     │  null   │   key   │ default │  extra  │
│   varchar   │                                                      varchar                                                       │ varchar │ varchar │ varchar │ varchar │
├─────────────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼─────────┼─────────┼─────────┼─────────┤
│ str         │ STRUCT(foo VARCHAR, list STRUCT("key" VARCHAR, "value" VARCHAR)[], arr STRUCT(x INTEGER, y INTEGER, z INTEGER)[1]) │ YES     │         │         │         │
│ str_copy    │ STRUCT(foo VARCHAR, list STRUCT("key" VARCHAR, "value" VARCHAR)[], arr STRUCT(x INTEGER, y INTEGER, z INTEGER)[1]) │ YES     │         │         │         │
│ str_copy2   │ STRUCT(foo VARCHAR, list STRUCT("key" VARCHAR, "value" VARCHAR)[], arr STRUCT(x INTEGER, y INTEGER, z INTEGER)[1]) │ YES     │         │         │         │
└─────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────┴─────────┴─────────┴─────────┘

and no type name in other output mode(default is duckbox).
https://duckdb.org/docs/api/cli/output_formats.html

D .mode line
D SELECT str, str AS str_copy, str AS str_copy2 FROM (SELECT {foo: 'bar', list: [{key: 'hoge', value: 'piyo'}], arr:array_value({x: 123, y: 234, z: 345})} AS str) UNION ALL SELECT NULL, NULL, NULL;
      str = {'foo': bar, 'list': [{'key': hoge, 'value': piyo}], 'arr': [{'x': 123, 'y': 234, 'z': 345}]}
 str_copy = {'foo': bar, 'list': [{'key': hoge, 'value': piyo}], 'arr': [{'x': 123, 'y': 234, 'z': 345}]}
str_copy2 = {'foo': bar, 'list': [{'key': hoge, 'value': piyo}], 'arr': [{'x': 123, 'y': 234, 'z': 345}]}

      str = 
 str_copy = 
str_copy2 = 

One interesting thing is duckdb limits column length to fit $COLUMNS of terminal. It may also be useful feature in spanner-cli.

Real complex type example in SPANNER_SYS(I know we don't usually use this value directly.)

spanner> SELECT LATENCY_DISTRIBUTION FROM SPANNER_SYS.QUERY_STATS_TOP_MINUTE LIMIT 1;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LATENCY_DISTRIBUTION                                                                                                                                                   |
| ARRAY<STRUCT<COUNT INT64, MEAN FLOAT64, SUM_OF_SQUARED_DEVIATION FLOAT64, NUM_FINITE_BUCKETS INT64, GROWTH_FACTOR FLOAT64, SCALE FLOAT64, BUCKET_COUNTS ARRAY<INT64>>> |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

@yfuruyama
Copy link
Collaborator

Thank you for sharing various tests.

Regarding how we show the column type in the query result of --version option, I feel either is fine (showing full column type or reduced column type), but given that Spanner has PROTO type and it could have long name, I'm toward to showing the reduced column type.

One interesting thing is duckdb limits column length to fit $COLUMNS of terminal. It may also be useful feature in spanner-cli.

This might be another discussion, but I'm wondering how many users use non-default $COLUMNS variable. I checked now and I have been using default COLUMNS=80 variable inherited from the option of terminal app, which is too short compared to the window size of my laptop.

@apstndb
Copy link
Collaborator Author

apstndb commented Oct 1, 2024

This might be another discussion, but I'm wondering how many users use non-default $COLUMNS variable. I checked now and I have been using default COLUMNS=80 variable inherited from the option of terminal app, which is too short compared to the window size of my laptop.

It is strange because $COLUMNS shell variable is automatically adjuted to the width of the terminal in major environments like zsh and bash.
Anyway, I want to substitute $COLUMNS to the current width of the terminal.
(I think it is available in readline.GetScreenWidth())

@apstndb
Copy link
Collaborator Author

apstndb commented Oct 1, 2024

Regarding how we show the column type in the query result of --version option, I feel either is fine (showing full column type or reduced column type), but given that Spanner has PROTO type and it could have long name, I'm toward to showing the reduced column type.

My personal preference is to have more options for controlling the output format, but that's probably not going to be popular.

@yfuruyama
Copy link
Collaborator

Maybe we can consider adding more options once we find a popular use case.

I'll defer to you about which style to be used by default for column type: full column type, or reduced column type.

@apstndb
Copy link
Collaborator Author

apstndb commented Oct 1, 2024

Yeah, I want to choose "simple type name" as the default in headers, because:

  • Verbose type name can be longer than column value, it can be problem because there are complex types in many columns and terminal width is limited.
  • User can simply use DESCRIBE on the same query to see verbose type name of STRUCT, PROTO, ENUM.

@yfuruyama
Copy link
Collaborator

Sounds good!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants