feat: go type mapping per query argument #73

0xjac · 2022-08-18T11:40:37Z

Extends the query annotation to specify custom go types using the same
qualified go type format as for the general custom type mapping.

Closes #46

Extends the query annotation to specify custom go types using the same qualified go type format as for the general custom type mapping. Closes jschaf#46

0xjac · 2022-08-18T12:03:04Z

@jschaf This is something I need for one of my use case (see #46 for details) so I'm working on an implementation. It's missing the mapping of return columns, tests, and docs, but you can have a look at the mapping of query arguments (inputs) already if you have the time.

If you think this is something you are definitely not interested in merging, I'd appreciate if you would let me know so I don't waste too much time on it 😉

jschaf

Thank you for diving into the codebase. I'm in favor of the change so happy to keep working with you on this PR.

The main thing I'm focused on is how to present the options for the user to keep the interface as simple as possible. I've had a number of requests which essentially boil down to more configurable type mappings (#65). I think the answer is to support an intermediate unmarshal type that's responsible for converting the Postgres type into the destination Go type, so timestamptz to time.Time in the below example.

--go-type timestamptz=github.com/jschaf.TimeUnmarshaller=time.Time

I'd like the syntax to be pretty similar between the query files and command line so I'm interested in what the SQL file would look like (ignoring the intermediate idea since that's out of scope for this work).

-- name: FooBar go-type: timestamptz=time.Time :many

jschaf · 2022-08-18T19:59:10Z

internal/parser/parser.go

+ paramTypeOverrides := make(map[string]string, 4)
+ annotationPos := len(doc.List) - 1
+
+ for ; annotationPos > 0; annotationPos-- {


This is a bit messier than I'm comfortable with. Might be time to do a proper parse, something like:

split the line on whitespace

parse each token

I was taking insipiration from how the name: :kind annotation was parsed but I agree it is a bit messy and can be improved. (N.B.)I've moved the parsing of go types into it's own function to clean things up already. I just need to push it.

Can you elaborate on what you mean by proper parser? Do you want a custom built DSL and parser or do you have something in mind to be reused?
One idea would be to find the name: line, then parse everything that's afterwards in a known language like TOML (and we can enforce the structure of it). This would have the advantage to reusue a known configuration language and libs to parse it. However while it's close to the CLI it is a bit different, notably spaces around the equal are allowed and the values (i.e. go types) have to be quoted.

Which way would you like to go?

To give an example of what I mean with TOML, it would work like this:

-- name: GetDailyFlightsFromAircraft :many -- [arg] -- day = "github.com/0xjac/custom-project/types.Day" -- [return] -- departure = "time.Time" -- eta = "*time.Time" SELECT f.flight_number, f.departure, f.arrival, f.eta FROM flights f WHERE pggen.arg("day") <= f.departure AND f.departure < pggen.arg("day") + 1 ORDER BY f.departure DESC;

We could also add the name and kind as top-level keys but it should still support then current name: notation for compatibility.

jschaf · 2022-08-18T20:00:12Z

internal/ast/ast.go

+ SourceSQL string // the complete sql query as it appeared in the source file
+ PreparedSQL string // the sql query with args replaced by $1, $2, etc.
+ ParamNames []string // the name of each param in the PreparedSQL, the nth entry is the $n+1 param
+ ParamTypeOverrides map[string]string // map of Go type to override the Pg type of a param


nit: I think this can be ParamGoTypes with a doc of something like:

map of user-specified Go types to use for the arg.

Sounds good, and for the return value I would then name it accordingly:

ResultGoTypes map[string]string // map of user-specified Go type for the result columns.

0xjac · 2022-08-19T11:58:28Z

Thank you for diving into the codebase. I'm in favor of the change so happy to keep working with you on this PR.

The main thing I'm focused on is how to present the options for the user to keep the interface as simple as possible. I've had a number of requests which essentially boil down to more configurable type mappings (#65). I think the answer is to support an intermediate unmarshal type that's responsible for converting the Postgres type into the destination Go type, so timestamptz to time.Time in the below example.
--go-type timestamptz=github.com/jschaf.TimeUnmarshaller=time.Time
I'd like the syntax to be pretty similar between the query files and command line so I'm interested in what the SQL file would look like (ignoring the intermediate idea since that's out of scope for this work).
-- name: FooBar go-type: timestamptz=time.Time :many

Interesting. I can definitely adapt the parsing to be more like the command line arguments. The current format matches what is specified in #46, which in turn is inspired by the current (name: ...) annotation. Taking into account your feedback, I would propose something like:

-- name: GetDailyFlightsFromAircraft :many
-- arg: day=github.com/0xjac/custom-project/types.Day
-- return: departure=time.Time
-- return: eta=*time.Time
SELECT f.flight_number, f.departure, f.arrival, f.eta
FROM flights f
WHERE pggen.arg("day") <= f.departure AND f.departure < pggen.arg("day") + 1
ORDER BY f.departure DESC;

Few things to notice:

The left side of the equal refers to an arg name (for arg:) or a column name (for return:) not a pg type.
The name:, arg:, return: annotations should be in this order. I.e. no arg: after a return:. However the order amongst multiple arg: and return: is free.
Each type mapping should be on a separate line (for readability)
The arg/return type mapping takes precedence, but is not mandatory. If omitted the global type mapping (--go-type via CLI) is considered and then the type inference (as is already the case). Here the f.arrival column will have the go type from which ever globally mapped or inferred type there is for a TIMESTAMP.
The user can specify a go type which is incompatible with the underlying pg type. The type conversion will fail at runtime. (Similar to a bad type map provided globally).

The intermediate unmarshal type idea is interesting but indeed out of scope. Here is a few thoughts that might help tho:
Using a combination of query specific type map on the return return: (once I've implemented it) and ROW() to return a "single column" and map it to a single custom made struct which implements something like the BinaryDecoder and TextDecoder interface and contains all the unmarshalling logic, it should be possible to unmarshall into pretty much any thing with little to no work on the pggen side. And it makes the config easier: No need to provide an intermediary type. The drawback is that the user here might have to write some structs for the result manually and do the whole decoding logic by hand.

jschaf · 2022-08-23T15:54:11Z

Nice, detailed writeup!

Agreed that we probably need multi-line with this change. Couple tweaks:

For ease of parsing, I'm in favor of using a pggen prefix to identify metadata lines. Otherwise, we'd need to do something like find the last instance of --name and require all lines after it begin with pggen-approved prefix.
I think columnType reads better than return. Then for consistency, arg becomes argType.

-- pggen: name: GetDailyFlightsFromAircraft :many
-- pggen: argType: day=github.com/0xjac/custom-project/types.Day
-- pggen: columnType: departure=time.Time

I think you nailed the precedence. Enforcing order of argType and columnType is a good thing to enforce early.

I approve of the design with the naming changes (arg -> argType and return -> columnType) and pggen prefix on comment lines. Happy to debate if you feel strongly otherwise.

Sounds like you have a reasonable handle on how to approach the implementation but let me know if a high-level breakdown would help. From my end, I'll be looking closely at:

not breaking backwards compatibility
testing coverage

0xjac · 2022-08-23T16:17:07Z

@jschaf Overall it all seems good. There are just a few things to note:

For ease of parsing, I'm in favor of using a pggen prefix to identify metadata lines. Otherwise, we'd need to do something like find the last instance of --name and require all lines after it begin with pggen-approved prefix.

No problem with using a prefix, then the algorithm would be going through each line with a pggen prefix and parse them from first to last. The annotations should still be one by line and in the order: name, argType, columnType.
Just one question about it: Do we allow "comment" lines without the pggen: prefix in-between or after the lines with the prefix?
In your example you wrote:
```
-- pggen: name: GetDailyFlightsFromAircraft :many
```
But this breaks compatibility because of the new prefix. Therefore, I propose we support a name: line both with or without the pggen: prefix.
This would work as follow:
1. Find the first line with the pggen: prefix.
2. If that line is a name: line, then parse it and move on. (Any name: line above without prefix is ignored, as is currently the case.)
3. If that line is not a name: prefix, parse the previous line expecting a name : without the pggen: prefix. Then move on to parsing the current line as an argType: or columnType:
Unlike the current annotation where name: lines can be repeated and the last one is taken into account, for lines with the pggen: name: prefix, it can only appear once as the first line with the pggen: prefix.

Regarding tests, I will obviously provide tests with this PR. I'm still in the playground/specs phase now but once that is done, I'll start adding the required tests.

jschaf · 2022-08-23T16:42:11Z

Do we allow "comment" lines without the pggen: prefix in-between or after the lines with the prefix?

I think not. Easier to start strict and allow flexibility later.

But this breaks compatibility because of the new prefix. Therefore, I propose we support a name: line both with or without the pggen: prefix.

Agreed, we're aligned there.

Regarding tests, I will obviously provide tests with this PR. I'm still in the playground/specs phase now but once that is done, I'll start adding the required tests.

Sweet, sounds like a plan.

feat: go type mapping per query argument

ff474e2

Extends the query annotation to specify custom go types using the same qualified go type format as for the general custom type mapping. Closes jschaf#46

jschaf reviewed Aug 18, 2022

View reviewed changes

jschaf closed this Jan 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: go type mapping per query argument #73

feat: go type mapping per query argument #73

0xjac commented Aug 18, 2022

0xjac commented Aug 18, 2022

jschaf left a comment

jschaf Aug 18, 2022

0xjac Aug 19, 2022

0xjac Aug 19, 2022 •

edited

Loading

jschaf Aug 18, 2022

0xjac Aug 19, 2022

0xjac commented Aug 19, 2022

jschaf commented Aug 23, 2022

0xjac commented Aug 23, 2022

jschaf commented Aug 23, 2022

feat: go type mapping per query argument #73

feat: go type mapping per query argument #73

Conversation

0xjac commented Aug 18, 2022

0xjac commented Aug 18, 2022

jschaf left a comment

Choose a reason for hiding this comment

jschaf Aug 18, 2022

Choose a reason for hiding this comment

0xjac Aug 19, 2022

Choose a reason for hiding this comment

0xjac Aug 19, 2022 • edited Loading

Choose a reason for hiding this comment

jschaf Aug 18, 2022

Choose a reason for hiding this comment

0xjac Aug 19, 2022

Choose a reason for hiding this comment

0xjac commented Aug 19, 2022

jschaf commented Aug 23, 2022

0xjac commented Aug 23, 2022

jschaf commented Aug 23, 2022

0xjac Aug 19, 2022 •

edited

Loading