Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully functional database/sql driver #351

Merged
merged 1 commit into from
Jul 6, 2024

Conversation

brandur
Copy link
Contributor

@brandur brandur commented May 12, 2024

Here, implement the rest of driver functionality on riverdatabasesql,
the existing driver for Go's built-in database/sql package. Previously
it only supported a minimal interface allowing it to run migrations, but
nothing more sophisticated like inserting jobs.

The benefit of a fully functional driver is that it will allow River to
be integrated with with other Go database packages that aren't built
around Pgx like Bun (requested in #302) and GORM (requested in #58).
I'll need to write up some documentation, but this change should make
both of those integrations possible immediately.

It also lays the groundwork for future non-Postgres drivers. It's going
to be a little more still, but I want to take a stab at SQLite, and this
change will get us a lot of the way there.

There's no way with database/sql to support listen/notify, so here we
introduce the idea of a poll only driver. River's client checks whether
a driver can support listen/notify on initialization, and if not, it
enters poll only mode the same way as if configured with PollOnly.

An intuitive idiosyncrasy of this set up is that even when using the
database/sql driver bundled here, regardless of whether they're
working with Bun, GORM, or whatever, users will generally still be
using Pgx under the hood since it's the only maintained and fully
functional Postgres driver in the Go ecosystem. With that said, the
driver still has to bundle in lib/pq for various constructs like
pq.Array because we're using sqlc, and sqlc's database/sql driver
always uses lib/pq. I tried to find a way around this, but came out
fairly convinced that there is none. To rid ourselves of lib/pq
completely we'd need sqlc to ship an alternative Pgx driver that used
Pgx internally, but exposed a database/sql interface using *sql.Tx
instead of pgx.Tx.

@brandur brandur requested a review from bgentry May 12, 2024 11:23
@brandur brandur force-pushed the brandur-functional-database-sql branch 3 times, most recently from 2cc0c61 to 1a218e9 Compare May 12, 2024 22:40
@bgentry
Copy link
Contributor

bgentry commented May 13, 2024

There's no way with database/sql to support listen/notify, so here we
introduce the idea of a poll only driver. River's client checks whether
a driver can support listen/notify on initialization, and if not, it
enters poll only mode the same way as if configured with PollOnly.

I think we've discussed this possibility before, but wanted to raise it again as it's relevant here. If users are actually using pgx under the hood of database/sql or Bun, it shouldn't be too difficult for us to use a raw pgx connection for the listener—provided the driver's constructor gives us a way to create one.

We can't rely on a single conn never failing, so we'd need to at least be able to recreate a provided conn from its ConnConfig, OR we could accept a listener-specific pgxpool.Pool as part of the constructor so that we can steal conns from it as needed. We also talked about having different constructors within a single driver that give access to different functionality, so even if we don't add this now it doesn't preclude us doing so in the future. I think a change/addition like this would bring this database/sql driver pretty much to feature parity though!

With that said, the
driver still has to bundle in lib/pq for various constructs like
pq.Array because we're using sqlc, and sqlc's database/sql driver
always uses lib/pq. I tried to find a way around this, but came out
fairly convinced that there is none. To rid ourselves of lib/pq
completely we'd need sqlc to ship an alternative Pgx driver that used
Pgx internally, but exposed a database/sql interface using *sql.Tx
instead of pgx.Tx.

Dang, I wonder how tough of a lift this would be? Especially since iirc pgx has complete support for most of its types within database/sql. I tried searching for issues related to this and only found sqlc-dev/sqlc#2960 that was relevant.

Regarding the rest of this, I'm not gonna get through a full review this evening but should be able to get through it in the morning.

client.go Outdated
Comment on lines 1276 to 1185
// Restricted commas because we need those for batch inserts with
// the riverdatabasesql driver. We may want to restrict other
// special characters as well.
if strings.Contains(tag, ",") {
return nil, nil, errors.New("tags should not contain commas")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe tags should be restricted to /a-z_\-/i? possibly include single spaces in that but I don't think it would harm any use cases to leave that out.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I added this variant instead:

var tagRE = regexp.MustCompile(`\A[\w][\w\-]+[\w]\z`)

Test matrix:

func TestTagRE(t *testing.T) {
	t.Parallel()

	require.Regexp(t, tagRE, "aaa")
	require.Regexp(t, tagRE, "_aaa")
	require.Regexp(t, tagRE, "aaa_")
	require.Regexp(t, tagRE, "777")
	require.Regexp(t, tagRE, "my-tag")
	require.Regexp(t, tagRE, "my_tag")
	require.Regexp(t, tagRE, "my_longer_tag")
	require.Regexp(t, tagRE, "My_Capitalized_Tag")
	require.Regexp(t, tagRE, "ALL_CAPS")
	require.Regexp(t, tagRE, "1_2_3")

	require.NotRegexp(t, tagRE, "a")
	require.NotRegexp(t, tagRE, "aa")
	require.NotRegexp(t, tagRE, "-aaa")
	require.NotRegexp(t, tagRE, "aaa-")
	require.NotRegexp(t, tagRE, "special@characters$banned")
	require.NotRegexp(t, tagRE, "commas,never,allowed")
}

# json/jsonb at all. Using a custom struct crashed and burned, even
# with a custom scanner implementation. This is the only way I could
# get it to work: strings are compatible with our use of bytes slices,
# but Postgrs will also accept them as json/jsonb.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# but Postgrs will also accept them as json/jsonb.
# but Postgres will also accept them as json/jsonb.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks.

@brandur brandur force-pushed the brandur-functional-database-sql branch 5 times, most recently from 183067f to 755295f Compare May 13, 2024 06:46
@brandur
Copy link
Contributor Author

brandur commented May 13, 2024

I think we've discussed this possibility before, but wanted to raise it again as it's relevant here. If users are actually using pgx under the hood of database/sql or Bun, it shouldn't be too difficult for us to use a raw pgx connection for the listener—provided the driver's constructor gives us a way to create one.

I don't love this — IMO, this sort of internal magic would be quite surprising and there'd be no intuitive way to predict that it might happen. It also puts a Pgx dependency back into riverdatabasesql — not the worst thing, but not great.

There's a much better way to do this which is that if you're using Pgx but still need database/sql in some places (say you're using Bun): (1) initialize one work client with riverpgxv5, and (2) initialize a second insert client that you use with Bun. Check out the latest blog post teed up in for the home page — there's a code sample in there of what it'd look like.

Dang, I wonder how tough of a lift this would be? Especially since iirc pgx has complete support for most of its types within database/sql. I tried searching for issues related to this and only found sqlc-dev/sqlc#2960 that was relevant.

I don't think it'd be that hard as far as adding drivers to Sqlc goes (overall, it doesn't seem to use that much lib/pq, just a few functions here and there), but any new Sqlc driver looks like kind of a lot of work :/ This is one that I'm kind of hoping that if we wait long enough someone will eventually do it, but certainly more than I want to commit to right now.

Regarding the rest of this, I'm not gonna get through a full review this evening but should be able to get through it in the morning.

Awesome, thx.

Copy link
Contributor

@bgentry bgentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, awesome work! Also after proofing out #352 I'm feeling a lot better about shipping this PR, although I would still like to talk about whether we can ship that additional one before we blog about and promote this new database/sql support—I want as many people as possible to get the full River experience ⚡️

Comment on lines +1271 to +1186
} else {
for _, tag := range tags {
if len(tag) > 255 {
return nil, nil, errors.New("tags should be a maximum of 255 characters long")
}
if !tagRE.MatchString(tag) {
return nil, nil, errors.New("tags should match regex " + tagRE.String())
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably requires its own changelog entry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good call. Added.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely a lot of copy-pasta type boilerplate in here, huh? Not sure there's much to be done about it though 😕

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's a little unfortunate :( I think we could improve some of it by having some shared helpers somewhere, but it's a little tricky since they're in different packages. Also, a lot of the code looks like perfect copy/pasta, but believe it or not isn't quite — there's lots of tiny differences that are required to make things work with database/sql.

Comment on lines +259 to +281
// `database/sql` has an `sql.Named` system that should theoretically work
// for named parameters, but neither Pgx or lib/pq implement it, so just use
// dumb string replacement given we're only injecting a very basic value
// anyway.
for name, value := range namedArgs {
newQuery := strings.Replace(query, "@"+name, fmt.Sprintf("%v", value), 1)
if newQuery == query {
return nil, fmt.Errorf("named query parameter @%s not found in query", name)
}
query = newQuery
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😵‍💫

@brandur brandur force-pushed the brandur-functional-database-sql branch 2 times, most recently from 4d4c914 to 7c7bac8 Compare July 5, 2024 06:03
Here, implement the rest of driver functionality on `riverdatabasesql`,
the existing driver for Go's built-in `database/sql` package. Previously
it only supported a minimal interface allowing it to run migrations, but
nothing more sophisticated like inserting jobs.

The benefit of a fully functional driver is that it will allow River to
be integrated with with other Go database packages that aren't built
around Pgx like Bun (requested in #302) and GORM (requested in #58).
I'll need to write up some documentation, but this change should make
both of those integrations possible immediately.

It also lays the groundwork for future non-Postgres drivers. It's going
to be a little more still, but I want to take a stab at SQLite, and this
change will get us a lot of the way there.

There's no way with `database/sql` to support listen/notify, so here we
introduce the idea of a poll only driver. River's client checks whether
a driver can support listen/notify on initialization, and if not, it
enters poll only mode the same way as if configured with `PollOnly`.

An intuitive idiosyncrasy of this set up is that even when using the
`database/sql` driver bundled here, regardless of whether they're
working with Bun, GORM, or whatever,  users will generally still be
using Pgx under the hood since it's the only maintained and fully
functional Postgres driver in the Go ecosystem. With that said, the
driver still has to bundle in `lib/pq` for various constructs like
`pq.Array` because we're using sqlc, and sqlc's `database/sql` driver
always uses `lib/pq`. I tried to find a way around this, but came out
fairly convinced that there is none. To rid ourselves of `lib/pq`
completely we'd need sqlc to ship an alternative Pgx driver that used
Pgx internally, but exposed a `database/sql` interface using `*sql.Tx`
instead of `pgx.Tx`.
@brandur brandur force-pushed the brandur-functional-database-sql branch from 7c7bac8 to 01aeece Compare July 6, 2024 00:51
@brandur
Copy link
Contributor Author

brandur commented Jul 6, 2024

@bgentry I need to base another PR on this one so gonna bring it for now, although won't cut a release right away.

@brandur brandur merged commit 91275aa into master Jul 6, 2024
10 checks passed
@brandur brandur deleted the brandur-functional-database-sql branch July 6, 2024 01:00
brandur added a commit to riverqueue/riverqueue-python that referenced this pull request Jul 6, 2024
…racters

Match up tag validations that came in with the main project here [1].
The multi insertion queries use commas to split tags during batch
inserts, so it's important that incoming tags don't have comms of their
own.

[1] riverqueue/river#351
brandur added a commit to riverqueue/riverqueue-python that referenced this pull request Jul 6, 2024
…racters

Match up tag validations that came in with the main project here [1].
The multi insertion queries use commas to split tags during batch
inserts, so it's important that incoming tags don't have comms of their
own.

[1] riverqueue/river#351
brandur added a commit to riverqueue/riverqueue-python that referenced this pull request Jul 6, 2024
…racters (#23)

Match up tag validations that came in with the main project here [1].
The multi insertion queries use commas to split tags during batch
inserts, so it's important that incoming tags don't have comms of their
own.

[1] riverqueue/river#351
brandur added a commit that referenced this pull request Jul 11, 2024
Removed outdated comment in `riverdatabasesql` that indicates that it's
not a fully functional driver, which is no longer accurate as of #351.
Also mention Bun and GORM in the Godoc.
brandur added a commit that referenced this pull request Jul 12, 2024
Removed outdated comment in `riverdatabasesql` that indicates that it's
not a fully functional driver, which is no longer accurate as of #351.
Also mention Bun and GORM in the Godoc.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants