
Minimal slonik patch for closing stream queries (release) #604

Merged

Conversation

alxndrsn
Contributor

WIP

@alxndrsn changed the title from "wip" to "Minimal slonik patch for closing stream queries (release)" on Sep 12, 2022
@matthew-white
Member

I checked out this PR on the QA server, then followed the repro steps in #482. I'm still able to reproduce a database connection leak by following those steps. However, I actually think that's expected. This patched version of Slonik makes it possible to destroy a stream, but we still need to leverage that ability by making other changes. After discussing with @alxndrsn, it seems like we have three options in this area:

  1. Wherever we use Promise.all() with a stream, instead use a utility function like the one in #482 (comment); a rough sketch of this approach appears after this list.
  2. Avoid using Promise.all() with a stream (#482 (comment)).
  3. Automatically destroy streams that haven't had a recent read (see #599, "Patch slonik to close streams more reliably").

If we want to ship this patched version of Slonik with the upcoming Central patch release, we should also ship one of the three changes above. Ultimately, we'll probably want to make more than one of these changes: I think we want to implement (3), then also either (1) or (2).
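
For illustration, option (1) could look roughly like the sketch below. This is my own paraphrase, not the actual utility from the #482 comment; the name `allAndDestroy` and the `instanceof Readable` check are assumptions.

```js
// Hypothetical sketch of option (1); not the actual helper from #482.
// Behaves like Promise.all(), but if any promise rejects, destroys any
// already-created streams among the results so their database connections
// are released instead of leaking.
const { Readable } = require('stream');

const allAndDestroy = (promises) => Promise.all(promises)
  .catch(async (err) => {
    // Wait for every promise to settle, then destroy any stream values.
    const settled = await Promise.allSettled(promises);
    for (const result of settled) {
      if (result.status === 'fulfilled' && result.value instanceof Readable)
        result.value.destroy();
    }
    throw err;
  });
```

A call site would then swap `Promise.all([streamPromise, otherPromise])` for `allAndDestroy([streamPromise, otherPromise])`, so a rejection of `otherPromise` no longer strands the stream's connection.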

@matthew-white
Member

We discussed this in the meeting today, and we're currently thinking that we should ship one of the three options above as part of the patch release. Whichever option we choose, I think we could gain confidence in the change by running the benchmarker against it. @lognaturel also made the point that if streams somehow stop working, that's obviously not a good thing, but there's actually only so much risk there. (It's not like it's a write operation and data would be corrupted.) How does shipping one of those options sound, @alxndrsn? Do you have a preference among them? I don't have a strong preference myself, but let me know if there's a way for me to help. If option (1) seems preferable, I'd be happy to continue working on that utility function.

@matthew-white
Member

I think regardless of whether we ship #608 or the timeout functionality in #599, we should go ahead and merge this PR. @alxndrsn, does that sound right? Is there more that needs to be done for this PR specifically?

@alxndrsn
Contributor Author

@matthew-white since this PR wasn't useful on its own, I've re-added stream timeouts, with tests.
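
For anyone following along, the stream-timeout idea (option 3 above, #599) amounts to something like the sketch below. This is a hedged illustration only, not the code added to this PR: `addReadTimeout` and the two-minute default are placeholders, and it assumes the stream is already being consumed in flowing mode (e.g. piped to the response), so the extra 'data' listener doesn't change its behavior.

```js
// Illustrative only: destroy a stream that hasn't produced data recently,
// so a stalled or abandoned request releases its database connection.
const addReadTimeout = (stream, timeoutMs = 2 * 60 * 1000) => {
  let timer;
  const reset = () => {
    clearTimeout(timer);
    timer = setTimeout(
      () => stream.destroy(new Error(`stream idle for ${timeoutMs}ms`)),
      timeoutMs
    );
  };
  stream.on('data', reset);                        // each chunk resets the clock
  stream.once('close', () => clearTimeout(timer)); // stop the timer on cleanup
  reset();
  return stream;
};
```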

@matthew-white
Member

matthew-white commented Sep 15, 2022

@alxndrsn ran the benchmarker for a few different options. Each benchmarker run took 300 sec.

|  | Total requests | Successes | Throughput (req/s) | Mean response time (sec) | Min response time (sec) | Max response time (sec) | Response size |
|---|---|---|---|---|---|---|---|
| #608 | 999 | 11 (1%) | 0.0 | 21.6 | 16.8 | 23.4 | 818,805 |
| Stream timeout (2 min) | 999 | 53 (5%) | 0.2 | 22.2 | 13.3 | 29.9 | 827,011 |
| Stream timeout (45 sec) | 1,000 | 180 (18%) | 0.6 | 25.0 | 14.0 | 36.4 | 871,938 |
| Stream timeout (5 sec) | 1,000 | 205 (20%) | 0.7 | 27.2 | 15.2 | 36.9 | 872,616 |

@alxndrsn wrote on Slack about #608:

> basically throughput is halved... but there don't seem to be any connection leaks

@matthew-white
Member

matthew-white commented Sep 16, 2022

@alxndrsn taught me how to run the benchmarker, so I've been running it in CircleCI. I ran it for #608, #609, and stream timeouts (this PR), as well as for v1.5.1 and the current release branch. Very surprisingly to me, #609 didn't seem to do any better than #608 (and maybe even did a little worse). Like #608, it performed worse than stream timeouts. (Note that in some cases, I ran the benchmarker multiple times for the same commit and got different results. Below I list the more successful results.)

|  | CircleCI build | Successes | Throughput (req/s) |
|---|---|---|---|
| v1.5.1 | [1][2][3] | 29 (2%) | 0.1 |
| Current release branch - no patch to Slonik (ae09ec5) | [1][2] | 34 (3%) | 0.1 |
| #608 | [1][2] | 57 (5%) | 0.2 |
| #609 | [1][2] | 33 (3%) | 0.1 |
| Stream timeout (2 min) - this PR | [1][2] | 105 (10%) | 0.3 |

Unlike #608, #609 doesn't wait until after a Promise.all() to create a stream, so I'm confused about why it's doing so much worse than stream timeouts. Stream timeouts would perform especially well when there's an issue with the client reading data. Is there a chance that the benchmarker just isn't consuming all the streams it's requesting? I changed the benchmarker to send 3 requests every 3 seconds (300 requests total) rather than 10 requests every 3 seconds (1000 requests total). I saw pretty different results:

|  | CircleCI build | Successes | Throughput (req/s) |
|---|---|---|---|
| v1.5.1 | [1][2][3] | 208 (69%) | 0.7 |
| Current release branch - no patch to Slonik (ae09ec5) | [1][2] | 244 (81%) | 0.8 |
| #608 | [1][2][3] | 241 (80%) | 0.8 |
| #609 | [1][2] | 280 (93%) | 0.9 |
| Stream timeout (2 min) - this PR | [1][2] | 257 (85%) | 0.9 |
| #609 + stream timeout | [1][2] | 241 (80%) | 0.8 |

(One surprising result: #609 + stream timeouts did worse than either one on its own!)

Based on the results above, I'm thinking:

  • There's some variability in the benchmarker results. That means that it's worth running the benchmarker multiple times and that it's harder to ascribe small differences in results to real differences in performance.
  • Fixing the database connection leaks might not be making a huge difference in these results, but it does seem like the various fixes at least aren't breaking anything.
  • Stream timeouts fix cases that #608 (create streams only after Promise.all()) and #609 (destroy streams created in rejected Promise.all()) don't fix: stream timeouts seem to be performing markedly better than the other options in the first table. (A rough sketch of the difference between those two approaches follows this list.)
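
To make that comparison concrete, here is my rough paraphrase of the two approaches, based only on the PR titles (the real diffs may differ; `createStream()` and `otherWork()` are placeholder names):

```js
// Placeholder names: createStream() opens a slonik stream query,
// otherWork() is any other awaited promise in the same handler.

// Original pattern (simplified): if otherWork() rejects, the stream is never
// consumed and its database connection leaks.
// const [stream, other] = await Promise.all([createStream(), otherWork()]);

// #608-style ("create streams only after Promise.all()"): await the other
// work first, and only open the stream once that work has succeeded.
const handler608 = async () => {
  const other = await otherWork();
  const stream = await createStream();
  return { stream, other };
};

// #609-style ("destroy streams created in rejected Promise.all()"): keep the
// concurrency, but destroy the stream if the combined promise rejects, e.g.
// via a helper like the allAndDestroy() sketch earlier in this thread.
```

Stream timeouts address a different failure mode: a stream that was created successfully but then stalls because nothing is reading it, which neither of the other two fixes covers.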

@matthew-white marked this pull request as ready for review September 19, 2022 16:58
@matthew-white
Member

See #599 for relevant discussion.

We're going to ship this PR with the patch release. We may also continue looking into #608 and #609.

@matthew-white
Member

> There's some variability in the benchmarker results.

@alxndrsn thinks it's possible that some of that variability is from CircleCI itself, specifically the state of the machine on which the benchmarker is run. That's something I'll keep in mind when I continue working on #609.
