MSSQL Streaming does not handle backpressure #879

fehnomenal · 2024-02-12T11:56:26Z

First of all thank you for this great library and support for mssql. You literally saved my life.

I'm working with rather huge datasets (3.9M rows joined with other tables) which cannot be paged without risking timeouts. Unfortunately, I cannot influence the schema or indices, so I have to work with what I've got, and that is streaming the query results. This is where I noticed that my app crashes with OOM depending on the number of selected columns (and rows, of course).

Short digression about how streaming mssql works right now:
There is an array that is filled with rows as they come in from the tedious request.
A loop (with setTimeout 0) checks if the array is larger than the requested batch size (let's call it n) and resolves a deferred with the same number n of rows.

The problem is that only the first n rows are passed to the application on each iteration, while the request keeps delivering rows. If the application handles each batch quickly, this causes no trouble. Otherwise the row buffer fills up faster than it is drained and eventually consumes all available memory.
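To make the failure mode concrete, here is a minimal, synchronous simulation of the buffering described above. It is not kysely's actual code: `simulateUnboundedBuffer` and its parameters are hypothetical names, and a "tick" only stands in for one pass of the `setTimeout` loop.

```typescript
// Hypothetical simulation of the buffering described above; not kysely's code.
// Returns the largest number of rows held in memory at any point.
function simulateUnboundedBuffer(
  totalRows: number,
  rowsPerTick: number, // rows the driver delivers per tick
  chunkSize: number, // rows handed to the application per tick (n)
): number {
  const buffer: number[] = [];
  let produced = 0;
  let peak = 0;

  while (produced < totalRows || buffer.length > 0) {
    // The driver keeps emitting rows, no matter how full the buffer is.
    const incoming = Math.min(rowsPerTick, totalRows - produced);
    for (let i = 0; i < incoming; i++) {
      buffer.push(produced++);
    }
    peak = Math.max(peak, buffer.length);

    // The application only ever receives the first `chunkSize` rows.
    buffer.splice(0, chunkSize);
  }

  return peak;
}
```

With 10,000 rows arriving 100 per tick and a chunk size of 10, the peak is 9,010 buffered rows, so nearly the whole result set sits in memory at once. When rows arrive no faster than they are consumed, the peak stays at one chunk.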

I solved this problem locally by using a (nodejs) readable stream and pausing the tedious request whenever the stream cannot buffer more rows. This works really well, but I could only test it for reading data, as I only have read-only access to the database.
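A rough sketch of what such a stream-based workaround could look like follows. This is an illustration under assumptions, not the code I actually run and not kysely's implementation: `PausableRequest` and `rowsToReadable` are hypothetical names, and the `"row"`, `"requestCompleted"` and `"error"` event names are assumed to mirror tedious and should be checked against its documentation.

```typescript
import { EventEmitter } from "node:events";
import { Readable } from "node:stream";

// Hypothetical minimal shape of a request that can be paused and resumed.
// Event names below are assumptions modeled on tedious.
interface PausableRequest extends EventEmitter {
  pause(): void;
  resume(): void;
}

function rowsToReadable(
  request: PausableRequest,
  highWaterMark: number,
): Readable {
  const stream = new Readable({
    objectMode: true,
    highWaterMark,
    // Node calls read() when the consumer is ready for more data.
    read() {
      request.resume();
    },
  });

  request.on("row", (row: unknown) => {
    // push() returns false once the internal buffer reaches highWaterMark.
    if (!stream.push(row)) {
      request.pause();
    }
  });
  // Signal end-of-stream once the request has finished.
  request.on("requestCompleted", () => stream.push(null));
  request.on("error", (err: Error) => stream.destroy(err));

  return stream;
}
```

The stream's `highWaterMark` then bounds how many rows are buffered, and the consumer can iterate with `for await (const row of stream)`.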

Are you interested in a PR? I'm not sure about other runtimes that kysely supports.


igalklebanov · 2024-04-01T09:41:13Z

Hey 👋

Thank you!

Is using a stream absolutely necessary? Shouldn't using request.pause() and request.resume() suffice?

fehnomenal · 2024-04-01T15:59:57Z

Yes, I use Request.pause() and Request.resume(). I'm not familiar with how to detect back pressure other than via a stream, so that's what I went with.

Thinking about it, it might be enough to check the size of the rows array after splicing it and pausing the request if it is not empty. I suspect the stream implementation is doing something similar internally with a buffer array.
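A minimal sketch of that array-length idea, assuming only that the request exposes `pause()` and `resume()`. `BackpressuredRowBuffer`, `onRow` and `takeChunk` are hypothetical names for illustration; this is not the code that ended up in kysely.

```typescript
// Hypothetical minimal shape of a request that can be paused and resumed.
interface PausableRequest {
  pause(): void;
  resume(): void;
}

class BackpressuredRowBuffer<Row> {
  private readonly rows: Row[] = [];
  private readonly request: PausableRequest;

  constructor(request: PausableRequest) {
    this.request = request;
  }

  // Wire this to the driver's per-row event.
  onRow(row: Row): void {
    this.rows.push(row);
  }

  get length(): number {
    return this.rows.length;
  }

  // Hands out up to `chunkSize` rows, then decides whether the request
  // may continue based on what is left in the array.
  takeChunk(chunkSize: number): Row[] {
    const chunk = this.rows.splice(0, chunkSize);

    if (this.rows.length > 0) {
      // Leftover rows: the consumer is behind, so stop the producer.
      this.request.pause();
    } else {
      // Buffer drained: let the producer deliver more rows.
      this.request.resume();
    }

    return chunk;
  }
}
```

The trade-off compared to a stream is that no extra abstraction is needed, but the buffer can still briefly grow by whatever the driver delivers before the pause takes effect.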

igalklebanov · 2024-04-01T16:50:09Z

~~Let's try the array length approach first. PRs welcome.~~

igalklebanov · 2024-06-15T23:23:15Z

And it's solved. Use #1041's changes for now. Array length + pause/resume did the trick.

igalklebanov added the bug, built-in dialect, and mssql labels Mar 12, 2024

igalklebanov self-assigned this Jun 14, 2024

igalklebanov mentioned this issue Jun 14, 2024

fix MssqlDialect streaming not handling backpressure. #1041

Merged

igalklebanov closed this as completed in #1041 Jun 17, 2024
