Implements Request.pause() / resume() #518

Merged
merged 13 commits into from
Oct 12, 2017

Conversation

chdh
Collaborator

@chdh chdh commented Mar 1, 2017

This pull request adds the two methods Request.pause() and Request.resume() to the public Tedious API. They can be used to pause and resume the flow of data rows from a query result. A test case is included.

See #181 for the reasons why this extension is important.

@chdh chdh changed the title Implements Request.pause() / resume() #181 Implements Request.pause() / resume() Mar 1, 2017
@Congelli501

I tried your branch, and I get the following error using this code:

const Connection = require('tedious').Connection;
const Request = require('tedious').Request;

const connection = new Connection({
        server: '???',
        userName: '???',
        password: '???',
        options: {
                database: '???'
        }
});

connection.on('connect', (err) => {
        if (err) {
                throw err;
        }

        const request = new Request("select * from RANDOM_TABLE", (err2, rowCount) => {
                if (err2) {
                        throw err2;
                }

                console.log(rowCount + ' rows');
                connection.close();
        });

        request.on('row', (columns) => {
                console.log(columns.length);
                request.pause();
                setTimeout(() => {
                        request.resume();
                }, 500);
        });

        connection.execSql(request);
});

Error:

events.js:161
      throw er; // Unhandled 'error' event
      ^

Error: Received 'row' when no sqlRequest is in progress
    at Parser.<anonymous> (/root/node_tedious/node_modules/tedious/lib/connection.js:459:32)
    at emitOne (events.js:96:13)
    at Parser.emit (events.js:189:7)
    at Parser.<anonymous> (/root/node_tedious/node_modules/tedious/lib/token/token-stream-parser.js:54:15)
    at emitOne (events.js:96:13)
    at Parser.emit (events.js:189:7)
    at Parser.Readable.read (/root/node_tedious/node_modules/tedious/node_modules/readable-stream/lib/_stream_readable.js:391:26)
    at flow (/root/node_tedious/node_modules/tedious/node_modules/readable-stream/lib/_stream_readable.js:739:34)
    at resume_ (/root/node_tedious/node_modules/tedious/node_modules/readable-stream/lib/_stream_readable.js:722:3)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)

The table contains 20 rows. I tried to pause the stream every 5 rows, but I had the same error.

I'm using node 7.6 on Ubuntu x64.
The SQL server has the following version:

Microsoft SQL Server 2005 - 9.00.1399.06 (Intel X86) 
        Oct 14 2005 00:33:37 
        Copyright (c) 1988-2005 Microsoft Corporation
        Developer Edition on Windows NT 6.1 (Build 7601: Service Pack 1)

Am I misusing the feature?
Do you want me to open a bug report for that?

Don't hesitate to ask if you want any more feedback or tests.

@chdh
Collaborator Author

chdh commented Mar 1, 2017

@Congelli501 Thanks for your bug report. I will analyze the problem and provide a fix.

@chdh
Collaborator Author

chdh commented Mar 2, 2017

@Congelli501 I have fixed the problem. It happened when there were only a few records left when pause() was called. I have added another test case to verify that this problem is solved. Could you please test again?

@arthurschreiber
Collaborator

arthurschreiber commented Mar 5, 2017

Hey @chdh,

Thank you very much for this PR - it definitely shows that pausing and resuming requests is possible in tedious with a bit of tweaking here and there. 👍

But as you mentioned over in #512:

I think that the internal architecture of the TDS protocol driver should be reworked.

I think so as well. Tedious internals should be a bunch of streams that are piped into each other, so that Request#pause() would be implemented in terms of pausing the "last" stream in this pipe. If I understand the way Node.js streams work correctly, this should in turn cause all other streams to be paused, because they reach their highWaterMark and automatic backpressure handling kicks in.

Would you be interested in helping to transform the tedious internals for this? I have played with a few ideas for this, and we could open an issue to discuss this further. If not, I believe these changes could be merged as they are, I'm just concerned that doing so might make tedious internals even more complex (and fragile) than they already are. 🤔

I'm scheduling this for the release after the planned 1.15.0 release, as I don't want to delay the 1.15.0 release even further.

/cc @tvrprasad What do you think?

@chdh
Collaborator Author

chdh commented Mar 5, 2017

@arthurschreiber, thanks for your comment. I will try to simplify the PR.

@chdh
Collaborator Author

chdh commented Mar 5, 2017

I have simplified the PR to apply backpressure (pause/resume) only to the last stream (the token stream parser transform). The current solution bridges the backpressure gap between the token stream parser transform and the packet stream transform. This bridging is only a few lines and can later be replaced by Stream.pipe().

That way we have a "minimal invasive" patch with the data flow control concept we want in the long term.

@ElfenLiedGH

@chdh pause is great, but what do you think about the timeout? Should it pause too?
I read several records, then paused, and then the request timed out, although the query itself was OK.
I'm not sure that's right.

@chdh
Collaborator Author

chdh commented Mar 6, 2017

@ElfenLiedGH Is it the request timeout? The default value for options.requestTimeout is 15 seconds. For a large table export or a complex query, this is too short anyway, whether pause/resume is used or not.

It would be good if we could override the options.requestTimeout value for a single request, maybe with a new method Request.setTimeout()?

@arthurschreiber
Collaborator

arthurschreiber commented Mar 6, 2017

From the "Client Request Timer" section of the TDS specification:

Controls the maximum time spent waiting for a query response from the server for a client request sent after the connection has been established.

This means that as soon as we received the first packet from the response message for a query, we need to stop the request timer. If the user wants some sort of time out on the processing of the data, it's up to them to define a more suitable timer.
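As an illustration of the rule described above, here is a minimal sketch of a timer that only covers the wait for the first response packet. It is not the Tedious implementation: `RequestTimer` and its methods are hypothetical names, and the 15000 ms value simply mirrors the default `options.requestTimeout` mentioned earlier in this thread.

```javascript
// Hypothetical sketch: the request timer only covers the wait for the
// first response packet. Names are illustrative, not Tedious internals.
class RequestTimer {
  constructor(timeoutMs, onTimeout) {
    this.timeoutMs = timeoutMs;
    this.onTimeout = onTimeout;
    this.handle = null;
  }

  // Called when the client request has been sent to the server.
  start() {
    this.clear();
    this.handle = setTimeout(() => {
      this.handle = null;
      this.onTimeout();
    }, this.timeoutMs);
  }

  // Called when a response packet arrives; safe to call repeatedly.
  clear() {
    if (this.handle !== null) {
      clearTimeout(this.handle);
      this.handle = null;
    }
  }

  isRunning() {
    return this.handle !== null;
  }
}

let timedOut = false;
const timer = new RequestTimer(15000, () => { timedOut = true; });
timer.start();   // request sent, waiting for the server
timer.clear();   // first response packet arrived: stop the timer
```

With this shape, any time the application spends processing or pausing rows after the first packet is not counted against the request timeout.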

@chdh
Collaborator Author

chdh commented Mar 6, 2017

as soon as we received the first packet from the response message for a query, we need to stop the request timer

You are right, but according to the documentation of SqlCommand.CommandTimeout in the .NET API, the timeout can also occur between two consecutive rows.

This property is the cumulative time-out (for all network packets that are read during the invocation of a method) for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.
For example, with a 30 second time out, if Read requires two network packets, then it has 30 seconds to read both network packets. If you call Read again, it will have another 30 seconds to read any data that it requires.

I don't know how we could implement this correctly. We probably don't want to start a timer after each 'row' event.

@arthurschreiber
Collaborator

You are right, but according to the documentation of SqlCommand.CommandTimeout in the .NET API, the timeout can also occur between two consecutive rows.

Weird. This is not how mssql-jdbc is implemented (it also follows the spec). I'd suggest we stick to the side of the specification.

@chdh
Collaborator Author

chdh commented Mar 6, 2017

@ElfenLiedGH please have a look at the new version. Does it solve your problem?

@tvrprasad
Contributor

@chdh @arthurschreiber None of the drivers implemented by Microsoft has a notion of pause/resume. The reasoning, as I understand it, is that you can't really slow down how fast the server sends data, because there is no support for this kind of throttling in the TDS specification. So the memory pressure is going to build up in one place or another in the system. Is there something different about Node.js that makes it sensible here? Curious to know.

@arthurschreiber Agree with not delaying release of 1.15. I'll need to read up on streaming in NodeJS before I can share thoughts on it :-)

@@ -522,7 +534,15 @@ class Connection extends EventEmitter {
this.socket.on('close', this.socketClose);
this.socket.on('end', this.socketEnd);
this.messageIo = new MessageIO(this.socket, this.config.options.packetSize, this.debug);
this.messageIo.on('data', (data) => { this.dispatchEvent('data', data); });
this.messageIo.on('data', (data) => {
this.clearRequestTimer();
Contributor

Can you do a separate PR for this bug fix? Though this got caught as part of this work, I think we can call it unrelated.

Collaborator Author

Yes, I could put this into a separate PR, but it's not really a problem as long as you don't have a large result set or pause/resume. There was a real bug with clearRequestTimer() which I already fixed in PR #527.
@arthurschreiber what do you think?

Collaborator

A separate PR would be nice, so this can be part of the next release. ❤

Collaborator Author

OK, it's in #530 now.

src/request.js Outdated
// Returns true if this request is the currently active request of the connection.
isActive() {
return this.connection && this.connection.request === this && this.connection.state === this.connection.STATE.SENT_CLIENT_REQUEST;
}
};
Contributor

This introduces pretty deep coupling between the Connection class and Request class. Perhaps this suggests pause/resume methods should be on the Connection class?

Collaborator Author

@chdh chdh Mar 7, 2017

If we have multiple active result sets (MARS) in the future, we need pause/resume per request. But you are right, the deep coupling is not clean. I will move this logic into the Connection class, but keep the pause/resume methods in the Request class.

It's a general problem that we cannot mark class methods and properties as public/protected/private. Therefore I suggest using TypeScript (see #512).

@chdh
Collaborator Author

chdh commented Mar 7, 2017

@tvrprasad The need for pause/resume only arises in an asynchronous environment like Node.js. In the traditional programming environments with synchronous read-next calls, the data flow control is implicit. All Microsoft database drivers implement this implicit data flow control.

Example with MS-Access / VBA:

Dim rs As Recordset
Set rs = db.OpenRecordset("SELECT * FROM HugeTable", dbOpenForwardOnly)
Do While Not rs.EOF
  Debug.Print rs.Fields(0)
  ... wait a second ...
  rs.MoveNext
  Loop

In this example with MS-Access, the memory is not filled up, even if HugeTable has tens of millions of records. If you do the same thing with the current Tedious release, the memory fills up and after a few seconds the Node.js runtime will crash. Without pause/resume there is no way to control the data flow. The only solution is using SQL queries that don't produce a lot of data.

let paused = false;
openConnection();

function openConnection() {
Collaborator

Could you refactor these test cases to make use of common 'setUp' and 'tearDown' functions? See the nodeunit documentation if you need more information on that.

Collaborator Author

OK, I have refactored the pause/resume test cases with setUp/tearDown.

@tvrprasad
Contributor

In this example with MS-Access, the memory is not filled up, even if HugeTable has tens of millions of records.

Isn't the memory being filled up at the network layer (perhaps the TCP buffers), in that case? I assume the server is still sending data as fast as it can.

@arthurschreiber
Collaborator

arthurschreiber commented Mar 7, 2017 via email

@tvrprasad
Contributor

TCP has built in flow control and the sender will slow down sending data to
the receiver automatically.

Right, thought about that but it was not clear to me how that'd kick in here. The TCP layer on the client would have to stop sending ACKs while it's in the pause mode to throttle the server sending data. But then I thought that would cause TCP timeouts, retransmissions and ultimately errors on the server side. Now I see that the pause has to be really long for that to happen. Presumably server has mechanisms to handle slow clients. Makes sense now. Thanks :-)

Microsoft drivers like ADO.net support Async mode API but don't have pause/resume. So they'd be running into this issue, right? I wonder why there is not a demand for that support in those drivers.

@chdh
Collaborator Author

chdh commented Mar 7, 2017

Microsoft drivers like ADO.net support Async mode API but don't have pause/resume. So they'd be running into this issue, right? I wonder why there is not a demand for that support in those drivers.

It's another concept. In the .NET API, you have to call SqlDataReader.ReadAsync() for each row. When you stop calling ReadAsync(), the row stream is paused implicitly. With Tedious and other Node libraries, the rows are continuously delivered via 'row' events and you have to call pause() explicitly.
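The explicit pattern described above can be sketched as follows. This is a hedged illustration, not Tedious code: `FakeRequest`, its `flow()` method and its `'done'` event are hypothetical stand-ins for a real request, and `saveRow` simulates slow per-row work. Only the `pause()` / `resume()` calls inside the `'row'` handler correspond to the API added by this PR.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a Tedious Request: it pushes 'row' events
// until pause() is called and continues after resume().
class FakeRequest extends EventEmitter {
  constructor(rows) {
    super();
    this.rows = rows.slice();
    this.paused = false;
  }
  pause() { this.paused = true; }
  resume() {
    this.paused = false;
    this.flow();
  }
  flow() {
    while (!this.paused && this.rows.length > 0) {
      this.emit('row', this.rows.shift());
    }
    if (!this.paused && this.rows.length === 0) {
      this.emit('done');   // hypothetical end-of-result event
    }
  }
}

const seen = [];
// Simulated slow consumer, e.g. writing each row to another system.
function saveRow(row) {
  seen.push(row);
  return new Promise((resolve) => setTimeout(resolve, 10));
}

const request = new FakeRequest([1, 2, 3]);
const finished = new Promise((resolve) => request.once('done', resolve));

request.on('row', (row) => {
  request.pause();                             // no more 'row' events for now
  saveRow(row).then(() => request.resume());   // continue once the row is handled
});

request.flow();   // start delivery; only the first row arrives before the pause
```

Each row is handled before the next one is delivered, which is the push-model equivalent of not calling ReadAsync() until the previous row has been processed.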

@tvrprasad
Contributor

@chdh Thanks for the clarification! Certainly learnt some stuff from this thread :-) I'll send you a few more comments on this PR.

addBuffer(buffer) {
return this.parser.write(buffer);
}

// Writes an end-of-message (EOM) marker into the parser transform input
// queue. StreamParser will emit a 'data' event with an 'endOfMessage'
// pseudo token when the EON marker has passed through the transform stream.
Contributor

EON => EOM

Collaborator Author

Thanks.

Contributor

np :-)

this.messageIo.on('data', (data) => {
this.clearRequestTimer();
const ret = this.dispatchEvent('data', data);
if (ret === false && this.state === this.STATE.SENT_CLIENT_REQUEST) {
Contributor

Can we just move this chunk of code under SENT_CLIENT_REQUEST.data so we don't have to do these comparisons? The reliance on return value from dispatchEvent feels especially fragile.

Collaborator Author

OK, I will move it to SENT_CLIENT_REQUEST.data.

Contributor

Thanks.

test.ok(!socketRs.flowing,
'Socket is not paused.');
}
test.ok(socketRs.length >= Math.min(socketRs.highWaterMark - 512, 0x4000),
Contributor

Please define consts for these numbers to describe what they represent.

Collaborator Author

OK, but it will not make it much clearer, because these are more heuristic values, not exact science. :)

Contributor

A brief comment would be helpful.

const packetTransformRs = connection.messageIo.packetStream._readableState;
test.ok(!packetTransformRs.flowing,
'Packet transform is not paused.');
test.ok(packetTransformWs.length <= packetTransformWs.highWaterMark && packetTransformRs.length <= packetTransformRs.highWaterMark,
Contributor

Line too long. Perhaps you can break them into two assertions.

Collaborator Author

OK

Contributor

Thanks.


function fail(msg) {
if (failed) {
return; }
Contributor

Formatting - move } to new line.

Collaborator Author

@chdh chdh Mar 9, 2017

OK. I wonder why the linter didn't see that.

Contributor

No idea. @arthurschreiber ?

}

function processRow(columns) {
if (canceled) {
Contributor

line:193 validates rowCount for cancelled request but rowCount is not updated here in canceled state. Should it be? What's the expectation in terms of getting 'row' events after canceling a paused request?

Collaborator Author

@chdh chdh Mar 9, 2017

The API documentation does not specify whether rows are emitted after cancel(). The current implementation does this, but it's not specified as part of the API. It also depends on what the server does when it receives the ATTENTION message. The current test case assumes that the reception of rows after cancel() is undefined and ignores them.

I don't know what to do in the special case when a request is paused at the time of the cancel. The current implementation releases the pause when the driver switches to the SENT_ATTENTION state. This has the effect that the remaining rows are emitted, until the server stops sending more rows and terminates the active request. We could change that behavior and block the remaining rows from being emitted, when the request was in a paused state at the time of the cancel.
@arthurschreiber what do you think?

Contributor

From TDS spec:
"The client can interrupt and cancel the current request by sending an Attention message. This is also known as out-of-band data, but any TDS packet that is currently being sent MUST be finished before sending the Attention message. After the client sends an Attention message, the client MUST read until it receives an Attention acknowledgment. "

Either way, blocking or emitting remaining rows, seems consistent with the spec as long as we emit all the rows before sending the ECANCEL and the code seems to do that correctly. As far as I'm concerned we can keep the current behavior.

I would modify the test, though, to validate the current implementation. I think it's best to be intentional if we decide to change the implementation in the future, and to make the corresponding test modification.

Collaborator Author

@chdh chdh Mar 10, 2017

After some thought, I think it's better to suppress further 'row' events after a paused request has been canceled or timed out. This is what the application expects. If the application explicitly wants to receive the remaining rows on a paused request, it can call resume() before calling cancel().

Contributor

Agreed, this is a better approach. This prevents the possibility of an application bug where it chokes on 'row' events coming in after 'cancel'. Making it difficult to write bugs is good :-)

// This test reads only a few rows and makes a short pause after each row.
// The test verifies that:
// - Pause/resume works correctly when applied after the last packet of a TDS
// message has already been dispatched by MessageIO.ReadablePacketStream.
Contributor

I see a pause/resume after each row. Which part of the code validates this?

Contributor

I suppose this is implicitly tested since there is a pause after the last row...

Collaborator Author

@chdh chdh Mar 9, 2017

No, this test is a reaction to the problem reported by Congelli501 (second message in #518).
The problem was that the last packet (with the EOM mark set) had already been received and processed while there were still rows in the queue of the token parser transform. With the first test case, which uses 200,000 rows and pauses only once after 50,000 rows, the error could not be detected, because the end of the TDS message was still far away when pause() was called. In this second test case, all the rows probably fit within a single packet. After the first 100 ms delay, the 'message' event has surely been emitted by the MessageIO class, because it has already detected the reception of the last packet of the message.

Contributor

Got it. Thanks for the explanation.


// Temporarily suspends the flow of data from the database.
// No more 'row' events will be emitted until resume() is called.
pause() {
Contributor

Should we remove pause/resume on Request class and leave them on Connection class only? These seem roughly equivalent of cancel() which is on Connection class only.

Collaborator Author

No. From the API point of view, cancel() should be a member of the Request class. It's the request that is being canceled, not the connection.

Contributor

Sounds reasonable. Opened #534 to add 'cancel' on Request and deprecate it on Connection.

function onRequestCompletion(err) {
requestCount++;
if (requestCount == requestToCancel) {
test.ok(err && err.code == 'ECANCEL');
Contributor

Change '==' to '===' and '!=' to '!==' in the file?

Collaborator Author

OK

Contributor

Thanks.

sendDataToTokenStreamParser(data) {
return this.tokenStreamParser.addBuffer(data);
}

pauseRequest(request) {
Contributor

'request' parameter is not needed on pause/resume as there can only be one active request at a time on connection.

Collaborator Author

This is only an internal method that is called from Request.pause(). It has to check whether this Request object represents the currently active request. I will add a comment.

Contributor

Sounds good.

// The test verifies that:
// - Pause/resume works correctly when applied after the last packet of a TDS
// message has already been dispatched by MessageIO.ReadablePacketStream.
// (EOM / packet.isLast() has already been detected when pause() is called.
Contributor

nit: (EOM / packet.isLast() => (EOM / packet.isLast())

Collaborator Author

@chdh chdh Mar 9, 2017

No, the closing bracket is at the end of the paragraph. But I will remove the brackets and rephrase the sentences.

Contributor

ok. Thanks.

@chdh
Collaborator Author

chdh commented Mar 9, 2017

@tvrprasad

Only the two new methods Request.pause() and Request.resume() are intended to be part of the public Tedious API. All other changes are internal. The reasons that these two methods are members of the Request class and not of the Connection class are:

  1. The 'row' events are emitted from the Request class and not from the Connection class. The pause/resume methods control these 'row' events. From the application point of view, the Request class represents the data source.

  2. It's true that the cancel() method is currently a member of the Connection class. But that is because it has been designed from the implementation point of view and not from the application point of view. In the upper level driver node-mssql, the cancel() method is a member of the Request class. In the .NET API, the Cancel() method is a member of the SqlCommand class. It's not the connection that is being canceled, it's the request / command. Therefore the cancel() method should not be a member of the Connection class.

  3. If we later have multiple active requests at the Tedious API level (MARS or connection pooling), it must be clear which request is affected by the pause/resume call. The application logic uses the Request object as a handle for the data source and should not be forced to know the associated Connection object for using data flow control.

It's true that with the current implementation only a single Request object can be active at a time. But that's the internal view, and we always have to consider both sides: the internal implementation side and the external API user / application side. The application might still have old instances of Request objects that are no longer active. When pause/resume is called accidentally on an inactive Request object, the implementation must detect this and ignore the call (or throw an exception). This is the reason why the pause/resume methods check whether the associated Request object represents the currently active request.
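The ownership check described above could look roughly like the following sketch. It is an illustration under assumptions, not the merged code: `SketchConnection` and `SketchRequest` are hypothetical classes, and only the method names pauseRequest(), pause() and resume() are taken from this thread (resumeRequest() is mentioned further below in the conversation).

```javascript
// Hypothetical sketch of the "is this the active request?" check.
class SketchConnection {
  constructor() {
    this.request = null;   // the currently active request, if any
    this.paused = false;
  }
  execSql(request) {
    request.connection = this;
    this.request = request;
    this.paused = false;
  }
  // Internal: ignore calls from requests that are not the active one.
  pauseRequest(request) {
    if (this.request !== request) return;
    this.paused = true;
  }
  resumeRequest(request) {
    if (this.request !== request) return;
    this.paused = false;
  }
}

class SketchRequest {
  constructor() { this.connection = null; }
  pause() { if (this.connection) this.connection.pauseRequest(this); }
  resume() { if (this.connection) this.connection.resumeRequest(this); }
}

const connection = new SketchConnection();
const oldRequest = new SketchRequest();
const activeRequest = new SketchRequest();
connection.execSql(oldRequest);
connection.execSql(activeRequest);   // oldRequest is no longer active

oldRequest.pause();                  // stale handle: ignored
const pausedByStale = connection.paused;
activeRequest.pause();               // active handle: takes effect
const pausedByActive = connection.paused;
```

The sketch silently ignores stale calls; throwing an exception instead, as mentioned above, would be an equally valid choice.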

I hope you understand the reasons for these design principles. I would like to also write a first version of a streaming BulkLoad implementation (#523), because I need that for my customer projects and I need it soon. But I don't have time to explain and justify everything in this detail.

@tvrprasad
Contributor

@chdh Thanks for the detailed design rationale, makes sense and sounds good.

[rant:begin] It's too bad that we can't really stop anyone from using the internal pauseRequest/resumeRequest on the connection object. I guess that's life with JavaScript. [rant:end]

@Congelli501

You can prefix protected methods with an underscore.
This convention is widely used and static code checkers (at least eslint) can check that you only use _ methods with this.
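A sketch of how that convention could be enforced, assuming ESLint's `no-underscore-dangle` rule with its `allowAfterThis` option is the check meant here. The class and method names below are hypothetical and only illustrate the convention.

```javascript
// .eslintrc.js (sketch): allow `this._foo`, report `other._foo`.
const eslintConfig = {
  rules: {
    'no-underscore-dangle': ['error', { allowAfterThis: true }],
  },
};
module.exports = eslintConfig;

// Hypothetical class using the underscore convention.
class SketchConnection {
  constructor() { this.paused = false; }
  _pauseRequest() { this.paused = true; }   // "protected" by convention
  pause() { this._pauseRequest(); }         // accepted: accessed via `this`
}

const conn = new SketchConnection();
conn.pause();
// conn._pauseRequest();   // would be reported by the rule above
```

This remains a lint-time convention only; nothing prevents a caller from invoking the underscore method at runtime.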

@tvrprasad
Copy link
Contributor

@Congelli501 Glad I ranted :-) Opened issue #536 to track this. Thanks!

@arthurschreiber arthurschreiber merged commit 9ef0d58 into tediousjs:master Oct 12, 2017
@arthurschreiber
Collaborator

Heya @chdh,

thank you so much for this contribution! I cleaned up the test cases a bit and fixed one tiny issue with paused requests and connection closing, but I'm really happy to have these changes be finally part of tedious.

I'll review the other contributions that were made since the last release and push a new version including these changes as soon as possible. I'll also try to get streaming bulk data loads into another release shortly after.

Again, thank you so much for this contribution and sorry that it has taken so long to get this merged.

❤️

@chdh
Collaborator Author

chdh commented Oct 12, 2017

@arthurschreiber
Thanks for your cleanup, extended testing and improvements. I'm glad that this PR will make it into the next Tedious release.

@jschell12

Hey @arthurschreiber. Any word on when you'll release to npm?

@Suraiya-Hameed
Member

@chdh pause-resume-test fails occasionally in CI. I have restarted the Travis build a couple of times on a contributor's PR, but forgot to copy the error stack. If I notice it again, I will share the message so you can fix it.

@Suraiya-Hameed
Member

Here is the stack

AssertionError: false == true
    at Object.ok (/home/travis/build/tediousjs/tedious/node_modules/nodeunit/lib/types.js:83:39)
    at Timeout._onTimeout (/home/travis/build/tediousjs/tedious/test/integration/pause-resume-test.js:114:14)
    at ontimeout (timers.js:386:14)
    at tryOnTimeout (timers.js:250:5)
    at Timer.listOnTimeout (timers.js:214:5)

@arthurschreiber
Collaborator

@v-suhame this is actually my fault. 😥 I’ll open a pull request to fix the flaky tests soon.

@chdh chdh deleted the pause-resume branch November 8, 2017 14:15
@chdh
Collaborator Author

chdh commented Nov 8, 2017

@arthurschreiber @v-suhame I had similar effects with my version of the tests. I suggest repeating the isPaused() check a couple of times, with a delay before each retry.
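A possible shape for such a retry, as a minimal sketch: `waitUntil` is a hypothetical helper, and the `paused` flag below merely stands in for the real isPaused() check in the test file, which is not shown in this thread.

```javascript
// Hypothetical helper: retries a boolean check a few times with a delay.
// Resolves to true as soon as the check passes, false if all retries fail.
function waitUntil(check, { retries = 5, delayMs = 50 } = {}) {
  return new Promise((resolve) => {
    let attempt = 0;
    const tryOnce = () => {
      attempt++;
      if (check()) return resolve(true);
      if (attempt >= retries) return resolve(false);
      setTimeout(tryOnce, delayMs);
    };
    tryOnce();
  });
}

// Usage with a stand-in for the real isPaused() check:
let paused = false;
setTimeout(() => { paused = true; }, 20);   // the state settles a bit later
const result = waitUntil(() => paused, { retries: 10, delayMs: 10 });
```

The test would then assert on the resolved value instead of on a single sample taken at a fixed point in time.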

@hosseinGanjyar

hosseinGanjyar commented Dec 23, 2017

Is there a solution for my problem?
tediousjs/node-mssql#471
Or could you please write a sample of pause/resume for node-mssql#streaming?
Thanks a lot.

@chdh
Collaborator Author

chdh commented Dec 23, 2017

@hosseinGanjyar If pause/resume was implemented in node-mssql, you could call Request.pause() to stop the "row" events and Request.resume() to resume them. I think it would be easy to implement. Node-mssql could just forward the pause/resume calls to the Tedious driver. Without that, I can't see a way to call the pause/resume methods of the underlying Tedious driver.

@gsamal

gsamal commented Mar 12, 2019

@chdh I am using the upper layer library node-mssql, and they have added pause/resume on top of the tedious driver. It all seemed to be working until I noticed that even if the request is paused, it keeps fetching rows and adds them to memory.
An easy way to reproduce this is to hit pause after n rows and then do nothing. The memory consumption keeps growing. Does this pause/resume work on the socket/db-connection level? Or how does it work? Any idea why the records still seem to be stacking up in memory even though the request is paused?

@chdh
Collaborator Author

chdh commented Mar 12, 2019

@gsamal The pause mechanism works by applying back-pressure on the internal chain of streams. Each stream continues to buffer data until the high water mark is reached. Then it applies back-pressure to the next preceding stream in the chain.
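The buffering behaviour described above can be observed with plain Node.js streams. The following is a generic sketch and not the Tedious stream chain: it uses a `PassThrough` with an artificially small `highWaterMark` that nobody reads from, and counts how many chunks are accepted before `write()` signals back-pressure.

```javascript
const { PassThrough } = require('stream');

// A non-flowing stream with a small buffer limit, for demonstration only.
const stream = new PassThrough({ highWaterMark: 1024 });

let chunksAccepted = 0;
let keepWriting = true;
while (keepWriting) {
  // write() returns false once the internal buffers have filled up to
  // the high water mark; a well-behaved producer stops writing then.
  keepWriting = stream.write(Buffer.alloc(256));
  chunksAccepted++;
}
```

The stream still accepted several chunks before reporting back-pressure, which is why some growth in memory right after pause() is expected; it should level off once every buffer in the chain is full.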

Could you open a new issue and supply some more detailed information about the effect you are investigating? How much does the memory grow after pause() was called? The best way would be to provide an isolated test case that allows us to reproduce the problem.

@gsamal

gsamal commented Mar 13, 2019

@chdh Okay. But it seems to keep adding records to a buffer somewhere and to pause only the internal chain of streams. I can see a similar issue reported here: node-mssql#832. Can you please take a look at it?

In simpler terms: if I pause the request after getting n records and then monitor the network and memory, both keep increasing.

@arthurschreiber
Collaborator

@gsamal This was fixed via #878.

@arthurschreiber
Collaborator

The fix for this is part of the 6.0.1 release. 🎉
