
custom flow control and discard limit #2122

Closed
wants to merge 22 commits

Conversation

@jerch (Member) commented May 24, 2019

This PR introduces a new flow control mechanism based on recent changes in node-pty. The flow control works similarly to the current XON/XOFF approach: special messages are written to the backend to indicate whether data streaming should be paused or resumed. The main difference is that the PAUSE/RESUME messages are customizable. They are filtered out in node-pty, which pauses/resumes the pty slave program via buffer back pressure.

Changes:

  • track write buffer state with a watermark
  • sanity check for fast incoming data with discard option (fixing the possible OOM)
  • high and low watermarks to pause/resume data flow
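The listed mechanics can be sketched roughly as follows. This is a minimal, hypothetical illustration only: the class name, the concrete limits, and the default PAUSE/RESUME strings are all assumptions, not the PR's actual code.

```typescript
// Hypothetical sketch of watermark-based flow control (all names and
// values are illustrative, not the actual xterm.js implementation).
const HIGH_WATERMARK = 131072;              // pause above this many buffered bytes
const LOW_WATERMARK = 32768;                // resume below this
const DISCARD_WATERMARK = 10 * 1024 * 1024; // safety cap: drop data beyond this

class FlowControlledBuffer {
  private _watermark = 0;
  private _paused = false;
  private _queue: string[] = [];

  constructor(
    private _sendToBackend: (msg: string) => void,
    private _pauseMsg = '\x13',   // XOFF-style default, customizable
    private _resumeMsg = '\x11',  // XON-style default, customizable
  ) {}

  write(data: string): void {
    // Safety measure: don't let a backend that ignores PAUSE exhaust memory.
    if (this._watermark + data.length > DISCARD_WATERMARK) {
      return; // discard
    }
    this._queue.push(data);
    this._watermark += data.length;
    if (!this._paused && this._watermark > HIGH_WATERMARK) {
      this._paused = true;
      this._sendToBackend(this._pauseMsg);
    }
  }

  // Called after a chunk has been parsed/rendered.
  consumed(bytes: number): void {
    this._watermark = Math.max(0, this._watermark - bytes);
    if (this._paused && this._watermark < LOW_WATERMARK) {
      this._paused = false;
      this._sendToBackend(this._resumeMsg);
    }
  }
}
```

The two-level watermark (high to pause, low to resume) gives hysteresis, so the terminal does not flap between PAUSE and RESUME on every chunk.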

TODO:

  • determine default strings for PAUSE/RESUME (should work for all common platforms/envs)
  • API options, which should go there?
  • documentation / example usage / special notes for non node-pty backends
  • tests

Fixes #2077, #1918.

- adding watermark for write buffers
- low and high watermark to toggle flow control
- sanity watermark to discard data
@jerch jerch added the work-in-progress Do not merge label May 24, 2019
@vincentwoo (Contributor)

Very interesting stuff!

src/Terminal.ts
this.writeBufferUtf8.push(data);
// safety measure: don't allow the backend to crash
// the terminal by writing too much data too fast.
if (this._watermark > DISCARD_WATERMARK) {
Member:

I don't think we need this check here, jsdoc in API (and more docs when we improve that) should be enough

@jerch (Member Author), May 25, 2019:

You mean no DISCARD check at all? It is meant to stop malicious or faulty backends (those that ignore the PAUSE) from crashing people's browsers. Are we sure we don't want that?

Member:

This could discard some important things and corrupt the terminal; for example, ls -lR / && vim may not enter the alt buffer. The main reason I don't think it's needed is that I haven't seen the demo crash for a long time, and we could avoid this critical-path check if people implement the protocol correctly.

@jerch (Member Author):

What I see with ./fast_producer in the demo on master:

  • FF: +2.5 GB / min, kernel eventually kills it for being this memory hungry (happens for me around minute 3 to 5 in several runs)
  • Chrome: +1 GB / min in the beginning, after 2-3 min this drops back to ~250 MB and is stable there, devtools die after 30s

No clue what's going on with Chrome; I think it cheats once a process hits some memory limit (killing the WebSocket? - idk, did not investigate further). Firefox shows the expected linear memory exhaustion.

I am a bit uneasy about not having this safety setting, since it might kill the browser (under FF the kernel killed the whole browser with all tabs). Maybe we should set DISCARD_WATERMARK even higher to avoid losing data in your example?

@jerch (Member Author) commented May 26, 2019

Did some fine tuning of the limits in the demo with the last commits (still contains debug logs).

There is a tradeoff between raw throughput and response to interrupts like Ctrl-C. Furthermore, fast producers behave differently than slower ones due to different buffer fill states.

The numbers I have found are a compromise: no/low negative impact on frequently used semi-fast producers like ls, while keeping an acceptable response time to Ctrl-C with fast ones (yes and ./fast_producer).

Currently I see the following:

  • ls string throughput slightly decreased (~ -7%)
  • ls utf8 throughput slightly increased (~ +10%)
  • Ctrl-C on ls is immediate
  • Ctrl-C on yes shows a tiny bit of latency
  • Ctrl-C on ./fast_producer shows a bigger but still acceptable latency

To get there I also had to introduce a buffer cap in the server.js pty --> xterm buffer, otherwise chunks from a running ./fast_producer get too big within 5 ms and the terminal shows lag (decreased FPS).
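Such a cap on the pty -> xterm path could look roughly like this. A hypothetical sketch only: the demo's actual server.js code differs, and all names and values here are assumptions.

```typescript
// Hypothetical sketch of capping the pty -> xterm buffer: data is batched
// for a short interval, but flushed early once the batch grows too large.
const MAX_CHUNK = 65536;   // flush early once this many bytes are buffered
const FLUSH_INTERVAL = 5;  // ms batching window

function makeBufferedSender(send: (data: string) => void) {
  let buffer = '';
  let timer: ReturnType<typeof setTimeout> | null = null;

  const flush = () => {
    if (timer) { clearTimeout(timer); timer = null; }
    if (buffer.length) { send(buffer); buffer = ''; }
  };

  // Call this for every chunk coming from the pty.
  return (data: string) => {
    buffer += data;
    if (buffer.length >= MAX_CHUNK) {
      flush(); // cap: don't let the 5 ms window accumulate huge chunks
    } else if (!timer) {
      timer = setTimeout(flush, FLUSH_INTERVAL);
    }
  };
}
```

Without the size cap, a fast producer can accumulate megabytes inside a single 5 ms window, and the terminal then has to parse one giant chunk at once.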

@Tyriar Could you test whether these numbers will do on macOS? With its bigger pty buffer, yes and ./fast_producer might show a big Ctrl-C latency again.

@Tyriar (Member) commented May 26, 2019

With Ctrl+C on yes on macOS I see the server pause immediately (logs stop), and then it takes about 2 seconds before the renderer catches up.

@jerch (Member Author) commented May 26, 2019

@Tyriar Hmm, 2 s is quite long, but I cannot lower that much further without sacrificing throughput. Can we live with it taking this long? IMHO yes is already an extreme example of a fast producer; things that typically have longer run times with quite noisy output are build tools, and those should work pretty well with these settings.

@Tyriar (Member) commented May 27, 2019

@jerch I wasn't actually testing it right; it works great when I toggle useFlowControl on after 34c3a77

@Tyriar Tyriar added this to the 3.14.0 milestone May 29, 2019
@Tyriar Tyriar removed the work-in-progress Do not merge label May 29, 2019
@Tyriar (Member) commented May 29, 2019

I don't think this is a WIP anymore, right?

fast_producer.c

#include <stdio.h>

int main(int argc, char **argv) {
  while (1) {
    putchar('#');
  }
  return 0;
}
Member:

yes seems to work fine, and I see the pausing and catching up in the log of yarn start. fast_producer, however, really starts to chug, and it's hard to tell if the pausing is working at all, as all the logs say 1024.

@jerch (Member Author):

Additionally, I placed a console.log(this._watermark) in write to test the flow with fast_producer.
I was expecting this to run less responsively on macOS: macOS has a dynamic pty buffer size depending on incoming data pressure (typically 16-64 kB). Thus a single message will be ~64 kB, a lot for xterm.js to chew in one step.
fast_producer is an extreme example and IMHO not worth optimizing for. Since yes runs ok for you, I think the numbers will do as they are?

@Tyriar Tyriar removed this from the 3.14.0 milestone May 30, 2019
@jerch (Member Author) commented May 31, 2019

@Tyriar Added a better fast producer snippet. This is the fastest I was able to find with some more structured output (almost twice as fast as yes).

Edit: Wow, the PAUSE signal takes really long to get through in the demo; the watermark easily goes up to 5 MB, although the signal gets sent at 120 kB, WTH. This means that with slightly longer connection latency the watermark will easily go over 10 MB.

Edit 2: The reason is fairly simple: server.js has already sent 2-7 MB before the PAUSE signal arrives in server.js. Once it's seen, the pty blocks rather fast. With worse latency the amount of already-sent data will explode. IMHO we can only circumvent this with the ACK idea.

src/Terminal.ts
/**
 * send ACK every ACK_WATERMARK-th byte
 */
const ACK_WATERMARK = 131072; // 2^17 (previously 524288 = 2^19)
Member:

I think we may want to instead send some request-ack sequence from the server; once it is hit in the parser, we can send an ack back. Extra awesome super-duper benefit: this could all be done via the xterm.js API with a custom handler 😮

So server asks '\x1b^reqack;1\x1b\\' (not sure if an ID, the 1, is needed or not yet), then our custom ESC handler sends back '\x1b^ack;1\x1b\\'.
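Server-side, the reqack/ack round trip could drive flow control roughly like this. A hypothetical sketch of the idea only, not code from this PR: the escape strings come from the comment above, while the class, handler names, and limits are all assumptions.

```typescript
// Hypothetical sketch of ACK-based flow control on the server side:
// request an ack every ACK_EVERY bytes and pause the pty when too much
// unacknowledged data is in flight.
const ACK_EVERY = 131072;     // request an ack every 128 kB
const MAX_PENDING_ACKS = 4;   // pause once this many acks are outstanding

class AckFlowController {
  private _bytesSinceAck = 0;
  private _pendingAcks = 0;
  private _nextId = 1;

  constructor(
    private _pty: { pause(): void; resume(): void },
    private _send: (data: string) => void,
  ) {}

  // Called for every chunk flowing pty -> browser.
  onPtyData(chunk: string): void {
    this._send(chunk);
    this._bytesSinceAck += chunk.length;
    if (this._bytesSinceAck >= ACK_EVERY) {
      this._bytesSinceAck = 0;
      this._pendingAcks++;
      this._send(`\x1b^reqack;${this._nextId++}\x1b\\`); // ask terminal to ack
      if (this._pendingAcks >= MAX_PENDING_ACKS) {
        this._pty.pause(); // too much unacknowledged data in flight
      }
    }
  }

  // Called when the terminal's '\x1b^ack;<id>\x1b\\' reply arrives.
  onAck(): void {
    if (this._pendingAcks > 0) this._pendingAcks--;
    if (this._pendingAcks < MAX_PENDING_ACKS) this._pty.resume();
  }
}
```

Because the server stops after a bounded number of unacknowledged windows, the amount of in-flight data stays capped regardless of connection latency, which is exactly what the plain PAUSE message cannot guarantee.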

@jerch (Member Author) commented Jul 6, 2019

Closing this PR for now. We still have to work out the details for the offscreen core lib (some downgraded Terminal.ts thingy), probably with some input-service-like thingy as well. Any attempts towards better flow control should go there once we have that.

@jerch jerch closed this Jul 6, 2019
@jerch jerch added the reference A closed issue/pr that is expected to be useful later as a reference label Jul 6, 2019
@Tyriar Tyriar removed this from the 4.0.0 milestone Aug 22, 2019
Linked issue: Flow control/back pressure