Terminal should provide API to notify data cleaned by control sequences #29840

dbaeumer · 2017-06-29T10:45:55Z

I noticed that the terminal now sends Operating System Commands (OSC), such as setting the window title. Right now the task runner filters them out, but this is getting more and more complex and I am not an expert in this area. So I would like to request that the terminal provide an API that delivers data with control sequences stripped. Currently I use these two regular expressions to filter them:

	private static ANSI_CONTROL_SEQUENCE: RegExp = /\x1b[[()#;?]*(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-ORZcf-nqry=><]/g;
	private static OPERATING_SYSTEM_COMMAND_SEQUENCE: RegExp = /\x1b[\]](?:.*)(?:\x07|\x1b\\)/g;
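
For illustration, here is a minimal sketch of how the two expressions above could be applied to a chunk of terminal data. The helper name `stripControlSequences` and the sample input are assumptions made for this example; they are not taken from the task runner.

```typescript
// The two expressions quoted above, unchanged.
const ANSI_CONTROL_SEQUENCE = /\x1b[[()#;?]*(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-ORZcf-nqry=><]/g;
const OPERATING_SYSTEM_COMMAND_SEQUENCE = /\x1b[\]](?:.*)(?:\x07|\x1b\\)/g;

// Hypothetical helper (not VS Code code): removes OSC sequences first,
// then any remaining ANSI control sequences.
function stripControlSequences(data: string): string {
    return data
        .replace(OPERATING_SYSTEM_COMMAND_SEQUENCE, '')
        .replace(ANSI_CONTROL_SEQUENCE, '');
}

// Sample input: an OSC "set window title" followed by colored text.
const raw = '\x1b]0;my title\x07hello \x1b[31mred\x1b[0m';
const cleaned = stripControlSequences(raw);
console.log(JSON.stringify(cleaned));
```

One caveat with this approach: the `.*` in the OSC expression is greedy, so if two OSC sequences appear on the same line, the text between them is removed as well.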

Tyriar · 2017-06-29T15:19:26Z

There are two unknowns to me right now with this:

1. Various control sequences can do a whole lot of stuff, so where is the best place to gather this data? It might be worth seeing if adding listeners to addChar and lineFeed will give what you need.
2. How to do this efficiently. Because a lot of data can stream through the terminal, we should probably only enable this when it's explicitly requested (i.e. only fire the event if something is listening). It should probably also be batched.
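
The two efficiency ideas (do nothing unless something is listening, and deliver data in batches) could look roughly like the sketch below. `BatchedLineEmitter` and its methods are hypothetical names for this example, not an existing xterm.js or VS Code API.

```typescript
type LineListener = (lines: string[]) => void;

// Hypothetical class for illustration only.
class BatchedLineEmitter {
    private listeners: LineListener[] = [];
    private pending: string[] = [];

    // Registers a listener and returns a function that removes it again.
    onLines(listener: LineListener): () => void {
        this.listeners.push(listener);
        return () => {
            this.listeners = this.listeners.filter(l => l !== listener);
        };
    }

    // Called for every completed line; skips all work when nobody listens.
    push(line: string): void {
        if (this.listeners.length === 0) {
            return;
        }
        this.pending.push(line);
    }

    // Delivers all pending lines in a single call per listener.
    flush(): void {
        if (this.pending.length === 0) {
            return;
        }
        const batch = this.pending;
        this.pending = [];
        for (const listener of this.listeners) {
            listener(batch);
        }
    }
}

const emitter = new BatchedLineEmitter();
emitter.push('dropped: no listener registered yet');

const received: string[][] = [];
const dispose = emitter.onLines(lines => { received.push(lines); });
emitter.push('a');
emitter.push('b');
emitter.flush();   // one batch containing 'a' and 'b'

dispose();
emitter.push('c'); // ignored again, no listeners left
emitter.flush();
```

In a real terminal, `flush` would presumably be scheduled (for example on a timer) rather than called by hand; that scheduling is left out here to keep the sketch deterministic.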

PR welcome upstream.

dbaeumer · 2017-06-30T07:33:05Z

From a problem matcher point of view, sending the line would be enough. We could send it when a line feed is added. Then there is still the problem of a line's content changing. We once discussed the idea of a line identifier to signal that. I agree that sending single characters might not be a smart idea in terms of performance.

Tyriar · 2017-06-30T16:45:28Z

If it's just on line feed then many characters could go missing, eg. "abc" "def" would result in "def" only. Maybe this is not something we care about, but it could lead to some obscure bugs if people relied on this new API.

dbaeumer · 2017-07-06T14:36:53Z

This is a good example why something like a line model would be helpful: #22065 (comment).

After looking into the issue, it looks like pester constantly 'repaints' the whole screen, so the terminal sends the same line through onData over and over again. It is very hard for me to tell whether I have already seen and processed a line with the same content. It would be cool if the terminal could provide a line model where each line has an id (number) and changes are signaled on that model instead of sending raw data.
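
To make the request concrete, here is a rough sketch of what such a line model might look like from a consumer's point of view. Everything in it (`LineModel`, `LineChange`, `onDidChange`) is a hypothetical shape invented for this example; nothing like it exists in xterm.js.

```typescript
// Hypothetical shapes for illustration only.
interface LineChange {
    id: number;                 // stable identifier of the line
    text: string;               // current content, without control sequences
    kind: 'added' | 'changed';
}

class LineModel {
    private nextId = 1;
    private lines = new Map<number, string>();
    private listeners: Array<(change: LineChange) => void> = [];

    onDidChange(listener: (change: LineChange) => void): void {
        this.listeners.push(listener);
    }

    // Appends a new line and returns its id.
    addLine(text: string): number {
        const id = this.nextId++;
        this.lines.set(id, text);
        this.fire({ id, text, kind: 'added' });
        return id;
    }

    // Rewrites an existing line; repainting identical content fires nothing.
    setLine(id: number, text: string): void {
        if (!this.lines.has(id)) {
            throw new Error('unknown line id ' + id);
        }
        if (this.lines.get(id) === text) {
            return;
        }
        this.lines.set(id, text);
        this.fire({ id, text, kind: 'changed' });
    }

    private fire(change: LineChange): void {
        for (const listener of this.listeners) {
            listener(change);
        }
    }
}

const model = new LineModel();
const changes: LineChange[] = [];
model.onDidChange(change => { changes.push(change); });

const lineId = model.addLine('error: foo');
model.setLine(lineId, 'error: foo'); // repaint with the same content: no event
model.setLine(lineId, 'error: bar'); // real change: one 'changed' event
```

Note that suppressing repaints this way relies on a string comparison per rewrite, which has a cost of its own on large outputs.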

Tyriar · 2017-07-06T17:45:59Z

@dbaeumer I actually added an ID concept to the buffer when doing the selection stuff (xtermjs/xterm.js@65256c8) but I ended up not needing it and pulling it out, instead the selection listens to when the buffer is trimmed and adjusts the indexes of its own model.

It really didn't feel very good adding it, as it introduced much more complexity to the already complex circular list and added the overhead of an additional object for every entry, which is a lot when you need to create tens of thousands of these things in an instant. There were other issues that came up with this, like IDs being interleaved due to programs having the ability to delete/insert lines, so there's no way I could think of to optimize it nicely, leading to a slower, more memory hungry terminal.

It would probably be easier and more performant to just deduplicate these errors on your end. For example you could record a hash for current problems and check it before adding new problems, or perhaps even better, record the raw terminal line of the error so you don't need to parse it again for duplicate problems.
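
A minimal sketch of the second suggestion (recording the raw terminal line so a repeated line is not parsed again) might look like this. The class name `SeenLines` and the sample output lines are made up for the example.

```typescript
// Hypothetical helper: remembers raw terminal lines that were already handled.
class SeenLines {
    private seen = new Set<string>();

    // Returns true the first time a raw line is offered, false for repeats.
    markIfNew(rawLine: string): boolean {
        if (this.seen.has(rawLine)) {
            return false;
        }
        this.seen.add(rawLine);
        return true;
    }
}

const seen = new SeenLines();

// Made-up sample output in which the second line is a repaint of the first.
const output = [
    'a.ts(1,1): error X',
    'a.ts(1,1): error X',
    'b.ts(2,4): error Y',
];

// Only lines that have not been seen before need to be parsed.
const toParse = output.filter(line => seen.markIfNew(line));
```

Keying on content alone cannot tell a repaint apart from two intentionally identical lines, so this is only suitable where that distinction does not matter.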

dbaeumer · 2017-07-07T09:48:14Z

I implemented the deduplication yesterday, but it feels like a hack on my side :-) I need to create a unique key for something that doesn't necessarily have one. I still think having a line-based model comparable to what the editor exposes would be a nice thing. Giving me the raw data is comparable to the editor signaling raw cursor moves and text changes. Instead, from the editor I get a nice model with text edit events. And I am not interested in color information, just as I am not interested in it in an editor buffer.

dbaeumer · 2017-07-07T09:49:56Z

Just for the record: I cannot filter on the line itself from the terminal without knowing whether I have parsed that line already. Matchers are multiline aware, and there could be two equal lines in one output on purpose, since the difference is made by a line above. Me knowing this amounts to implementing a line model on top of it :-)

Tyriar · 2017-07-07T15:30:53Z

@dbaeumer isn't the point of programs rewriting lines that the lines do change and that they may need to be reparsed?

Here's one idea that might work for both of us: exposing some data stream you can listen to, as in the original comment, but only ever pushing data if it's on the last line in the terminal. Would this cover your use case? For the PS case #22065 (comment) you would get multiple lines for the last line, but maybe this could be more clever somehow in recording when the cursor moves and only reporting the line a single time (not by using IDs but by only reporting once after a lineFeed call inserts a new line).
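
The "report the last line once, on line feed" idea can be sketched as follows. `LastLineTracker` is a hypothetical, heavily simplified stand-in for the terminal buffer: it only understands carriage return and line feed, and ignores all other cursor movement.

```typescript
// Hypothetical tracker, not xterm.js code: keeps only the current (last) line
// and reports it once, when a line feed completes it.
class LastLineTracker {
    private chars: string[] = [];
    private column = 0;
    private onLine: (line: string) => void;

    constructor(onLine: (line: string) => void) {
        this.onLine = onLine;
    }

    write(data: string): void {
        for (let i = 0; i < data.length; i++) {
            const ch = data.charAt(i);
            if (ch === '\n') {
                // Line feed: report the finished line exactly once.
                this.onLine(this.chars.join(''));
                this.chars = [];
                this.column = 0;
            } else if (ch === '\r') {
                // Carriage return: later characters overwrite from column 0.
                this.column = 0;
            } else {
                this.chars[this.column++] = ch;
            }
        }
    }
}

const reported: string[] = [];
const tracker = new LastLineTracker(line => { reported.push(line); });

tracker.write('abc\rdef\n');                      // 'abc' is overwritten before the line feed
tracker.write('progress 10%\rprogress 99%\r\n');  // only the final state is reported
```

With this scheme a consumer never sees the intermediate `abc` or `progress 10%` states, which is the trade-off mentioned earlier in the thread.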

dbaeumer · 2017-07-10T07:02:41Z

@Tyriar yes, it would be cool to be able to reparse a line when it changes. As I tried to outline in #29840 (comment), this needs some sort of model to do it right. I can't simply decide things based on the content of a line. So instead of me implementing a line model on top of ANSI control sequence characters, I was hoping that the terminal could expose its own. In my simple mind the terminal works like this: program -> output with control characters -> line model -> view model (does the wrapping, could do folding, ...) -> DOM.

I think the last line suggestion will work for parsing problems. However, when exposing this as API in the extension host, I think it might be a little too limited.

Tyriar · 2017-07-11T18:58:52Z

@dbaeumer currently there is only the line model, no view model. I did a big write up on introducing one to properly support reflowing lines here xtermjs/xterm.js#644 (comment)

This has been a pretty long discussion across a few issues but to clarify where we are now, here are my concerns:

- Extra memory overhead in the buffer: I'm currently in the process of trying to push down the terminal's memory usage, as I need to store more data against each character and memory usage right now is approaching unacceptable limits imo (see "Terminal does not render true color", xtermjs/xterm.js#484 (comment)).

  Here is a snapshot of a 160x24 terminal with 5000 scrollback filled, which is not out of the ordinary; I typically have two of these running just in the window with Microsoft/vscode opened: [image: memory snapshot]

- Extra processing when data/lines are added to the buffer: back in December I made a big effort to improve the terminal's performance ("Improve xterm.js performance", #17875) by attacking it in a number of ways. The main takeaway from this experience was that trying to write a high performance terminal in JavaScript/DOM is very hard because processing and displaying data is so expensive. I can now safely say that xterm.js is the fastest JS terminal out there by a long shot and I want us to maintain this status. A proper view model that can track lines that change would amount to a significant amount of processing in the worst case, which would be generally unnecessary for other use cases.

Some possible solutions:

1. Introduce a new event that you can listen to that fires when a line feed is added to the last line, i.e. only fire an event once per line and ignore lines being updated or lines being inserted.

   Pros: It doesn't involve storing any more data or doing any more processing, other than pulling the string from the buffer.
   Cons: It probably isn't sufficient to cover all use cases.

2. Add an ID to each buffer line (xtermjs/xterm.js@65256c8) and fire an event when there is a line feed against a line or the cursor moves away from it (specifics here would need to be figured out).

   Pros: It's a more complete solution and covers more use cases. We need an additional JS object and number associated with each line; this probably doesn't add up to that much data as it's on the line rather than the char.
   Cons: Additional complexity within the CircularList data structure. I'm not so sure just yet when exactly we should fire these events. The IDs will be out of order in some cases, which may be confusing to consumers. These events may still fire when the line did not change (e.g. if the program/script erased and rewrote the line).

3. Introduce a proper view model that fires events only when a line changes.

   Pros: I think this is the ideal solution from your perspective.
   Cons: All the cons of number 2. The amount of string comparisons needed here is out of the question in the worst case with my current thinking, unless this was disabled by default and generally did not run on a lot of data.

Commentary on the solutions:

1. It doesn't look like this is sufficient for problem matchers.
2. This could work, but I'm not sure it helps much with the issue in "Help users to migrate to new tasks.json version 2.0.0." #22065 (comment).
3. Putting this behind an option on terminal initialization is probably the best way to go, but it's also the most difficult and time consuming to implement. Work on this is blocked on the following active PRs: "Support truecolor" (xtermjs/xterm.js#756) and "Introduce Buffer and BufferSet classes" (xtermjs/xterm.js#717), and it should be done in a way that accommodates future line reflowing ("Add logic for reflowing lines", xtermjs/xterm.js#644 (comment)).

dbaeumer · 2017-07-12T07:29:59Z

@Tyriar thanks a lot for the detailed explanation. Having solution 2 will for sure make my life easier and will not make things worse on my end. Regarding option 3: I am not interested in the view model, since I need lines unwrapped. The line model is what would work best for me, assuming that the line model doesn't contain any \n inserted to make lines wrap visually.

Tyriar · 2017-07-12T17:16:27Z

@dbaeumer option 3 including reflow xtermjs/xterm.js#644 (comment) is required in order to get unwrapped lines

Not sure when I'll have time to work on this, pretty swamped atm. It might be a good opportunity to collaborate if this is high priority for you?

dbaeumer · 2017-07-13T09:41:32Z

@Tyriar I will be out on vacation until August 2. I think we need to address this sooner rather than later, since we can't assume that problems always fit on one line.

An interim solution could be as follows: can we introduce a custom control sequence (e.g. an operating system control sequence) that describes the new lines inserted to wrap the text? I could then filter them on my end, since I would be able to distinguish them from normal newlines. Would this be something you can add in July? Then I would catch up with it for the July build.

Tyriar · 2017-07-13T17:24:29Z

@dbaeumer I actually expect you to get the unwrapped lines currently, xterm.js just wraps the lines when they are going to overflow https://github.com/sourcelair/xterm.js/blob/2ebc926f4e438c77c4da702b14f82f46424120a8/src/InputHandler.ts#L55, that's the only wrapping it knows about.

On Windows though I'm not sure rows are ever marked to be wrapped (not 100% sure), at least on PowerShell. This is just how winpty works.


vscodebot bot added the api label Jun 29, 2017

dbaeumer added the terminal and feature-request labels and removed the api label Jun 29, 2017

dbaeumer assigned Tyriar Jun 29, 2017

dbaeumer mentioned this issue Jun 29, 2017

Node 7 sends ASCII control characters not handled by the TerminalDecoder #29606

Closed

Tyriar added the upstream label Jun 29, 2017

Tyriar added this to the Backlog milestone Jun 29, 2017

dbaeumer mentioned this issue Jul 6, 2017

Help users to migrate to new tasks.json version 2.0.0. #22065

Closed

This was referenced Jul 12, 2017

Problem matchers can't match correctlly if output line is longer than width of Integrated Terminal #30550

Closed

Buffer performance improvements xtermjs/xterm.js#791

Closed

roblourens mentioned this issue Aug 5, 2017

Wrapped lines lead to wrong problems #32042

Closed

Tyriar modified the milestones: Backlog, October 2017 Oct 26, 2017

Tyriar removed the upstream label Oct 26, 2017

Tyriar added a commit that referenced this issue Oct 26, 2017

Provide an ITerminalInstance.onLineData internal API

8bf87ab

Fixes #29840

Tyriar mentioned this issue Oct 26, 2017

Provide an ITerminalInstance.onLineData internal API #36999

Merged

Tyriar closed this as completed in #36999 Oct 26, 2017

Tyriar added a commit that referenced this issue Oct 27, 2017

Send last line terminal data when process exits

7b5af04

Part of #32042 Part of #29840

Tyriar added the debt label Oct 30, 2017

egamma pushed a commit that referenced this issue Oct 31, 2017

Send last line terminal data when process exits

9b656f3

Part of #32042 Part of #29840

Tyriar removed the feature-request label Nov 1, 2017

vscodebot bot locked and limited conversation to collaborators Dec 10, 2017
