Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terminal should provide API to notify data cleaned by control sequences #29840

Closed
dbaeumer opened this issue Jun 29, 2017 · 14 comments · Fixed by #36999
Closed

Terminal should provide API to notify data cleaned by control sequences #29840

dbaeumer opened this issue Jun 29, 2017 · 14 comments · Fixed by #36999
Assignees
Labels
debt Code quality issues terminal Integrated terminal issues
Milestone

Comments

@dbaeumer
Copy link
Member

I noticed that the terminal no sends Operating System Commands (OSC) like setting the window title. Right now the task runner filters them but this gets more and more complex and I am actually not an expert with this. So I would like to request that the terminal provides API to notify data without control sequences. Currently I used these two regular expressions to filter them:

	private static ANSI_CONTROL_SEQUENCE: RegExp = /\x1b[[()#;?]*(?:\d{1,4}(?:;\d{0,4})*)?[0-9A-ORZcf-nqry=><]/g;
	private static OPERATING_SYSTEM_COMMAND_SEQUENCE: RegExp = /\x1b[\]](?:.*)(?:\x07|\x1b\\)/g;
@vscodebot vscodebot bot added the api label Jun 29, 2017
@dbaeumer dbaeumer added terminal Integrated terminal issues feature-request Request for new features or functionality and removed api labels Jun 29, 2017
@Tyriar
Copy link
Member

Tyriar commented Jun 29, 2017

There are two unknowns to me right now with this:

  • Various control sequences can do a whole lot of stuff, so where is the best place to gather this data? It might be worth seeing if adding listeners to addChar and lineFeed will give what you need.
  • How do this efficiently. Because a lot of data can stream through the terminal, we should probably only enable this when it's explicitly requested (ie. only fire the event if something is listening). It should probably also be batched.

PR welcome upstream.

@Tyriar Tyriar added the upstream Issue identified as 'upstream' component related (exists outside of VS Code) label Jun 29, 2017
@Tyriar Tyriar added this to the Backlog milestone Jun 29, 2017
@dbaeumer
Copy link
Member Author

From a problem matcher point of view it would be enough sending the line. We could sent it when a linefeed is added. Then there is still the problem of a line content changing. We once discussed the idea of a line identifier to signal that. I agree that sending single characters might not be a smart idea in terms of performance.

@Tyriar
Copy link
Member

Tyriar commented Jun 30, 2017

If it's just on line feed then many characters could go missing, eg. "abc" "def" would result in "def" only. Maybe this is not something we care about, but it could lead to some obscure bugs if people relied on this new API.

@dbaeumer
Copy link
Member Author

dbaeumer commented Jul 6, 2017

This is a good example why something like a line model would be helpful: #22065 (comment).

After looking into the issue it looks like that pester constantly 'repaints' the whole screen hence the terminal sending the same line onData over and over again. For me it is very hard to tell whether I have seen that line and whether I have processed it already (the same content). What would be cool if the terminal could provide a line model where is line has an id (number) and changes are signaled on that model instead of sending raw data.

@Tyriar
Copy link
Member

Tyriar commented Jul 6, 2017

@dbaeumer I actually added an ID concept to the buffer when doing the selection stuff (xtermjs/xterm.js@65256c8) but I ended up not needing it and pulling it out, instead the selection listens to when the buffer is trimmed and adjusts the indexes of its own model.

It really didn't feel very good adding it as it introduced much more complexity to the already complex circular list and added the overhead of an addition object for every entry - which is a lot when you need to create tens of thousands of these things in an instant. There were other issues that come up with this like IDs being interleaved due to programs having the ability to delete/insert lines so there's no way I could think of to optimize it nicely, leading to a slower, more memory hungry terminal.

It would probably be easier and more performant to just deduplicate these errors on your end. For example you could record a hash for current problems and check it before adding new problems, or perhaps even better, record the raw terminal line of the error so you don't need to parse it again for duplicate problems.

@dbaeumer
Copy link
Member Author

dbaeumer commented Jul 7, 2017

I implemented the deduplicate yesterday but it feels like a hack on my side :-) I need to create a unique key for something that doesn't necessarily have one. I still think having a line bassed model comparable to what the editor exposes would be a nice thing. Me giving the raw data is comparable to the editor signaling raw cursor moves and text changes. Instead I get a nice model with text edit events. And I am not interested in color information like I am not interested in this on an editor buffer.

@dbaeumer
Copy link
Member Author

dbaeumer commented Jul 7, 2017

Just for the record. I can not filter on the line itself from the terminal without knowing if I have parsed that line already. Matchers are multiline aware and there could be two equal lines in one output on purpose since they difference is made by a line above. Me knowing this is implementing a line model on top of it :-)

@Tyriar
Copy link
Member

Tyriar commented Jul 7, 2017

@dbaeumer isn't the point of programs rewriting lines that the lines do change and that they may need to be reparsed?

Here's one idea that might work for both of us: exposing some data stream you can listen to as in the original comment, but only ever pushing data if it's on the last line in the terminal. Would this cover your use case? For the PS case #22065 (comment) you would get multiple lines for the last line but maybe this could be more clever somehow in recording when the cursor moves and only reporting the line a single time (not by using IDs but by only reporting once after a lineFeed call inserts a new line.

@dbaeumer
Copy link
Member Author

@Tyriar yes, it would be cool to be able to reparse a line when it changes. As I tried to outline in #29840 (comment) this needs to have some sort of model to do it right. I can't simply decide things on the content of a line. So instead me implementing a line model on top of ANSI control sequence characters I was somehow hoping that the terminal could expose its. I my simple mind the terminal works like this: program -> Output with control characters -> line model -> view model (does the wrapping, could do folding, ...) -> DOM.

I think the last line suggestion will work for parsing problems. However when exposing this as API in the extension host I think it might be a little to limited.

@Tyriar
Copy link
Member

Tyriar commented Jul 11, 2017

@dbaeumer currently there is only the line model, no view model. I did a big write up on introducing one to properly support reflowing lines here xtermjs/xterm.js#644 (comment)

This has been a pretty long discussion across a few issues but to clarify where we are now, here are my concerns:

  • Extra memory overhead in the buffer: I'm currently in the process of trying to push down the terminal's memory usage as I need to store more data against each character and memory usage right now is approaching unacceptable limits imo Terminal does not render true color  xtermjs/xterm.js#484 (comment)

    Here is a snapshot of a 160x24 terminal with 5000 scrollback filled which is not out of the ordinary, I typically have two of these running just in the window with Microsoft/vscode opened:

    image

  • Extra processing when data/lines are added to the buffer: back in December I made a big effort improving the terminal's performance Improve xterm.js performance #17875 by attacking it in a number of ways. The main take aways from this experience was that trying to write a high performance terminal in JavaScript/DOM is very hard because processing and displaying data is so expensive. I can now safely say that xterm.js is the fastest JS terminal out there by a long shot and I want us to maintain this status. A proper view model that can track lines that changes would amount to a significant amount of processing in the worst case which would be generally unnecessary for other use cases.

Some possible solutions:

  1. Introduce a new event that you can listen to that fires when a line feed is added to the last line, ie. only fire an event once per line and ignore lines being updated or lines being inserted.

    Pros: It doesn't involve storing any more data or doing anymore processing, other that pulling the string from the buffer.
    Cons: it probably isn't sufficient to cover all use cases.

  2. Add an ID to each buffer line (xtermjs/xterm.js@65256c8) and fire an event when there is a line feed against a line or the cursor moves away from it (specifics here would need to be figured out).

    Pros: It's a more complete solution and covers more use cases. We need an addition JS object and number associated with each line, this probably doesn't add up to that much data as it's on the line rather than the char.
    Cons: Additional complexity within the CircularList data structure. I'm not so sure just yet when exactly we should fire these events. The IDs will be out of order in some cases which may be confusing to consumers. These events may still fire when the line did not change (eg. if the program/script erased and rewrote the line).

  3. Introduce a proper view model that throws events only when a line changes.

    Pros: I think this is the ideal solution from your perspective.
    Cons: All the cons of number 2. The amount of string comparisons needed here is out of the question in the worst case with my current thinking, unless this was disabled by default and generally did not run on a lot of data.

Commentary on the solutions:

  1. It doesn't look like this is sufficient for problem matchers
  2. This could work, but I'm not sure it helps much with the issue in Help users to migrate to new tasks.json version 2.0.0. #22065 (comment)
  3. This behind an option on terminal initialization is probably the best way to go, but it's also the most difficult and time consuming to implement. Work on this is blocked on the following active PRs: Support truecolor xtermjs/xterm.js#756 & Introduce Buffer and BufferSet classes xtermjs/xterm.js#717, and it should be done in a way that accommodates future line reflowing Add logic for reflowing lines xtermjs/xterm.js#644 (comment)

@dbaeumer
Copy link
Member Author

dbaeumer commented Jul 12, 2017

@Tyriar thanks a lot for the detailed explanation. Having solution 2 will for sure make my life easier and will not make things worse on my end. Regarding option 3: I am not interested in the view model since I need lines unwrapped. The line model is what would work best for me assuming that the line model doesn't have any \n in there to make lines visually wrap .

@Tyriar
Copy link
Member

Tyriar commented Jul 12, 2017

@dbaeumer option 3 including reflow xtermjs/xterm.js#644 (comment) is required in order to get unwrapped lines

Not sure when I'll have time to work on this, pretty swamped atm. It might be a good opportunity to collaborate if this is high priority for you?

@dbaeumer
Copy link
Member Author

@Tyriar I will be out on vacation until 2. August. I think we need to address this sooner than later since we can't assume that problems always fit onto one line.

An interims solution could be as follows: can we introduce a custom control sequence (e.g. a operating system control sequence) that describes the new lines inserted to wrap the text. I could then filter them on my end since I would be able to distinguish them from normal newline. Would this be something you can add in July then I would catchup with it for the July build ?

@Tyriar
Copy link
Member

Tyriar commented Jul 13, 2017

@dbaeumer I actually expect you to get the unwrapped lines currently, xterm.js just wraps the lines when they are going to overflow https://github.com/sourcelair/xterm.js/blob/2ebc926f4e438c77c4da702b14f82f46424120a8/src/InputHandler.ts#L55, that's the only wrapping it knows about.

On Windows though I'm not sure rows are ever marked to be wrapped (not 100% sure), at least on PowerShell. This is just how winpty works.

@Tyriar Tyriar modified the milestones: Backlog, October 2017 Oct 26, 2017
@Tyriar Tyriar removed the upstream Issue identified as 'upstream' component related (exists outside of VS Code) label Oct 26, 2017
Tyriar added a commit that referenced this issue Oct 27, 2017
@Tyriar Tyriar added the debt Code quality issues label Oct 30, 2017
egamma pushed a commit that referenced this issue Oct 31, 2017
@Tyriar Tyriar removed the feature-request Request for new features or functionality label Nov 1, 2017
@vscodebot vscodebot bot locked and limited conversation to collaborators Dec 10, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
debt Code quality issues terminal Integrated terminal issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants