[DRAFT] add til::clump<T>, some weird data structure #7220
I have some comments on this, based on some refactoring work I was experimenting with for the parameter handling in general. I had to put that refactoring on hold to get the SGR refactoring out of the way first (and that led me down the rabbit hole of the conpty colors), but at some point I would like to get back to it. To give you a bit of context, my goal was to try and simplify the way parameters were accessed in the [...] The idea was that the parms class would automatically handle any missing or default values, and eliminate the need for literally hundreds of lines of those boilerplate parameter extraction methods. The plan was also for this class to support looping over sequences of selective parameters, so we could more easily address issues like #2101. I hadn't got to the point of dealing with the colon subparameters, but I was thinking about how they might eventually be incorporated, and I came to the following conclusions:
Given those constraints, I think this [...] I should also mention some thoughts I had on the underlying structure. One of my ideas was that we could reserve a bit in each parameter value to signal whether it was the start of a subsequence or not, and that way we could avoid the need for two lists. I don't know whether that was a good idea - just throwing it out there. And one other thing. I was thinking at some point that we should make the parameter list a fixed length. The DEC standard recommends supporting up to 16, and most terminal emulators only support around 32. Allowing unlimited parameters sounds nice in theory, but it's a potential attack vector for exhausting memory. Not hugely important, but something to consider while we're evaluating data structures for this.
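To make the two ideas above concrete, here is a minimal sketch of a fixed-capacity parameter list that keeps a single array and reserves the top bit of each slot as a "starts a new group" marker. Everything in it is an assumption made for illustration: the names `param_list_sketch`, `kGroupStartFlag` and `kMaxParams`, the cap of 32, the 31-bit value range, and the reading of the reserved bit as "this value begins a new parameter group". It is not code from the PR or from the terminal's parser.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical constants: the top bit marks the first value of a group,
// the remaining 31 bits hold the parameter value itself.
constexpr std::uint32_t kGroupStartFlag = 0x80000000u;
constexpr std::uint32_t kValueMask = 0x7FFFFFFFu;
constexpr std::size_t kMaxParams = 32; // assumed cap, in line with the "around 32" mentioned above

struct param_list_sketch
{
    std::array<std::uint32_t, kMaxParams> slots{};
    std::size_t count = 0;

    // Appends a value. Returns false (and stores nothing) once the fixed
    // capacity is reached, so hostile input cannot grow the list.
    bool push(std::uint32_t value, bool starts_group)
    {
        if (count >= kMaxParams)
        {
            return false;
        }
        slots[count++] = (value & kValueMask) | (starts_group ? kGroupStartFlag : 0u);
        return true;
    }

    std::uint32_t value(std::size_t i) const { return slots[i] & kValueMask; }
    bool starts_group(std::size_t i) const { return (slots[i] & kGroupStartFlag) != 0; }

    // Number of values in the group that begins at index i: the value at i
    // plus every following value that does not start a group of its own.
    std::size_t group_length(std::size_t i) const
    {
        std::size_t end = i + 1;
        while (end < count && !starts_group(end))
        {
            ++end;
        }
        return end - i;
    }
};
```

With this layout a sequence such as `38:2;1` would be stored as three slots, with the flag set on `38` and `1`. The trade-off is that the usable value range shrinks by one bit and every read has to mask the flag off.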
That's an interesting take, and yeah -- I think it would be great to have something that supports specifically VT parsing with all its optional parameters. We're definitely overdue for a refactor there, and you're right that we could evict a good amount of boilerplate with a better type. The guiding principles behind
I don't necessarily love that this is a too-general solution applied to a too-specific problem, but I do appreciate how minimal the delta is in StateMachine/the engines 😄 Incidentally, because of the decay of a list of length=1 units into a set of spans of length=1 I'd updated the SGR dispatcher to operate on a single SGR parameter pack at a time. It seems like the right thing to do, but it does rather involve (I apparently left while writing this and came back and never finished it. I haven't the foggiest idea what I was going for.) Perhaps there's a staged approach?
Maybe I need to wait to see how it's going to be used, but I got the impression that the index would be returning a span. And that would mean that everywhere we're currently doing something like [...] We'd also need to check the size of the returned span and fail for values greater than one. And in the case of parameter sequences it's even more complicated (although perhaps that's an edge case that could initially be ignored). The bottom line is that it doesn't sound like zero additional cost, unless I've misunderstood something.
Now that's something I hadn't considered. I think that's a good enough argument to convince me it's worth going ahead with this regardless of any other concerns I might have.
This is the bit that I have my doubts about, but I'm happy to wait and see how it turns out.
Yeah, there's no need to hold up anything for the refactoring - I just wanted to let you know what I had planned and make sure we were on the same page. And I wouldn't worry about type aliases either. I'd just go right ahead with integrating the clustered_vector into the StateMachine and see what it takes to make the extended SGR colors work (assuming you haven't already done that).
Oh, yeah, that was a tongue-in-cheek comment at the expense of my own code. I don't think that the thing I did was reasonable (edit: but it was expedient for the purposes of prototyping). Playing with xterm a bit, it looks like sequences that do not expect subparameters are rejected when they have subparameters. That does complicate things a bit, since (as you rightly assert) we'd need to introduce length checks. Have to do a bit more investigation to see whether I'm comfortable committing to this. It might be better-suited for DWrite than VT parsing, and I'm treating it as a hammer for any nail-shaped object.
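For reference, the length check being discussed might look something like the sketch below. The `param_group` view and the `try_get_single` helper are hypothetical names invented for this illustration. Rejecting any group with more than one value mirrors the xterm behaviour observed above, and is not a confirmed design for this code base.

```cpp
#include <cstddef>

// Hypothetical non-owning view over one parameter group
// (a stand-in for the span that indexing would return).
struct param_group
{
    const int* data;
    std::size_t size;
};

// Reads a parameter that is not supposed to carry subparameters.
// - empty group: the caller's default is used
// - exactly one value: that value is used
// - more than one value: the sequence is rejected and `out` is left untouched
bool try_get_single(param_group group, int default_value, int& out)
{
    if (group.size > 1)
    {
        return false;
    }
    out = (group.size == 0) ? default_value : group.data[0];
    return true;
}
```

Every dispatch site that currently reads a plain value would need a call of this shape, which is the extra cost raised earlier in the thread.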
Clump isn't currently the right solution.
Summary of the Pull Request
clump is a vector intended to be consumed in chunks.
It is stored as a vector of T with an optional vector
of lengths as a sidecar.
If the length vector is missing, it is assumed that
each component is of length 1.
During iteration, this clump will produce three spans:
During iteration, this clump will produce four spans:
It's designed like this to be approximately as performant as a vector in the normal case (where there are no sizes specified).
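The storage scheme described above can be sketched roughly as follows. This is an illustrative approximation only, not the `til::clump<T>` from this PR: the names `clump_sketch`, `chunk_view`, `push_back`, `push_glom`, `chunks` and `chunk_count` are all made up here, and a small pointer-plus-length struct stands in for a real span type.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical non-owning view of one chunk (stand-in for a span).
template <typename T>
struct chunk_view
{
    const T* data;
    std::size_t size;
};

template <typename T>
class clump_sketch
{
public:
    // Appends a value as its own chunk. In the common case the size
    // sidecar stays empty and this is just a vector push_back.
    void push_back(T value)
    {
        if (!sizes_.empty())
        {
            sizes_.push_back(1);
        }
        data_.push_back(std::move(value));
    }

    // Appends a value to the last chunk. The sidecar is only materialised
    // the first time a chunk grows beyond length 1.
    void push_glom(T value)
    {
        if (data_.empty())
        {
            push_back(std::move(value));
            return;
        }
        if (sizes_.empty())
        {
            sizes_.assign(data_.size(), 1);
        }
        data_.push_back(std::move(value));
        ++sizes_.back();
    }

    std::size_t chunk_count() const
    {
        return sizes_.empty() ? data_.size() : sizes_.size();
    }

    // Produces one view per chunk. A missing sidecar means every chunk has
    // length 1. The views are invalidated by any later modification.
    std::vector<chunk_view<T>> chunks() const
    {
        std::vector<chunk_view<T>> result;
        std::size_t offset = 0;
        for (std::size_t i = 0; offset < data_.size(); ++i)
        {
            const std::size_t len = sizes_.empty() ? 1 : sizes_[i];
            result.push_back(chunk_view<T>{ data_.data() + offset, len });
            offset += len;
        }
        return result;
    }

private:
    std::vector<T> data_;
    std::vector<std::size_t> sizes_; // optional sidecar; empty means "all 1"
};
```

A real implementation would more likely expose an iterator than build a vector of views, but the sketch shows why the no-sizes case costs little more than a plain vector: nothing beyond the element storage is allocated until a chunk longer than one element appears.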
TODO:
References
PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed