Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec for Windows Terminal Process Model 2.0 #7240

Merged
merged 32 commits into from
Feb 5, 2021
Merged

Conversation

zadjii-msft
Copy link
Member

@zadjii-msft zadjii-msft commented Aug 10, 2020

doc link

Summary of the Pull Request

This spec is exceptionally long, and is currently a work in progress. There are a few more things I'd like to have experimentally verified (though, I'm fairly certain they will work, with the right combination of flags and such). Additionally, a few sections have remaining TODOs before the spec is finished. However, this spec is already fairly long, and I want to give people as much time to get their eyes on it as possible.

Abstract

The Windows Terminal currently exists as a single process per window, with one
connection per terminal pane (which could be an additional conpty process and
associated client processes). This model has proven effective for the simple
windowing we've done so far. However, in order to support scenarios like
dragging tabs into other windows, or having one top-level window with different
elevation levels within it, this single process model will not be sufficient.

This spec outlines changes to the Terminal process model in order to enable the
following scenarios:

PR Checklist

Detailed Description of the Pull Request / Additional comments

*** read the spec ***

@zadjii-msft zadjii-msft added Issue-Docs It's a documentation issue that really should be on MicrosoftDocs/Console-Docs Product-Terminal The new Windows Terminal. labels Aug 10, 2020
@zadjii-msft zadjii-msft self-assigned this Aug 10, 2020
Copy link
Member

@carlos-zamora carlos-zamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to read through the Monarch/Servant Scenarios and on. That'll be a good read for me tomorrow morning with my coffee ☕

Comment on lines 258 to 266
We should probably have a discussion about what happens when the last elevated
content process in an elevated window process is closed. If an elevated window
process doesn't have any remaining elevated content processes, then it should be
able to revert to an unelevated window process. Should we do this? There's
likely some visual flickering that will happen as we re-create the new window
process, so maybe it would be unappealing to users. However, there's also the
negative side effect that come from elevated windows (the aforementioned lack of
drag/drop), so maybe users will want to return to the more permissive unelevated
window process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should switch to an unelevated window process after closing the last elevated content process. The visual flickering is kinda expected, in my mind, because you're changing a universal state of the app.

Copy link
Member

@carlos-zamora carlos-zamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Monarch/Servant Assignment

In web browsers, I think the "monarch" is the last window that was in use. Because when I open a hyperlink from teams or some non-web-browser, it opens it in a new tab in my last used window. Should we follow suit and make the monarch title move to the last touched window?

Cross Process Communication

This Process Model opens the possibility of cross process communication. I know tmux does a lot of this, but unfortunately I'm not too familiar with it. Would this enable any of those scenarios? If so, which ones and what would that look like.

Comment on lines +488 to +489
Considering all the changes proposed above, we can also support defterm with the
following mechanism:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create a diagram for this section? It would be a bit easier to follow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah shoot, this was the last scenario that I added to the spec, only because Michael had a working proof-of-concept. I didn't really want to get into the weeds on this one, but I guess here I go...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, idk if I've just been reading a bit too long or something, but a diagram would definitely help here. Excited to hear you explain this in the spec meeting though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh, since you mention that you want to leave this section intentionally blank, don't worry about the diagram anymore.

@zadjii-msft
Copy link
Member Author

In web browsers, I think the "monarch" is the last window that was in use. Because when I open a hyperlink from teams or some non-web-browser, it opens it in a new tab in my last used window. Should we follow suit and make the monarch title move to the last touched window?

So, maybe I need to make that more clear. The monarch is just the one with the responsibility of keeping track of everyone else, and answering questions. I don't think we need to shuffle that responsibility each time a window is focused. The monarch can just track "window 3 was the most recently focused". The monarch will also be responsible for tracking the "the user wants new tabs to open in the most recent window" state. So when a new wt starts, it can use that information to either open the new tab in an existing window, or let the new wt spawn its own window.


I'll add these comments to the spec to add more examples to help understand.

Co-authored-by: Carlos Zamora <carlos.zamora@microsoft.com>
Comment on lines 188 to 195
We're going to be using that system a little differently here. Instead of using
a GUID to represent a single _Class_, we're going to use the GUID to uniquely
identify _content processes_. Each content process will receive a unique GUID
when it is created, and it will register as the server for that GUID. Then, any
window process will be able to connect to that specific content process strictly
by GUID. Because each GUID is unique to each content process, any time any
client calls `create_instance<>(theGuid, ...)`, it will uniquely attempt to
connect to the content process hosting `theGuid`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how to feel about this. I can't name another example of something generating COM class IDs on the fly to represent content processes. Is there a source in browser land or some other multi-process content/frame management code that does this already?

If we're the first ones to do it, it makes me think that we're "holding it wrong" and there's an existing provision somewhere that is better for this than camping the COM catalog.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using COM for this is probably unprecedented, which is why I'm gonna summon some more senior architects to take a look at this and tell me if I'm being batty.

Just from examining commandlines associated with edgium and firefox, the idea of passing a unique ID on the commandline doesn't seem totally farfetched:

image
image

I'd presume they're just using a pipe or something for the communication, and I wanted something that might be more robust. To quote someone wiser than me:

My initial prototype here used message-passing type pipes with a custom rolled protocol. If I make my own protocol, it needs to be fuzzed. And I'm probably missing something. Many/most of these concerns for security are probably eliminated if we use a well-known mechanism for this sort of IPC. My thoughts go to a COM server. More complicated to implement than message pipes, but probably brings a lot of security benefits and eliminates the need to fuzz the protocol (probably).

@miniksa miniksa mentioned this pull request Aug 26, 2020
4 tasks
WT window are elevated.

Furthermore, we'll want to ensure that there's nothing that an unelevated client
process could do to trigger any sort of input callback in the parent window. If

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, we'll want to ensure that there's nothing that an unelevated client
process could do to trigger any sort of input callback in the parent window.

How can this be realistically enforced? (Beyond just fixing holes when someone steps on them)
How do you protect against extensions introducing these vectors?
How do you remind every developer 10 years forward to keep these rules?
The point behind elevation is to have this hard separation.
The mixed elevation idea is just saying "We'll implement a backdoor, but don't worry we'll keep it secure".
This will end up with another global opt-out toggle in GPO.
Just don't build a backdoor, it's perfectly reasonable to have a fully separated stack for everything elevated.

@zadjii-msft
Copy link
Member Author

showerthought (to make sure I don't forget): extensions should be disabled by default in elevated windows. The user can chose to enable them in elevated windows if they want. That setting needs to be hidden in a file that only admins can write to.

I suppose some extensions might be fine to run in unelevated content processes, but would probably be impossible to entirely prevent them from triggering code in the parent process.

Like an extension running in the content process could try to trigger the "enable broadcast input" mode and then have its keystrokes sent to the elevated content procs Actually, maybe not. The content process doesn't actually handle any keybindings, does it? That's all in the window process, so that's not terribly a concern.

@@ -668,19 +777,153 @@ spawning of new terminal instances, due to requiring cross-process hops for the
instantiation of the content process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, no.... I actually DO expect a relatively dramatic change here. We now have additional memory overhead of standing up all the extra processes, threads, heaps.... we have additional computational and memory overhead of all of the marshaling. This will likely consume more battery/energy than it did before. You also lose all linker optimizations/folding across some of these boundaries because they're RPCs...

This will likely hurt to some degree across the board. We're just going to have to be confident that....1. it's worth it to enable this behavior 2. we'll be able to grind out some of the performance issues to make them better over time

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough. Let's do it and refine from here.

Comment on lines +217 to +220
Instead, the window will always need to be the one responsible for calling
`DuplicateHandle`. Fortunately, the `DuplicateHandle` function does allow a
caller to duplicate from another process into your own process.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idle thoughts -- probably covered in live spec review. I wonder if we should have a MIDL2 interface that lets us use [niksa's recent midl work] somewhere? once we get all the pieces in place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I think we're going to end up with a split COM/WinRT interface for this. COM to do the niksa thing, WinRT to not want to blow my brains out doing everything else

Copy link
Member

@DHowett DHowett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad some consideration has been paid to keeping TermControl and TermControlCore in-proc.

I don't love that there's TermControl, TermControlCore, and TermCore.

This is the best-authored spec I have ever read. Thank you. Sorry it took half a year to review!

@zadjii-msft zadjii-msft added the AutoMerge Marked for automatic merge by the bot when requirements are met label Feb 5, 2021
@ghost
Copy link

ghost commented Feb 5, 2021

Hello @zadjii-msft!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@github-actions
Copy link

github-actions bot commented Feb 5, 2021

Misspellings found, please review:

  • QOL
To accept these changes, run the following commands from this repository on this branch
pushd $(git rev-parse --show-toplevel)
perl -e '
my @expect_files=qw('".github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt"');
@ARGV=@expect_files;
my @stale=qw('"aspnet boostorg dahall fde fea fmtlib isocpp mintty NVDA pinam unte vcrt xamarin "');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
  if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
  next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect/9cb8db8e9a03eb0d46bf3206de5067e60c8fc1b1.txt";
use File::Path qw(make_path);
make_path ".github/actions/spelling/expect";
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"QOL VCRT Xamarin "');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a) cmp lc($b)} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;'
popd
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. You can copy the contents of each perl command excluding the outer ' marks and dropping any '"/"' quotation mark pairs into a file and then run perl file.pl from the root of the repository to run the code. Alternatively, you can manually insert the items...

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/dictionary/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/dictionary/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

🗜️ If you see a bunch of garbage and it relates to a binary-ish string, please add a file path to the .github/actions/spelling/excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

@zadjii-msft zadjii-msft merged commit 4cce933 into main Feb 5, 2021
@zadjii-msft zadjii-msft deleted the dev/migrie/s/5000 branch February 5, 2021 12:19
zadjii-msft added a commit that referenced this pull request Feb 5, 2021
### ⇒ [doc link](https://github.com/microsoft/terminal/blob/dev/migrie/s/4472-window-management/doc/specs/%235000%20-%20Process%20Model%202.0/%234472%20-%20Windows%20Terminal%20Session%20Management.md) ⇐

## Summary of the Pull Request

This is a more detailed spec for two parts of the "Process Model 2.0" work that's being tracked in #5000. In particular, this spec focuses on the management of Windows Terminal windows, including opening new tabs in existing windows. 

Largely, the reader is expected to have already read the spec in progress in #7240, and already be familiar with the concept of "Monarch" and "Peasant" windows as introduced by that spec. For that reason, ⚠ **THIS PR IS TARGETING THE BRANCH FOR #7240** ⚠. 

### Abstract

> This document is intended to serve as an addition to the [Process Model 2.0
> Spec]. That document provides a big-picture overview of changes to the entirety
> of the Windows Terminal process architecture, including both the split of
> window/content processes, as well as the introduction of monarch/peasant
> processes. The focus of that document was to identify solutions to a set of
> scenarios that were closely intertwined, and establish these solutions would
> work together, without preventing any one scenario from working. What that
> document did not do was prescribe specific solutions to the given scenarios.
>
> This document offers a deeper dive on a subset of the issues in [#5000], to
> describe specifics for managing multiple windows with the Windows Terminal. This
> includes features such as:
>
> * Run `wt` in the current window ([#4472])
> * Single Instance Mode ([#2227])


## PR Checklist
* [x] Specs: #4472, Specs #2227
* [x] References: #5000, #4472, #2227, #7240
* [x] I work here

## Detailed Description of the Pull Request / Additional comments
_\*<sup>\*</sup><sub>\*</sub> read the spec  <sub>\*</sub><sup>\*</sup>\*_

### Why are these two separate documents?

I felt that the spec that is currently in review in #7240 and this doc should remain separate, yet closely related documents. #7240 is more about showing how this large set of problems discussed in #5000 can all be solved technically, and how those solutions can be used together. It establishes that none of the proposed solutions for components of #5000 will preclude the possibility of other components being solved. What it does _not_ do however is drill too deeply on the user experience that will be built on top of those architectural changes. 

This doc on the other hand focuses more closely on a pair of scenarios, and establishes how those scenarios will work technically, and how they'll be exposed to the user. 

### TODO:

* [x] A thought - How will we handle arguments like `--fullscreen`, `--initialSize r,c`? They only apply when creating a new window, right?
* [x] When a `wt -s 1 split-pane` command is executed, we'll need to make sure to not _also_ create a new tab
@zadjii-msft zadjii-msft mentioned this pull request Feb 24, 2021
3 tasks
@fowl2
Copy link

fowl2 commented Apr 19, 2021

<3 <3 <3 all this stuff being out in the open! Terminal Team, you are doing great!

Hopefully I didn't just misread the spec, but is there is a explanation for the 1:1 relationship between windows and window processes? Browsers have one process hosting all their windows for example.

Multiple content processes brings reliability and other advantages, which is great, but seems pretty orthogonal to having multiple window processes? This business of attaching and re-attaching TermControls (live terminals) between window processes sounds riskier than "just" sending startup requests to me.

(PS. the doclink at the top is broken now that the branch has been deleted, in main it is here)

@zadjii-msft
Copy link
Member Author

is there is a explanation for the 1:1 relationship between windows and window processes

Oh, no I probably skipped that. I guess I just left that be an assumption made by the reader. I tried having multiple windows in the same process during the experimentation phase of this spec, but BOY was XAML unhappy about that. It's probably possible, but it seems more fragile to have all the windows in the same top-level process. Plus, the XAML objects still can't be shared across threads, so even if we did the hard work of having all the windows in the same top-level process, we still wouldn't be able to just toss the TermControls across the thread boundary at another window in the process.

ghost pushed a commit that referenced this pull request Apr 21, 2021
### ⇒ [doc link](https://github.com/microsoft/terminal/blob/dev/migrie/s/653-quake-mode/doc/specs/%23653%20-%20Quake%20Mode/%23653%20-%20Quake%20Mode.md) ⇐

## Summary of the Pull Request

After reading through 114+ comments in #653 and related issues, I think I've finally wrapped my head around all the possible scenarios for quake mode. <!-- Speak now or forever hold your peace. --> This also includes "minimize to tray", because the two are a powerful combination. With the work already prototyped in [`dev/migrie/f/653-QUAKE-MODE`](https://github.com/microsoft/terminal/tree/dev/migrie/f/653-QUAKE-MODE), [I'm starting to believe](https://j.gifs.com/58vKNx.gif) that we could actually land this in 2.0.


### Abstract

> Many existing terminals support a feature whereby a user can press a keybinding
> anywhere in the OS, and summon their terminal application. Oftentimes the act of
> summoning this window is accompanied by a "dropdown" animation, where the window
> slides in to view from the top of the screen. This global summon action is often
> referred to as "quake mode", a reference to the videogame Quake who's console
> slid in from the top.
> 
> This spec addresses both of the following two issues:
> * "Quake Mode" ([#653])
> * "Minimize to tray" ([#5727])


## PR Checklist
* [x] Specs: #653, #5727
* [x] References: #5000, #4472, #2227, #7240, #8135
* [x] I work here

## Detailed Description of the Pull Request / Additional comments
_\*<sup>\*</sup><sub>\*</sub> read the spec  <sub>\*</sub><sup>\*</sup>\*_
@SuJiKiNen SuJiKiNen mentioned this pull request Apr 23, 2021
zadjii-msft added a commit that referenced this pull request Aug 25, 2021
### ⇒ [doc link](https://github.com/microsoft/terminal/blob/dev/migrie/s/1032-elevation-qol/doc/specs/%235000%20-%20Process%20Model%202.0/%231032%20-%20Elevation%20Quality%20of%20Life%20Improvements.md) ⇐


## Summary of the Pull Request

Despite my best efforts to mix elevation levels in a single Terminal window, it seems that there's no way to do that safely. With the dream of mixed elevation dead, this spec outlines a number of quality-of-life improvements we can make to the Terminal today. These should make using the terminal in elevated scenarios better, since we can't have M/E.

### Abstract

> For a long time, we've been researching adding support to the Windows Terminal
> for running both unelevated and elevated (admin) tabs side-by-side, in the same
> window. However, after much research, we've determined that there isn't a safe
> way to do this without opening the Terminal up as a potential
> escalation-of-privilege vector.
> 
> Instead, we'll be adding a number of features to the Terminal to improve the
> user experience of working in elevated scenarios. These improvements include:
> 
> * A visible indicator that the Terminal window is elevated ([#1939])
> * Configuring the Terminal to always run elevated ([#632])
> * Configuring a specific profile to always open elevated ([#632])
> * Allowing new tabs, panes to be opened elevated directly from an unelevated
>   window
> * Dynamic profile appearance that changes depending on if the Terminal is
>   elevated or not. ([#1939], [#8311])


## PR Checklist
* [x] Specs: #1032, #632
* [x] References: #5000, #4472, #2227, #7240, #8135, #8311
* [x] I work here

## Detailed Description of the Pull Request / Additional comments
_\*<sup>\*</sup><sub>\*</sub> read the spec  <sub>\*</sub><sup>\*</sup>\*_

### Why are these two separate documents?

I felt that the spec that is currently in review in #7240 and this doc should remain separate, yet closely related documents. #7240 is more about showing how this large set of problems discussed in #5000 can all be solved technically, and how those solutions can be used together. It establishes that none of the proposed solutions for components of #5000 will preclude the possibility of other components being solved. What it does _not_ do however is drill too deeply on the user experience that will be built on top of those architectural changes. 

This doc on the other hand focuses more closely on a pair of scenarios, and establishes how those scenarios will work technically, and how they'll be exposed to the user.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AutoMerge Marked for automatic merge by the bot when requirements are met Issue-Docs It's a documentation issue that really should be on MicrosoftDocs/Console-Docs Product-Terminal The new Windows Terminal.
Projects
None yet
Development

Successfully merging this pull request may close these issues.