Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds emphasis to Select-String default formatter #8963

Merged
merged 24 commits into from
Oct 8, 2019

Conversation

derek-xia
Copy link
Contributor

@derek-xia derek-xia commented Feb 24, 2019

PR Summary

Adds the -Emphasize parameter for Select-String and makes it work with -AllMatches and -SimpleMatch.
The -Emphasize parameter highlights the matched text using the negative escape sequence. Issue #2905.
Started at HackIllinois.

PR Context

Makes it easier to distinguish what users are looking for when using the Select-String cmdlet. Helps resolve issue #2905.

PR Checklist

@msftclas
Copy link

msftclas commented Feb 24, 2019

CLA assistant check
All CLA requirements met.

@rkeithhill
Copy link
Collaborator

rkeithhill commented Feb 24, 2019

I would suggest that you check to see if the host supports virtual terminal sequences, and if so, use a VT sequence instead of asterisks. Using any character is a bit problematic because you may have that character in the text your searching. That could confuse the user.

FWIW, this is how grep "highlights" matched text:

image

The tricky part is figuring out what color (or effect e.g. reverse) to use for matched text. And whatever you pick, you probably need to make it configurable.

@TylerLeonhardt
Copy link
Member

Thanks @rkeithhill for the guidance 😊

This is a student at HackIllinois. @rjmholt and I are working with the students here and are reviewing PRs that they send.

All comments are welcome 😊

@rkeithhill
Copy link
Collaborator

FYI I find this VT sequence resource very handy: https://docs.microsoft.com/en-us/windows/console/console-virtual-terminal-sequences

Copy link
Collaborator

@vexx32 vexx32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that highlighting with some form of colour is a more worthwhile approach, as the matched string itself could easily contain asterisks or really any number of other characters that we might try to emphasize it with.

@derek-xia
Copy link
Contributor Author

I made the matched text red on terminals that support VT sequences and retained the asterisks on terminals that don't support VT sequences.

@TylerLeonhardt
Copy link
Member

@derek-xia can you add a screenshot so everyone can see?

@derek-xia
Copy link
Contributor Author

derek-xia commented Feb 24, 2019

screen shot 2019-02-24 at 2 00 20 am

This is a screenshot of -Emphasize with and without -AllMatch

@@ -1518,6 +1600,9 @@ private bool DoMatchWorker(string operandString, MatchInfo matchInfo, out MatchI
int patternIndex = 0;
matchResult = null;

var indexes = new List<int>();
var lengths = new List<int>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we need this if already have the same in matchResult.Matches?
Note that in both cases we have disordered list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm creating new lists in all cases, although I could use matchResult.Matches in most cases and allocate and use these lists only when the SimpleMatch parameter is enabled. If the SimpleMatch parameter is enabled, then the current behavior is that Matches does not contain any Match objects. Without any Match objects, I would need to pass in these lists to implement the colored text when SimpleMatch is enabled.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should avoid creating or populating lists in the (possibly common) case where Emphasize is not specified or where VT100 isn't available. We don't want a performance hit in cases where there's no benefit.

That might look something like:

List<int> indexes;
List<int> lengths;

if (Emphasize)
{
    indexes = new List<int>();
    lengths = new List<int>();
}

...

if (Emphasize)
{
    foreach (Match match in matches)
    {
        indexes.Add(match.Index);
        lengths.Add(match.Length);
    }
}

// etc.

...
    matchResult = Emphasize
        ? new MatchInfo(indexes, lenghts)
        : new MatchInfo();

    matchResult.IgnoreCase = !CaseSensitive;
    Line = operandString;
    // etc.
...

@@ -71,6 +71,33 @@ Describe "Select-String" -Tags "CI" {
$testinputone | Select-String -Pattern "goodbye" -NotMatch | Should -Be "hello", "hello"
}

it "Should return an array of matching strings with the first match highlighted when Emphasize is used" {
if ($IsWindows)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't Windows support VT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah rather than doing $IsWindows you can do:

$Host.UI.SupportsVirtualTerminal

Which returns a bool.

Copy link
Collaborator

@rkeithhill rkeithhill Feb 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first version of Windows 10 (1507) did not support VT. Probably not many folks still using that version but $Host.UI.SupportsVirtualTerminal works just as well. And you may be running in a different host than conhost.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those hosts would just not display the color, right?

Would also be best to consider https://no-color.org/

@felixfbecker
Copy link
Contributor

Does it highlight capture group like in #2905?

@vexx32
Copy link
Collaborator

vexx32 commented Feb 24, 2019

@joeyaiello mentioned on Twitter that might be worth considering just making this behaviour the default. Perhaps that might be worth doing? If there are concerns with stability / how different conhosts would work with this we can always stick it behind an experimental feature flag. @TylerLeonhardt, what do you think?

@TylerLeonhardt
Copy link
Member

@vexx32, I think @felixfbecker brought up a very good point - this is more of a formatting concern.

Looking at the formatting code, it seems that formatting MatchInfo to be printed to the console is done here. Where it simply runs:

$_.ToString(((get-location).path))

where $_ is the instance of MatchInfo

All it's doing is calling the ToString. Perhaps what would be better is move @derek-xia's logic added to ToString to a new public method on MatchInfo called Emphasize and then the formatting line linked above could just call:

$_.Emphasize(((get-location).path))

Does that sound alright @vexx32 @felixfbecker?

@rkeithhill
Copy link
Collaborator

Or maybe call it ToAnsiString().

@iSazonov
Copy link
Collaborator

Considering that we would like to receive color input for any output type, this should not be in every single cmdlet, it should be out.

@powercode
Copy link
Collaborator

I don't think this change should not be made in Select-String.

You can check out this gist that adds coloring of the matchinfo output as a custom format. https://gist.github.com/powercode/4833804efd23045387bd5d5249d76f7b

There already exists an issue for improving the formatting of MatchInfo, but this is a bit of "*nix" thinking where flags on a command modify the output. Make it a view in the formatting system instead.

@rjmholt
Copy link
Collaborator

rjmholt commented Feb 25, 2019

I feel like we're all agreeing here. Just want to summarise:

  • This behaviour should be in the formatter
  • Most efficient implementation is probably as @rkeithhill suggests with a ToAnsiString() method on MatchInfo objects. Possibly ToEmphasizedString() if we want that to return something still useful on a platform that does not support ANSI/VT control codes (like emphasis characters around the match).
  • Given that it would be in the formatter, we could have it on by default (can changing formatting be a breaking change?)

My only question is: should there be a switch for this on the cmdlet?

@vexx32
Copy link
Collaborator

vexx32 commented Feb 25, 2019

If I recall correctly, changes to formatting are not considered "breaking."

Emphasis characters are kind of hard to argue for, really. There's no guarantee such characters won't already be somewhere in or around the matched string, unfortunately. Might still be worth doing, I suppose, but honestly I feel simply omitting the formatting where it's not supported is likely to be most effective.

I don't think there need be a switch for it. Maybe define one formatview for emphasized and one for non-emphasized?

@rjmholt rjmholt added the Hacktoberfest Potential candidate to participate in Hacktoberfest label Feb 25, 2019
@iSazonov
Copy link
Collaborator

I like the coloring but I am not a fan of turning PowerShell into an artist. We do not embed a code formatter, but simply publish AST API. The same with coloring. Perhaps it is not formatting system api but conhost api.
Turn on/off:

$PSColoring = $true # Globally
Get-Date | Format-List -Coloring # Locally

@rjmholt
Copy link
Collaborator

rjmholt commented Feb 25, 2019

@iSazonov I agree with you in principle, but I think this kind of match emphasis brings PowerShell up to speed with grep in stakes that PowerShell is usually better at: interactive experience.

If we can get this into the formatter, then we can remove or refactor it later with no obligation not to break things.

So making the suggested change in the immediate gets us what we want, without committing us to anything down the road (except a public API on MatchInfo that we shouldn't guarantee dependable behaviour on).

The kind of change you're talking about will require committee approval and would be a whole heap of work. It's wanted, but we can have something small in the meantime too.

@rjmholt
Copy link
Collaborator

rjmholt commented Feb 25, 2019

$PSColoring = $true # Globally
Get-Date | Format-List -Coloring # Locally

Maybe worth writing up an RFC for this?

@iSazonov
Copy link
Collaborator

iSazonov commented Feb 25, 2019

we can have something small in the meantime too.

We already have the "small" - it is Jason's PSMore. Implementing vNext formatting system based on classes simplify adding new feature such as coloring (new interface?). So I think right way is implementing PSMore project first. (Before Coloring RFC)

@felixfbecker
Copy link
Contributor

Advanced output formatting is and has always been part of PowerShell - if your shell is object based, and you also want to provide a strong interactive experience, you need a good formatting system, that supports tables, nested data, grouping, truncation, and coloring too. But all of this is separate from the data and API. It shouldn't be exposed in a parameter switch, nor in the returned strings.

I like the idea of respecting the user's choice regarding not wanting color output. However I feel like PowerShell has the opportunity here to handle that on a shell level? E.g. it could define a well-known global variable $PSEnableColors that if set to $false, will make PowerShell trim all color escape sequences from formatting output, so that not each cmdlet has to deal with it on their own.

@TylerLeonhardt
Copy link
Member

@iSazonov... Do you mean https://github.com/lzybkr/PSMore

Not sure what the status is on that project... (cc @lzybkr)

I want to also remind everyone here that some folks that submit PRs do not know everything about PowerShell and we should treat that as a learning experience to teach folks about PowerShell-specific concepts.

@powercode
Copy link
Collaborator

I think the jury is still out on how do handle colors. I have started on an implementation for better coloring of select-string, but deferred it until we know what to do with colors. That opens up a can of worms, with theming etc. (which may be very important for the color-blind).

@derek-xia derek-xia force-pushed the Select-String-Color branch from 9e6b1e1 to f9145bd Compare August 27, 2019 19:33
Copy link
Collaborator

@rjmholt rjmholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SteveL-MSFT
Copy link
Member

Looks like the last issue is that the test needs to be fixed

@TylerLeonhardt
Copy link
Member

there was a missing brace so I quickly added it :) lets see what CI says but this should be good to go.

@@ -127,6 +127,7 @@ Describe "Select-String" -Tags "CI" {

It "Should return ParameterBindingException when -Raw and -Quiet are used together" {
{ $testinputone | Select-String -Pattern "hello" -Raw -Quiet -ErrorAction Stop } | Should -Throw -ExceptionType ([System.Management.Automation.ParameterBindingException])
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brace seems out of place? I think this whole file should be reformatted so the intention is correct making it easier to understand the braces.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe that would be better done in a quick follow-up PR so we can keep track of the diffs and so accidental changes are made?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reformat can be a single commit. Then the fix of the brace a separate commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That test isn't touched in this PR. My last commit just undid an accidental deletion of a brace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TylerLeonhardt perhaps you can just add a commit after reformatting the tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with the formatting exactly? I don't see anything wrong with the changes in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not this PR specifically, but if you view the whole file, the indentation is all over the place. Makes it hard to review the code. I still can't tell how that missing brace fixed the code since the indentation isn't consistent.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation issue is because about half the file is indented with tabs instead of spaces. Github renders a tab as 8 spaces, whereas most editors commonly only render it with 4 spaces (though this is configurable).

I think VS Code has a convert tabs to spaces option in its command palette to get rid of the tabs, and then you can do a quick skim over the file to fix any remaining issues.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see. For this repo, we've standardized on spaces.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@ghost ghost added the Waiting on Author The PR was reviewed and requires changes or comments from the author before being accept label Sep 28, 2019
@TylerLeonhardt
Copy link
Member

@SteveL-MSFT please rereview

Copy link
Member

@SteveL-MSFT SteveL-MSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TylerLeonhardt I realize my feedback is in the existing code and not part of this specific PR. So if you want to fix it as part of this PR, that's fine, or if you want to submit a new PR that's fine too.

/// Surrounds the matched text with virtual terminal sequences to invert it's color. Used in ToEmphasizedString.
/// </summary>
/// <returns>The matched line with matched text inverted.</returns>
private string EmphasizeLine()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good candidate to be static.

Copy link
Member

@TylerLeonhardt TylerLeonhardt Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it uses _matchIndexes? Or do you mean refactor it to be static that takes in _matchIndexes and _matchLengths

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can pass _matchIndexes by parameter.

Copy link
Collaborator

@rjmholt rjmholt Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Just like a whole bunch of other PowerShell code that is similarly caught up by mutated object state implicitly passed in to methods.

But I believe that pre-existed here and this contribution shouldn't be held back on the basis of code debt incurred before PowerShell was open-sourced. This PR has been open long enough and has been a good, responsive, constrained contribution and I think we really need to just accept or reject it, rather than turning it into a Ship of Theseus.

Copy link
Collaborator

@vexx32 vexx32 Oct 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this one's gone on long enough and is a definite nice-to-have in PSv7 😁

We can always make other smaller changes later. 🙂

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR is delayed because it is in new area for PowerShell. However, the code should be clean.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this isn't a public API, we can always refactor internal code. I would suggest a separate PR if it's a concern or @iSazonov you can always add a commit to this PR if you feel strongly about this one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I can :-) Do you ok that we have over 70 opened PR? :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iSazonov I'll ask the team to start driving down the number of open PRs

@SteveL-MSFT SteveL-MSFT removed the Waiting on Author The PR was reviewed and requires changes or comments from the author before being accept label Oct 1, 2019
@adityapatwardhan adityapatwardhan merged commit 4a4dc4c into PowerShell:master Oct 8, 2019
@adityapatwardhan
Copy link
Member

@derek-xia Thank you for your contribution!

@adityapatwardhan adityapatwardhan added this to the 7.0.0-preview.5 milestone Oct 8, 2019
@TylerLeonhardt
Copy link
Member

AT LAST! Well done @derek-xia 🎉

@ghost
Copy link

ghost commented Oct 23, 2019

🎉v7.0.0-preview.5 has been released which incorporates this pull request.:tada:

Handy links:

kilasuit pushed a commit to kilasuit/PowerShell that referenced this pull request Oct 31, 2019
kilasuit pushed a commit to kilasuit/PowerShell that referenced this pull request Nov 9, 2019
@TylerLeonhardt
Copy link
Member

TylerLeonhardt commented Nov 28, 2019

I'm just gonna leave this here...

https://twitter.com/fatherjack/status/1199807445037723648?s=19

😁

Twitter
“Can everyone upgrade to #PowerShell 7 right away please? I have seen the future and it is select-string with match highlighting.”

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CL-General Indicates that a PR should be marked as a general cmdlet change in the Change Log Committee-Reviewed PS-Committee has reviewed this and made a decision Hacktoberfest Potential candidate to participate in Hacktoberfest
Projects
None yet
Development

Successfully merging this pull request may close these issues.