Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Sort Order Lexicographic Options setting for Explorer #97272

Merged
merged 13 commits into from
May 6, 2021

Conversation

leilapearson
Copy link
Contributor

Per @isidorn I'm opening this PR so we can discuss the approach to resolving requests to group file and folder names by case. A number of people have requested this functionality and these requests were consolidated into issue #27759.

My suggested approach is to add a new setting - which I've tentatively called Sort Order Option so that it will show up immediately beneath the existing Sort Order setting.

image

This PR demonstrates the UI and proposed wording for the setting. The new setting isn't hooked up to anything yet and just serves to show the descriptions for the new options and the overall approach.

The options themselves are trivial to implement once we agree on which options to offer and not offer, and on the best way to offer these options in the settings UI.

I'll update the PR with the full functionality once the approach is agreed.

@isidorn
Copy link
Contributor

isidorn commented May 11, 2020

@leilapearson thanks for creating this PR for discussion.
Overall I think it makes sense. It is impossible to put both sortOrder and sortOrderOptions under one setting since ther are too many combinations.

The descriptoin of the sortOrderOptions setting could be improved, currently it reads to me just like a synonum of the sortOrder setting.
Also do you think that all those sortOrderOption values are really needed? Since we can easily later add some if users request them.

@bpasero would like to hear what you think as well.

@isidorn isidorn added file-explorer Explorer widget issues under-discussion Issue is under discussion for relevance, priority, approach labels May 11, 2020
@bpasero
Copy link
Member

bpasero commented May 11, 2020

I am having a hard time to understand the intent based on looking at the code and seeing how the setting is presented in the settings editor. Maybe we can first explain the intent and then work on the naming? Sorry if I was not reading through the linked issues but I think I am playing the role as a user that is seeing this for the first time and I would not have a clue what the new setting does. The one looks like it replaces the other, but not sure?

@leilapearson
Copy link
Contributor Author

Thanks @isidorn and @bpasero.

I wholeheartedly agree it's important for users to be able to understand the settings. I'm sure there's some way we can name and describe things to be more clear.

I actually think I improved the wording somewhat already compared to the first PR, but obviously not enough...

The key words in the descriptions that I was hoping would help in this case were files and folders versus names...

Here's an explanation of how the settings are currently proposed to work:

Sort Order specifies the first level of comparison. It specifies how items should be grouped and sorted based on properties other than their names. In particular, it pays attention to the following properties:

  • Is the item a file or a folder? (the mixed option doesn't care - but the other options do care)
  • When was the item modified? (only the modified option cares)
  • What kind of file is this? (only the type option cares)

Sort Order Option is the second level of comparison. Sort Order Option is all about the names:

  • Do I want to use locale order or unicode order when comparing names?
  • If I'm using locale order, do I want to treat numbers numerically or alphabetically?
  • If I'm using locale order, do I want to group names based on their case? If so, uppercase first or lowercase first?

So Sort Order says, first let's compare items using some non-name properties. Then Sort Order Option says, now that we've compared based on those properties - for items that are equal using that first level of comparison, let's go on and compare based on the item names.

So... that's what the settings are meant for as they currently stand.

The actual algorithm can be thought of like this:

  1. compare by files first, by folders first, or skip this step
  2. compare by extension (case insensitive extension) or skip this step
  3. compare by modified date or skip this step
  4. compare file names by uppercase first, by lowercase first, or skip this step
  5. compare file names using one of the below options:
    a. name by locale with numeric on, then extension by locale with numeric on. (in both cases, disambiguate numerically equivalent strings by length).
    b. name by locale with numeric off, then extension by locale with numeric off.
    c. full name (name + extension) by unicode order

In all cases, the algorithm returns as soon as it finds a difference.

Sort Order provides limited control over steps 1 through 3.

By limited control over steps 1 through 3, I mean you can't - for example - choose to sort filesFirst and then also sort by extension and then also sort files with the same extension by modified date - though that would be a perfectly valid thing to do logically speaking. It just isn't an available option.

Sort Order Option provides limited control over steps 4 and 5.

By limited here I mean you can't combine unicode order with upper first or lower first. These would be logically valid things to offer, but it seem unlikely that users would want them. Unicode already groups upper first just because that is how characters happen to be ordered in unicode, and it would seem kind of weird to call something unicode order that wasn't entirely unicode order.

Without Sort Order Option available (current master), Sort Order skips step 4 and is hardcoded to use option 5a.

Not sure if any of this explanation helps for thinking about how to present things in the settings UI - but I thought I'd write it down. If nothing else, it clarifies things a bit more for me.

Given all of this, any ideas on how to rework or improve on the proposed settings UI?

I'm hoping its just a matter of better naming and description wording - but I'm open to anything.

I'll see what I can come up with too...

@bpasero
Copy link
Member

bpasero commented May 12, 2020

Some ideas for the setting name:

  • explorer.nameSortOrder
  • explorer.fileNameSortOrder
  • explorer.lexiographicSortOrder
  • explorer.nameSortStrategy
  • explorer.fileNameSortStrategy
  • explorer.lexiographicSortStrategy

@isidorn maybe consult with Andre too who is usually very good in wording 👍

@isidorn
Copy link
Contributor

isidorn commented May 12, 2020

My vote goes to explorer.lexicographicSortOrder

@weinand for potential setting name ideas.
tl;dr: we are introducing a new explorer sort order setting that is ortogonal and is a fine tuning of the existing explorer.sortOrder

@weinand
Copy link
Contributor

weinand commented May 12, 2020

For me as a user lexicographicSortOrder does not sound like a setting, but more like one specific value of a setting, e.g. "sortOrder": "lexicographic".

But from reading through the issue, that seems not to be the intent.

My take:
There is a setting sortOrder which controls the first level of sorting based on file properties (file/folder, modification timestamp, file kind). All possible options mention an (implicit) second level of sorting based on the name of the file: "alphabetical" order.

And the new setting controls exactly this second level sorting.

My recommendation:

  • use consistent naming across the two settings: either "alphabetical" or "lexicographic" (see https://en.wikipedia.org/wiki/Lexicographical_order: alphabetical is a synonym of lexicographic). I prefer "lexicographic".
  • since the second level sorting does not offer fundamentally different sorting options but just variants of the same "lexicographic" sorting, I suggest to use explorer.lexicographicOptions.
  • if explorer.lexicographicOptions has the problem that the term "sorting" or "sort" is missing I see two solutions:
    • explorer.sortOrder.lexicographicOptions: makes clear that this setting is a sub-setting (and sorts nicely ;-)
    • explorer.lexicographicSortOptions: closer to original proposal.

@isidorn
Copy link
Contributor

isidorn commented May 12, 2020

@weinand thanks a lot for great feedback!
explorer.sortOrder.lexicographicOptions works best for me. Shows the relationship between the two and has a clear name.
I know we do not have any other "subections" in explorer but I would still go for this one.

@leilapearson
Copy link
Contributor Author

I really like the subsection idea too!

I'm wondering if it should possibly be explorer.sortOrder.lexicographicOption singular instead of plural since you can only actually choose one option?

@isidorn
Copy link
Contributor

isidorn commented May 12, 2020

We always use plural in our settings, so for consistency I would go with options

Screenshot 2020-05-12 at 13 58 34

@leilapearson
Copy link
Contributor Author

Okay sure. Just thought I'd check.

@leilapearson
Copy link
Contributor Author

With the name decided, I guess the next step is to decide specifically what lexicographic options should be offered. My original proposal was this:

'numeric', 'Uppercase and lowercase names are mixed together. Numbers are sorted numerically, not alphabetically.'
'upper', 'Uppercase names are grouped together before lowercase names. Numbers are sorted alphabetically.'
'lower', 'Lowercase names are grouped together before uppercase names. Numbers are sorted alphabetically.'
'mixed', 'Uppercase and lowercase names are mixed together. Numbers are sorted alphabetically.'
'unicode', 'Names are sorted in unicode order.'

Thinking about it now though, I'm wondering if it might be better to always sort numbers numerically instead of alphabetically, which would make the options a bit simpler:

'mixed', 'Uppercase and lowercase names are mixed together.'
'upper, 'Uppercase names are grouped together before lowercase names.'
'lower', 'Lowercase names are grouped together before uppercase names.'
'unicode', 'Names are sorted in unicode order.'

In this case unicode would be the only option where 10 would come before 2...

A dictionary will sort numbers numerically - not by digit comparisons.

The downside is, if there is some need to add a non-numeric lexicographic options in the future, then I think the naming of these options would be kind of difficult... For example mixedNonNumeric? mixedAlphabeticNumbers?

I'm not sure that there will be a future need for this though so maybe it isn't something to worry about now.

The primary uses of numbers in file names are probably for version numbers - where the numeric option is handy - and dates - where many people probably use leading zeros, but it can be handy if they don't.

Originally I was swayed towards non-numeric sorts by a comment in the issue that stated that people will just use leading zeros and are used to that. And then I was more swayed by the fact that a normal locale sort doesn't enable the numeric option and you have to specifically turn it on. But thinking about it some more I'm leaning the other way. People who are used to adding leading zeros are probably used to unicode sort - and that option will be available.

Does Windows sort numerically BTW? It looks like OSX sorts numerically inside file explorer but in unicode order in the terminal. Google drive sorts non-numerically.

@weinand
Copy link
Contributor

weinand commented May 12, 2020

@leilapearson BTW, you might want to use the term "natural sort order" when referring to sorting of strings with embedded numbers.

@leilapearson
Copy link
Contributor Author

@weinand - yes... I actually did take a quick look at python's natsort while I was thinking about this. I do like the name.

Of course python's natsort is much more sophisticated than what you get by simply turning on the numeric option in an Intl.collator. Intl.collator with numeric = true doesn't handle thousands separators, non-integer numbers, negative numbers, and the different representations for all of these. So even with numeric = true we don't really have a natural sort.

See: https://natsort.readthedocs.io/en/master/howitworks.html#here-be-dragons-adding-locale-support

@leilapearson
Copy link
Contributor Author

Looks like somebody did create a javascript natsort so I guess that would potentially be an option:

https://www.npmjs.com/package/natsort

Not sure about performance or the need to handle so many cases, so I'm not proposing to adopt it now - but it's something to keep in mind I guess.

@isidorn
Copy link
Contributor

isidorn commented May 12, 2020

I am all up for reducing the options to have 4 values. We can add more if users request it.

@leilapearson
Copy link
Contributor Author

Okay. Sounds good. I'll go ahead with those four options and update the pull request when it's ready.

@isidorn
Copy link
Contributor

isidorn commented May 13, 2020

Great, thanks

The new plan is for all locale-based comparisons to use the `numeric`
collator option. The compare options will only differ by case grouping.
Rename compareFileNamesNumeric to compareFileNamesMixed and
rename compareFileExtensionsNumeric to compareFileExtensionsMixed.
Add options for controlling the lexicographic order of file and folder
names in Explorer.

Adds options to group names uppercase first, group names
lowercase first, or sort using a unicode comparison instead
of a locale-based comparison.
@leilapearson
Copy link
Contributor Author

I've made the changes, but I ran into a couple of issues:

  1. It turned out that I couldn't name the new setting explorer.sortOrder.lexicographicOptions - or at least I couldn't figure out how to do that without changing the existing explorer.sortOrder to something like explorer.sortOrder.propertyOptions. I assume that would have upgrade implications, so instead I named the new setting explorer.sortOrderLexicographicOptions. Hope this is okay. If not, suggestions welcome.

  2. I needed to merge in the latest from master and now the build is failing so I wasn't able to test the merged code except with my unit tests. I tried downloading a fresh clone straight from microsoft/vscode too and I'm getting the same issue that I get in my fork.

The error is below:

[21:34:17] Starting compile-extension:html-language-features-server ...
[21:34:22] Error: /Users/leila/dev/wip/vscode/extensions/search-result/src/extension.ts(94,10): No overload matches this call.
  Overload 1 of 2, '(selector: DocumentSelector, provider: DocumentLinkProvider<DocumentLink>): Disposable', gave the following error.
    Type '(document: TextDocument, token: CancellationToken) => Promise<DocumentLink[]>' is not assignable to type '{ (document: TextDocument, token: CancellationToken): ProviderResult<DocumentLink[]>; (document: TextDocument, token: CancellationToken): ProviderResult<...>; }'.
      Type 'Promise<DocumentLink[]>' is not assignable to type 'ProviderResult<T[]>'.
        Type 'Promise<DocumentLink[]>' is not assignable to type 'Thenable<T[] | null | undefined>'.
          Types of property 'then' are incompatible.
            Type '<TResult1 = DocumentLink[], TResult2 = never>(onfulfilled?: ((value: DocumentLink[]) => TResult1 | PromiseLike<TResult1>) | null | undefined, onrejected?: ((reason: any) => TResult2 | PromiseLike<...>) | ... 1 more ... | undefined) => Promise<...>' is not assignable to type '{ <TResult>(onfulfilled?: ((value: T[] | null | undefined) => TResult | Thenable<TResult>) | undefined, onrejected?: ((reason: any) => TResult | Thenable<TResult>) | undefined): Thenable<...>; <TResult>(onfulfilled?: ((value: T[] | ... 1 more ... | undefined) => TResult | Thenable<...>) | undefined, onrejected?: ((r...'.
              Types of parameters 'onfulfilled' and 'onfulfilled' are incompatible.
                Type '((value: T[] | null | undefined) => any) | undefined' is not assignable to type '((value: DocumentLink[]) => any) | null | undefined'.
                  Type '(value: T[] | null | undefined) => any' is not assignable to type '(value: DocumentLink[]) => any'.
                    Types of parameters 'value' and 'value' are incompatible.
                      Type 'DocumentLink[]' is not assignable to type 'T[]'.
                        Type 'DocumentLink' is not assignable to type 'T'.
                          'T' could be instantiated with an arbitrary type which could be unrelated to 'DocumentLink'.
  Overload 2 of 2, '(selector: DocumentSelector, provider: DocumentLinkProvider<DocumentLink>): Disposable', gave the following error.
    Type '(document: TextDocument, token: CancellationToken) => Promise<DocumentLink[]>' is not assignable to type '{ (document: TextDocument, token: CancellationToken): ProviderResult<DocumentLink[]>; (document: TextDocument, token: CancellationToken): ProviderResult<...>; }'.
[21:34:22] Finished compilation with 1 errors after 12825 ms

@leilapearson
Copy link
Contributor Author

Forgot to mention that I did end-to-end test all my changes before the merge from master and everything looks good to me. I verified that the changes in the previous PR that fixed the edge cases were preserved. In this PR I renamed those compare functions and pulled their code out into helper functions for the new options to reuse. The new changes look to be working as intended too.

The results can be seen here:

https://docs.google.com/spreadsheets/d/1Kw1eES9o5fs3q3IXvthdwx12zqLOr37hLXVvc1Aa3rE/edit?usp=sharing

@isidorn
Copy link
Contributor

isidorn commented May 14, 2020

@leilapearson thanks a lot. I just review the changes in the Explorer, and overall it looks very good.

  1. Do not worry about that error, just get latest, git clean -fxd and do yarn && yarn watch again
  2. I see that we can not do sortOder. so it is fine to go with sortOrderLexicographicOptions
  3. I would prefer if the explorerService only had one public getter for sort things and that would return an object that has both sortOrder and the new lexOptions
  4. There are quite some new helper methods added in comparer.ts, I am not sure if all of them are needed since there might be alternative to use in other helper files.

So bottom line: changes in Explorer land look great.
When @bpasero finds time he can review the Comparer. If he does not have time I can review that some time next week.

@isidorn isidorn added this to the June 2020 milestone May 14, 2020
@leilapearson
Copy link
Contributor Author

Thanks @isidorn. Helpful as always!

  1. Successfully building again! Thanks for the help. I wasn't sure what was the correct way to perform a clean build. FYI I also had to kill and restart the incremental builder. I hadn't realized it kept running even when I closed vscode. It does mention this in the contributing wiki but it's kind of buried.
  2. Great, thanks.
  3. Sure, no problem. I can change it to a single object.
  4. Thanks. I'll stay tuned for the review(s).

@leilapearson
Copy link
Contributor Author

Regarding issue #107936 filename_test.go appears before filename.go in the explorer, see issue #99955 and PR #104528. These are both closed but they explain why the code now behaves this way. The code no longer strips off the .go extension before comparing the names.

@ghost
Copy link

ghost commented Dec 5, 2020

I heartily agree with @leidegre above statement. If I may say @isidorn, code hosting platforms like GitHub, GitLab and even BitBucket use uppercase first as the default preference for lexicographic. This cause Visual Studio Code to visually break repositories lexical order which is a no-no when it comes to a source code editor.

@tylermenezes
Copy link

potentially merge it in the future when we get more user feedback.

Count me as another user who wants this -- it's quite useful for me in typescript where Uppercase files are types and lowercase files are methods.

(I'm not sure why you're avoiding merging things that provide more options to users. If you're trying to keep the settings page from becoming too cluttered, that ship has sailed...)

@joaomoreno joaomoreno changed the base branch from master to main February 15, 2021 08:51
@jbatez
Copy link

jbatez commented Mar 10, 2021

Count me as another user who wants this. I've been using capitalization to sort certain files (CMakeLists.txt, README, LICENSE, etc) before source code files for decades. With VS Code's current options, these files always get sorted into the middle of all my other files and it's very jarring and inconsistent with the other tools I use.

@markvl
Copy link

markvl commented Apr 15, 2021

This would be a great feature to have. In the projects I work with, there are usually 'meta files' that are capitalized or start with a capital (CHANGELOG, LICENCE, README, Makefile). By having an option to sort files in a case sensitive fashion they are visually separated from the actual source code in the explorer, making it easier to navigate through a project.

Big thanks to @leilapearson for your effort to get this feature in!

@leilapearson
Copy link
Contributor Author

Just wanted to mention @isidorn that I'm still happy to work on this again if/when you are okay to consider it for merging. There definitely does seem to be some interest per the above comments.

There has also been some discussion in other tickets about the idea of enabling non-numeric sort for a use case I hadn't really considered which was filenames with hex numbers in them. Those sort very oddly with numeric sort enabled, so we may want to reconsider the idea of non-numeric options for locale-based searches too. On the other hand, the unicode sort option may be enough meet the needs of those users. Since I don't work or write in any languages other than english I'm not sure how important a locale-based non-numeric option would be.

@isidorn
Copy link
Contributor

isidorn commented Apr 16, 2021

@leilapearson thanks a lot for offering to work on this.
I see the interest and I think this would be a nice addition to VS Code.
However if you have time let's try to make this PR as concise as possible, I see that a lot of changes are being done in the comparer.ts, could we somehow slim this down? Also I see that there are conflicts, could you resolve those?

@leilapearson
Copy link
Contributor Author

Thanks @isidorn. I'll see what I can do to make it as clear and concise as possible. It's great to be able to contribute this feature!

@leilapearson
Copy link
Contributor Author

@isidorn I've merged the latest main into my fork and topic branch, resolved the conflicts, and tested the change. I haven't reviewed comparers.ts yet to see if I can make it any more concise. I'll take a look at that next.

@leilapearson
Copy link
Contributor Author

@isidorn I shaved a few lines off in comparers.ts.

@isidorn isidorn modified the milestones: On Deck, May 2021 May 6, 2021
@isidorn
Copy link
Contributor

isidorn commented May 6, 2021

Looks great. I did an initial review left minor comments and once you tackle them I can merge this.
Thanks a lot for writing tests 👏
Did you try this out? If yes on what platform? So I try it out on the others.

@leilapearson
Copy link
Contributor Author

Thanks for the review @isidorn. I made one small change per your comments but I'm not sure about the other two. Please see my comments and let me know what you think.

My testing was all done on macOS Catalina (10.15.7) - so if you can try this out on Windows and Linux that would be great!

@isidorn
Copy link
Contributor

isidorn commented May 6, 2021

Great, thanks for tackling my comments. Than my last tiny comment I will try it out and if it works nicely I will merge this in.

@leilapearson
Copy link
Contributor Author

Thanks for the re-review @isidorn. Should be good to go now I think!

@isidorn
Copy link
Contributor

isidorn commented May 6, 2021

Looks good to me, and works just fine on my machine.
I did not do an intensive testing. So I aks the community to try this out with VS Code insiders that will be out tomorrow and to give us feedback. Really appreciate it
@leilapearson thanks again for a great pr 👏

@isidorn isidorn merged commit f280f4c into microsoft:main May 6, 2021
@leilapearson
Copy link
Contributor Author

Thank-you for the reviews and great suggestions @isidorn. Glad to help.

@gusbemacbe
Copy link

@leilapearson, @isidorn and @bpasero,

I have tested it and it worked. Thank you for fixing it!

@markvl
Copy link

markvl commented May 11, 2021

FWIW I've tested this on a Ubuntu machine (code-insiders package version 1.57.0-1620624357) and this also works for me as expected. Thank you very much!

@github-actions github-actions bot locked and limited conversation to collaborators Jun 20, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
file-explorer Explorer widget issues under-discussion Issue is under discussion for relevance, priority, approach
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants