Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Doc comments: Handle CDATA correctly and avoid normalizing code elements #53140

Merged
merged 11 commits into from
Jun 24, 2021

Conversation

Youssef1313
Copy link
Member

@Youssef1313 Youssef1313 commented May 4, 2021

Fixes #53782
Fixes #53135
Fixes #37503
Fixes #22431

Before:

image


After:

image

@Youssef1313 Youssef1313 marked this pull request as ready for review May 4, 2021 12:40
@Youssef1313 Youssef1313 requested a review from a team as a code owner May 4, 2021 12:40
Comment on lines 106 to 110
Builder.Add(new TaggedText(TextTags.Text, s, Style, NavigationTarget.target, NavigationTarget.hint));
Builder.Add(new TaggedText(TextTags.Text, NormalizeLineEndings(s), Style, NavigationTarget.target, NavigationTarget.hint));

_anyNonWhitespaceSinceLastPara = true;

static string NormalizeLineEndings(string input) => Regex.Replace(input, "(?<!\r)\n", "\r\n");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd certainly prefer a better solution here. The issue here is that XText.Value produces a string with Unix-style line endings (\n), causing the end result to have mixed line-endings 😕

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this warrants a comment explaning what the regex means and waht this is here for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I normally use input.Replace("\r\n", "\n").Replace("\n", Environment.NewLine), but XML is required to normalize line endings to \n so I'm guessing you can just use input.Replace("\n", "\r\n") and be done.

Copy link
Member Author

@Youssef1313 Youssef1313 May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharwell I'm a little bit worried if AppendString can be called with a string already containing \r\n, then doing Replace("\n", "\r\n") will duplicate \r.

Is it's guaranteed that XText.Value will always be normalized to \n?

https://www.w3.org/TR/xml/#sec-line-ends seems to claim that. I'll convert to Replace("\n", "\r\n"). Thanks @sharwell.

Comment on lines -172 to +176
symbol.GetDocumentationParts(_semanticModel, _position, formatter, CancellationToken).ToImmutableArray());
symbol.GetDocumentationParts(_semanticModel, _position, formatter, CancellationToken));

_documentationMap.Add(
SymbolDescriptionGroups.RemarksDocumentation,
symbol.GetRemarksDocumentationParts(_semanticModel, _position, formatter, CancellationToken).ToImmutableArray());
symbol.GetRemarksDocumentationParts(_semanticModel, _position, formatter, CancellationToken));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only avoids creating a new array since GetDocumentationParts already returns an immutable array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToImmutableArray() won't clone the array if the input is already an immutable array, but this does avoid boxing the IEnumerable<T> for the call and is cleaner so 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToImmutableArray() won't clone the array if the input is already an immutable array,

@sharwell Yeah that seem the case:

https://github.com/dotnet/runtime/blob/59d9d0d16858635a8c228be2ab7ad8882b0d1c71/src/libraries/System.Collections.Immutable/src/System/Collections/Immutable/ImmutableArray.cs#L372-L380

So the docs of CA2009 are inaccurate/incorrect?

https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca2009

Performance issue: Unnecessary creation of a duplicate immutable collection. The original collection was already immutable and can be used directly.

@sharwell
Copy link
Member

sharwell commented May 4, 2021

Ohhhh snap 🎉

// This is because XText.Value returns a string contaning `\n` as the line endings, causing
// the end result to have mixed line-endings. So normalize everything to `\r\n`.
static string NormalizeLineEndings(string input) => Regex.Replace(input, "(?<!\r)\n", "\r\n");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change


[WorkItem(37503, "https://github.com/dotnet/roslyn/issues/37503")]
[Fact, Trait(Traits.Feature, Traits.Features.QuickInfo)]
public async Task DoNotNormalizeWhitespaceForCode()
Copy link
Member

@sharwell sharwell May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❗ Can you add both of these test cases to a new test class IntellisenseQuickInfoBuilderTests_Code.cs, which is modeled after IntellisenseQuickInfoBuilderTests_Styles.

@@ -50,7 +50,7 @@ public void ExampleAndCodeTags()
results in <c>p</c>'s having the value (2,8).
</example>";

var expected = "This method changes the point's location by the given x- and y-offsets. For example:\r\n\r\nPoint p = new Point(3,5); p.Translate(-1,3);\r\n\r\nresults in p's having the value (2,8).";
var expected = "This method changes the point's location by the given x- and y-offsets. For example:\r\n\r\n\r\n Point p = new Point(3,5);\r\n p.Translate(-1,3);\r\n \r\n\r\nresults in p's having the value (2,8).";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 This is one of the cases that made me uncomfortable, but I'll reserve judgement until I see the new builder tests.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharwell I'm expecting you'll still be uncomfortable after seeing the new tests 😄
Does it make sense to trim line-endings from code elements?

Copy link
Member

@sharwell sharwell May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to look at screenshots before requesting changes to the actual normalization approach. Maybe there's no problem.

New ClassifiedTextRun(ClassificationTypeNames.WhiteSpace, " "),
New ClassifiedTextRun(ClassificationTypeNames.Text, "Not this", ClassifiedTextRunStyle.UseClassificationFont))),
New ClassifiedTextElement(
New ClassifiedTextRun(ClassificationTypeNames.Text, "\r\nline 1\r\nline 2\r\n", ClassifiedTextRunStyle.UseClassificationFont)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 I would expect this to be rendered like <br/>

Example input:

/// Documentation line 1.&lt;br/&gt;
/// Documentation line 2.

Example output:

New ContainerElement(
ContainerElementStyle.Stacked,
New ClassifiedTextElement(
New ClassifiedTextRun(ClassificationTypeNames.Text, "Documentation line 1.")),
New ClassifiedTextElement(
New ClassifiedTextRun(ClassificationTypeNames.Text, "Documentation line 2.")))),

Copy link
Member

@sharwell sharwell May 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have to run the code to see if it looks the same with the current form (unless you have a screenshot handy).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add screenshots shortly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharwell Screenshot added.

Copy link
Member Author

@Youssef1313 Youssef1313 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While playing around this, I found that indentation of code isn't preserved. This is due to:

_comment.SummaryText = TrimEachLine(reader.ReadInnerXml());

I'll open an issue to address this as a follow-up. Not sure if we should simply drop calling TrimEachLine or if it should understand the structure of XML and avoids trimming things inside <code> but trim other lines.

@jinujoseph jinujoseph added Feature Request Community The pull request was submitted by a contributor who is not a Microsoft employee. labels May 4, 2021
<Document>
class MyClass {
/// &lt;summary&gt;
/// Normalize this, but &lt;c&gt;Not this&lt;/c&gt;
Copy link
Member

@sharwell sharwell May 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 I thought <c> would be normalized. We should check with the publishing tools to see if this is the case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sharwell This is still code, so I'm expecting it to preserve spaces. But no idea what tools like docfx do.

Copy link
Member

@sharwell sharwell May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main case where we definitely don't want to keep whitespace is this:

/// <summary>
/// Some <c>code
/// example</c> is here.
/// </summary>

The above should render exactly the same as this:

/// <summary>
/// Some <c>code example</c> is here.
/// </summary>

It may be worth adding tests to show these are the same.


Code = 0x8,
InlineCode = 1 << 4,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new TaggedTextStyle to differentiate <c> and <code>.

Currently Code value is 1000 and InlineCode is 10000. Should InlineCode be 1001 or whatever value to share the same most significant bit? (i.e, indicate that InlineCode is a subset of Code)?

This will allow the following check: (style & TaggedTextStyle.Code) == TaggedTextStyle.Code to pass whether style is Code or InlineCode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current approach seems fine to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, it seems like the new addition should be:

PreserveWhitespace = 1 << 4,

@@ -695,7 +695,7 @@ static string GetStyledText(TaggedText taggedText, bool isInCodeBlock)
TaggedTextStyle.Strong => $"**{text}**",
TaggedTextStyle.Emphasis => $"_{text}_",
TaggedTextStyle.Underline => $"<u>{text}</u>",
TaggedTextStyle.Code => $"`{text}`",
TaggedTextStyle.Code or TaggedTextStyle.InlineCode => $"`{text}`",
Copy link
Member Author

@Youssef1313 Youssef1313 May 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few lines above this, there is:

                if (!isInCodeBlock)
                    text = text.Replace(" ", "&nbsp;");

This doesn't feel correct. See the following test:

https://github.com/dibarbet/roslyn/blob/0ef88b0a686892546b57e721ba65738872bf928d/src/Features/LanguageServer/ProtocolUnitTests/Hover/HoverTests.cs#L195

@sharwell Do you agree here? Do you want me to fix LSP in this PR or make that a separate PR? (I prefer LSP work to be separate - but no objection doing it in this PR)

cc: @dibarbet

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate is fine.

Copy link
Member

@sharwell sharwell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change <code> to use TaggedTextStyle.Code | TaggedTextStyle.PreserveWhitespace so the whole enumeration continues to work as individual flags.

@Youssef1313
Copy link
Member Author

@sharwell Anything else needed here?

@Youssef1313 Youssef1313 requested a review from a team as a code owner May 18, 2021 20:33
@Youssef1313 Youssef1313 changed the base branch from main to release/dev16.11 May 18, 2021 20:33
@Youssef1313
Copy link
Member Author

@Youssef1313 We'd like to rebase this onto release/dev16.11. Are you able to do this or would you like us to go ahead and do it?

Done.

@sharwell
Copy link
Member

@jinujoseph for approval to merge

// XText.Value returns a string with `\n` as the line endings, causing
// the end result to have mixed line-endings. So normalize everything to `\r\n`.
// https://www.w3.org/TR/xml/#sec-line-ends
static string NormalizeLineEndings(string input) => input.Replace("\n", "\r\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this though? what is wrong with the doc comment having \n in it?

Copy link
Member

@sharwell sharwell Jun 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➡️ This function ensures that non-normalized XML content uses the same end-of-line character as s_newlinePart (defined above). It could be cleaned up in the future (e.g. if we wanted to use Environment.Newline instead), but the approach in this PR keeps the result consistent with current precedent.

{
// cast is safe since XCData inherits XText
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the above just be if (node is XText xtext) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➡️ Both approaches are fine. There isn't really a downside to the current approach so I prefer to not make a change here unless it's being done along with other changes.

@@ -550,6 +559,13 @@ private static string TrimCrefPrefix(string value)
private static void AppendTextFromTextNode(FormatterState state, XText element)
{
var rawText = element.Value;
if ((state.Style & TaggedTextStyle.PreserveWhitespace) == TaggedTextStyle.PreserveWhitespace)
{
// Don't normalize code from middle. Only trim leading/trailing new lines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why trim the leading/trailing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

➡️ This implements a fix which a test revealed:
#53140 (comment)

@sharwell sharwell merged commit c177fa8 into dotnet:release/dev16.11 Jun 24, 2021
@Youssef1313 Youssef1313 deleted the quick-info-cdata branch June 24, 2021 18:32
@Youssef1313
Copy link
Member Author

@sharwell Can you close the linked issues? Thanks!

@sharwell
Copy link
Member

@Youssef1313 Already done thanks 🎉🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Approved to merge Area-IDE Community The pull request was submitted by a contributor who is not a Microsoft employee.
Projects
None yet
4 participants