Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HOLD for payment 2023-11-30] [$2000] Newline is removed from code block with a character before it #16982

Closed
6 tasks
kavimuru opened this issue Apr 5, 2023 · 76 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Internal Requires API changes or must be handled by Expensify staff Weekly KSv2

Comments

@kavimuru
Copy link

kavimuru commented Apr 5, 2023

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Action Performed:

  1. Send this message to a chat
    |
test

Case 1:

  1. Copy the whole message by ctrl/cmd+c
  2. Paste it to the composer
  3. Notice the ``` is at the same line as |
    This only happens with | character
    Case 2:
  4. Edit the message
  5. Notice the ``` is at the same line as |
    This happens to all character

Expected Result:

```should be on its own line

Actual Result:

` is at the same line as | because the newline is removed

Workaround:

unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android / native
  • Android / Chrome
  • iOS / native
  • iOS / Safari
  • MacOS / Chrome / Safari
  • MacOS / Desktop

Version Number: 1.2.95-0
Reproducible in staging?: y
Reproducible in production?: y
If this was caught during regression testing, add the test name, ID and link from TestRail:
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Notes/Photos/Videos: Any additional supporting documentation

Screen.Recording.2023-04-05.at.12.39.34.mov
Recording.142.mp4

Expensify/Expensify Issue URL:
Issue reported by: @bernhardoj
Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1680669746873359

View all open jobs on GitHub

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01a8161082bc95b676
  • Upwork Job ID: 1643736935146340352
  • Last Price Increase: 2023-05-02
@kavimuru kavimuru added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Apr 5, 2023
@MelvinBot
Copy link

Triggered auto assignment to @twisterdotcom (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

@MelvinBot
Copy link

MelvinBot commented Apr 5, 2023

Bug0 Triage Checklist (Main S/O)

  • This "bug" occurs on a supported platform (ensure Platforms in OP are ✅)
  • This bug is not a duplicate report (check E/App issues and #expensify-bugs)
    • If it is, comment with a link to the original report, close the issue and add any novel details to the original issue instead
  • This bug is reproducible using the reproduction steps in the OP. S/O
    • If the reproduction steps are clear and you're unable to reproduce the bug, check with the reporter and QA first, then close the issue.
    • If the reproduction steps aren't clear and you determine the correct steps, please update the OP.
  • This issue is filled out as thoroughly and clearly as possible
    • Pay special attention to the title, results, platforms where the bug occurs, and if the bug happens on staging/production.
  • I have reviewed and subscribed to the linked Slack conversation to ensure Slack/Github stay in sync

@twisterdotcom twisterdotcom added the External Added to denote the issue can be worked on by a contributor label Apr 5, 2023
@melvin-bot melvin-bot bot changed the title Newline is removed from code block with a character before it [$1000] Newline is removed from code block with a character before it Apr 5, 2023
@MelvinBot
Copy link

Job added to Upwork: https://www.upwork.com/jobs/~01a8161082bc95b676

@MelvinBot
Copy link

Current assignee @twisterdotcom is eligible for the External assigner, not assigning anyone new.

@MelvinBot
Copy link

Triggered auto assignment to Contributor-plus team member for initial proposal review - @0xmiroslav (External)

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Apr 5, 2023
@MelvinBot
Copy link

Triggered auto assignment to @PauloGasparSv (External), see https://stackoverflow.com/c/expensify/questions/7972 for more details.

@ImBIOS
Copy link

ImBIOS commented Apr 6, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue.

The ``` is at the same line as | when we write it in a different line. We wanted to make it still in their own line.

What is the root cause of that problem?

In Markdown, the | symbol is used to separate columns in a table, and it is not intended to have a line break when it is copied and pasted. When you copy and paste a table created in Markdown, the | symbol is used to indicate the boundaries between columns, and it is preserved in the copied text.

What changes do you think we should make in order to solve the problem?

image

We need to manually check if this symbol exists and add a <br /> in HTML and \n in text format.

What alternative solutions did you explore? (Optional)

Reminder: Please use plain English, be brief, and avoid jargon. Feel free to use images, charts, or pseudo-code if necessary. Do not post large multi-line diffs or write walls of text. Do not create PRs unless you have been hired for this job.

There are several libraries out there to handle this problem, but it requires more changes in the code and I didn't recommend this method.

@JMGCode
Copy link

JMGCode commented Apr 6, 2023

Please re-state the problem that we are trying to solve in this issue.
The ``` is at the same line as | when we write it in a different line. We wanted to make it still in their own line.

What is the root cause of that problem?
It's a bad regex on the ExpensiMark replaceBlockWithNewLine function

What changes do you think we should make in order to solve the problem?
change regex

Solution video
https://drive.google.com/file/d/1endHanNM9x47XcqpmRXvB-SgS5l4Kw63/view?usp=share_link

Complete
https://drive.google.com/file/d/1_9Lf-JX49tW4FE1NR8KVoVL7mN8KJa19/view?usp=share_link

@MelvinBot
Copy link

📣 @JMGCode! 📣

Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  2. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  3. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.

Screen Shot 2022-11-16 at 4 42 54 PM

Format:

Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@JMGCode
Copy link

JMGCode commented Apr 6, 2023

Contributor details
Your Expensify account email: jmgcode@gmail.com
Upwork Profile Link: https://www.upwork.com/freelancers/~01471144f7a600d0ea

@MelvinBot
Copy link

✅ Contributor details stored successfully. Thank you for contributing to Expensify!

@eh2077
Copy link
Contributor

eh2077 commented Apr 6, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue.

Two issues need to solve

  1. Copy and paste code block message wth pipe character before misses a line break
  2. '```' isn't displayed in new line in editing mode

What is the root cause of that problem?

Click to see RCA of the first issue

From Web(Chrome/Safari), when we copy message through ctrl/cmd+c, we set html selection to clipboard here.

We send following two new comments to dig the root cause

|
```
code
```
A
```
code
```

like

image

We add a log before this line and copy the above message through ctrl/cmd+c. The selection html is shown as below

image

If we paste in composer, then we convert html to markdown here

HTML selection

<comment><div>| </div><pre><span>code</span></pre></comment>

is converted into markdown

| ```
code
```

While HTML selection

<comment><div>A </div><pre><span>code</span></pre></comment>

is converted into markdown

A 
```
code
```

To figure out why the pipe | character case behaves differently, we can have a closer look at htmlToMarkdown method. The div tag(replaced with [block]) should also be translated into a line break for pipe | character case in replaceBlockWithNewLine. But there's a flaw in this regex that it can match pipe | character. So text.match(/[\n|>][>]?[\s]?$/) returns true and skips to add a line break. See regex testing picture below

image

The above is the RCA of copy and paste issue for pipe | character case.

Let's move forward to dig the line break issue in editing mode.

As the line break missing issue in editing mode is same for having pipe | character case, let's send following message to dig the root cause

A
```
code
```

We translate markdown text into html here and save following message html to backend.

A <pre>code<br /></pre>

When replacing markdown to html, we remove the line break after A and replace it with a space ' ' using this rule. That's why there's an extra space after A in the message html.

When editing a sent message, we call htmlToMarkdown to convert message html to markdown text here. The message html of above sent comment is

A <pre>code<br /></pre>

which is much simpler than the html actually rendered in the page.

In method htmlToMarkdown, the above message html is translated into following markdown text by applying newline rule and codeFence rule.

A ```
code
```

The root cause is the way we translate codefence from markdown-to-html and from html-to-markdown.

When digging the root cause, I went through this closed issue #12783 which explains the root cause and ends up do nothing.

What changes do you think we should make in order to solve the problem?

To fix the first issue for pipe | character case, we can remove the unnecessary pipe | character in this regex.

To fix the second issue, we can improve the regex rule to translate pre tag into code fence markdown. We handle line break before <pre> by improving the codeFence rule.

The key idea is to sort out cases before the <pre> tag and handle line break of each case.

The improved regex rule will look like

{
    name: 'codeFence',
    // We use to (<\/(\w+)>| )? to capture the closing tag or a space before <pre> tag
    // if g1 is undefined then the prefix is empty string
    regex: /(<\/(\w+)>| )?<(pre)(?:"[^"]*"|'[^']*'|[^'">])*>([\s\S]*?)(\n?)<\/\3>(?![^<]*(<\/pre>|<\/code>))/gi,
    replacement: (match, g1, g2, g3, g4) => {
        let prefix = '';
        if (g1) {
            if (g1 === ' ') {
                // We're using a trick to handle line break before <pre> tag, see 'replacebr' rule. It's like a magic to make line break between <pre> behave consistently on Web and native platforms.  Note that only text copied from Expensify App has space before <pre> tag. 
                // see https://meliorence.github.io/react-native-render-html/api/renderhtmlprops#enableexperimentalbrcollapsing
                // see https://github.com/Expensify/App/blob/91e2120e647aa9ee22171b6dbb772b4aebe64992/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js#L16
                prefix = '\n';
            } else if (/div|comment|h1|h2|h3|h4|h5|h6|p|li|blockquote/i.test(g2)) {
                // We just add the closing block type tag back
                prefix = `</${g2}>`;
            } else {
                // g1 is inline closing tag and we insert a line break after it
                prefix = `</${g2}>\n`;
            }
        }
        return `${prefix}\`\`\`\n${g4}\n\`\`\``;
    },
}

By doing so, we can pass all existing tests of Expensify-common and solve a bunch of twists related to this feature.

What alternative solutions did you explore? (Optional)

I also agree with @iwiznia to save the original text entered by user to solve this issue as he commented here.

@0xmiros
Copy link
Contributor

0xmiros commented Apr 9, 2023

There are 2 issues with different root causes.
Copy paste issue (Case 1) is related to | character only.
Edit issue (Case 2) happens on all characters.
@PauloGasparSv @twisterdotcom should we also consider Case 2 as bug and fix it here?

@melvin-bot melvin-bot bot added the Overdue label Apr 12, 2023
@PauloGasparSv
Copy link
Contributor

@PauloGasparSv @twisterdotcom should we also consider Case 2 as bug and fix it here?

I don't think we should consider fixing either anymore!

I just checked all proposals and @eh2077 found an older issue from 2022 where we already discussed Case 2. It links to a discussion on slack we had on both bugs (it also mentions other discussions we had on those) and back then we decided to do nothing! here:

Asked around to more people and we decided that:
so long as the edited version is semantically the same as the original (eg, both produce the same HTML), it's not a bug.
So, with that, going to close the issues I linked above.

Because of that I think we should close the issue and not fix anything, ok @twisterdotcom?
(Also, I'm not sure if @eh2077 proposal/investigation should be compensated since it helped decide what to do here)

@melvin-bot melvin-bot bot removed the Overdue label Apr 12, 2023
@0xmiros
Copy link
Contributor

0xmiros commented Apr 12, 2023

@PauloGasparSv that old issue was reported by me 🙂 and I was already aware of that . I double confirmed because it's stated in the issue description as Case 2. If we still don't wanna fix it, let's remove that case.

@PauloGasparSv
Copy link
Contributor

PauloGasparSv commented Apr 12, 2023

Didn't notice that @0xmiroslav :D

I think that both case 1 and case 2 were discussed in the slack threads linked by the issue you created (they mention pasting and editing)!

So I think we should fix nothing here as we already discussed both cases and nothing changed, does that make sense?

@0xmiros
Copy link
Contributor

0xmiros commented Apr 12, 2023

@PauloGasparSv makes sense, but that was old discussion. Can you confirm again on slack before close?

Here's inconsistency issue: when select text before copy and without select

inconsistency.mov

@bernhardoj
Copy link
Contributor

bernhardoj commented Apr 12, 2023

Regardless whether we want to handle the 2nd issue, I think we should handle the 1st case as it's only happening to | character which is not consistent with other character and @eh2077 already have a straightforward and simple solution to it.

@eh2077
Copy link
Contributor

eh2077 commented Apr 12, 2023

@PauloGasparSv Thanks for reviewing proposals!

But I think we should at least fix this regex demonstrated by case 1 as the pipe | character is not needed in the regex.

@0xmiros
Copy link
Contributor

0xmiros commented Apr 12, 2023

Personally, I think we should line break always at the start and end of code block. No reason we should do nothing despite of inconsistency between start and end, | and other characters, copy with selection and from context menu.

@eh2077
Copy link
Contributor

eh2077 commented May 9, 2023

@twisterdotcom I think we should fix the first case about the pipe | character.

Hmm, do I have luck to get some compensation for investigating the issue as @PauloGasparSv mentioned in comment #16982 (comment) ?

@twisterdotcom
Copy link
Contributor

I've sent an offer of $250 as a bonus for the investigation @eh2077. Thanks for your work on this here.

@eh2077
Copy link
Contributor

eh2077 commented May 9, 2023

@twisterdotcom thanks for the offer! Accepted.

@ahmdshrif
Copy link
Contributor

ahmdshrif commented May 10, 2023

conclusion from that old discussion is that so long as the edited version is semantically the same as the original (eg, both produce the same HTML), it's not a bug.

@cead22 @twisterdotcom
I respectfully disagree with the conclusion that as long as the edited version produces the same HTML as the original, it is not a bug. This is because when converting HTML to Markdown and back again, there can be issues that affect the editing process.

To elaborate, when the content is initially copied from an external resource such as GitHub, World, CodeEditor, etc. This content is in HTML format, which is then converted to Markdown so that the user can edit it in the composer. When the user submits the edited content, it is then converted back to HTML, when editing we again use HTML to markDown display markdown again in Composer.

During the editing process, the content goes through multiple conversions between HTML and Markdown, which can introduce bugs and issues that affect the editing experience. For example, if there is a bug in the conversion process from HTML to Markdown when editing, this bug will also appear in the initial conversion from HTML to Markdown (when copying from another resource).

In my proposal, I identified that the root cause of the issue was that we were not considering code blocks as inline blocks, unlike other inline-block elements. We already have a function to handle most inline blocks, and we can easily modify this function to include code blocks as well in one line.

@cead22
Copy link
Contributor

cead22 commented May 10, 2023

@ahmdshrif thanks for the input, but I'm not sure I fully understand it. If you want to continue the discussion, can you perhaps rephrase the first 3 paragraphs and use concrete examples to illustrate your point? Specifically around the following statements

This is because when converting HTML to Markdown and back again, there can be issues that affect the editing process.

During the editing process, the content goes through multiple conversions between HTML and Markdown, which can introduce bugs and issues that affect the editing experience

As for the one below, seeing as we've agreed that if the edited version produces the same HTML as the original then that's fine, what are you defining here as "the issue"?

In my proposal, I identified that the root cause of the issue

@ahmdshrif
Copy link
Contributor

ahmdshrif commented May 14, 2023

As for the one below, seeing as we've agreed that if the edited version produces the same HTML as the original then that's fine, what are you defining here as "the issue"?

In general:

  • If we ignore small problems like these, they can accumulate and lead to bigger problems later on. For example, if one HTML rule deletes a line and we overlook it, and another rule deletes space from the end of the line and we also overlook it, then these two minor problems can combine and result in two lines being concatenated together. That's why I think we should fix small problems as soon as we find them.
  • Another point to consider is that we do many conversions between HTML and Markdown, and a problem may not appear in the first conversion but could surface after several conversions. This is because minor problems can compound over time.
  • In mobile, when we click copy, we copy the Markdown version, not the HTML. This can lead to a bad user experience if the user enters something and later finds something else when editing or copying.
  • We can assume some cases, such as if the first line is a quote or header, and consider what the result will be in such cases.

In our example here:

  • The problem we have is that we deleted the line before three backticks, which means there is a problem with the code block rule. The line before could be a header or quote, or something else that can cause the three backticks to be interpreted as part of the text.
  • Additionally, we should only map the code block delimiters ``` to a code block if they appear on their own line not with other text. (I can report it as a separate problem since it's not causing this issue but it's fixed this issue by mistake )

here test, when I copy between Expensify and GitHub, shows why the new line can change markdown.

Screen.Recording.2023-05-14.at.12.23.31.PM.mov

what do you think? @cead22

@cead22
Copy link
Contributor

cead22 commented May 15, 2023

re: the in general section, it seems like we're still talking about hypotheticals, so it's hard to engage with those points without having concrete examples

re: the video you're showing, perhaps that's where the disconnect is. IMO, we care that the resulting HTML is the same in new dot, not in any other app.

@ahmdshrif
Copy link
Contributor

ahmdshrif commented May 16, 2023

@cead22 What do you think about the opposite case? when copying this case from another resource to a new dot do we care about it?
here is the example I include in GitHub and Slack :

Screen.Recording.2023-05-16.at.2.50.22.PM.mov

In Markdown, we should add the ``` in a separate line to create a code block; otherwise, it will be rendered as text. What is the reason for using different Markdown rules?

here is a test of rendering code block with and without text before triple backticks :

Screen.Recording.2023-05-16.at.3.17.10.PM.mov

@cead22
Copy link
Contributor

cead22 commented May 16, 2023

@cead22 What do you think about the opposite case? when copying this case from another resource to a new dot do we care about it?
What is the reason for using different Markdown rules?

Can you ask these in #expensify-open-source?

@tienifr
Copy link
Contributor

tienifr commented Jun 5, 2023

I encountered this issue a few days ago, it's happening with all the text, not excluded to | character. IMO it should be re-opened and fixed since it really cause confusion for users editing the message.

@0xmiros
Copy link
Contributor

0xmiros commented Jul 19, 2023

I think this should be re-opened.
Because of this inconsistency, more critical bugs happening.
i.e.

cc: @cead22 @twisterdotcom

@twisterdotcom
Copy link
Contributor

Can we wait on the outcome of the investigation in #18928? I don't think we should double up work here.

@0xmiros
Copy link
Contributor

0xmiros commented Oct 13, 2023

@twisterdotcom In #18928, they didn't fix the real root cause but applied workaround

@nikos-amofa
Copy link
Contributor

@0xmiroslav,
The proposal for #26490 fixes this issue
Is it okay to create a PR? If so, what issue can I reference? #16982 or #26490?

Screen.Recording.2023-10-13.at.12.31.46.PM.mov

@cead22
Copy link
Contributor

cead22 commented Oct 13, 2023

26490

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Daily KSv2 Weekly KSv2 labels Nov 21, 2023
@melvin-bot melvin-bot bot changed the title [$2000] Newline is removed from code block with a character before it [HOLD for payment 2023-11-30] [$2000] Newline is removed from code block with a character before it Nov 23, 2023
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Nov 23, 2023
Copy link

melvin-bot bot commented Nov 23, 2023

Reviewing label has been removed, please complete the "BugZero Checklist".

Copy link

melvin-bot bot commented Nov 23, 2023

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.2-3 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2023-11-30. 🎊

After the hold period is over and BZ checklist items are completed, please complete any of the applicable payments for this issue, and check them off once done.

  • External issue reporter
  • Contributor that fixed the issue
  • Contributor+ that helped on the issue and/or PR

For reference, here are some details about the assignees on this issue:

Copy link

melvin-bot bot commented Nov 23, 2023

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@0xmiroslav] The PR that introduced the bug has been identified. Link to the PR:
  • [@0xmiroslav] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:
  • [@0xmiroslav] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:
  • [@0xmiroslav] Determine if we should create a regression test for this bug.
  • [@0xmiroslav] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
  • [@twisterdotcom] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Internal Requires API changes or must be handled by Expensify staff Weekly KSv2
Projects
None yet
Development

No branches or pull requests