[HOLD for payment 2023-11-06] [$500] Chat - Email pattern is not recognized when sending a comment #28629

Closed
6 tasks done
lanitochka17 opened this issue Oct 2, 2023 · 46 comments
Assignees
Labels
  • Awaiting Payment: Auto-added when associated PR is deployed to production
  • Bug: Something is broken. Auto assigns a BugZero manager.
  • Daily KSv2
  • Engineering
  • External: Added to denote the issue can be worked on by a contributor

Comments

@lanitochka17

lanitochka17 commented Oct 2, 2023

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Action Performed:

  1. Go to any report
  2. Type `code`email@gmail.com in the compose box (without any space)
  3. Notice that the email is not recognized as an email link

Expected Result:

The email pattern should be recognized and styled as a blue link

Actual Result:

Email pattern is not recognized

Workaround:

Unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android / native
  • Android / Chrome
  • iOS / native
  • iOS / Safari
  • MacOS / Chrome / Safari
  • MacOS / Desktop

Version Number: 1.3.75-11

Reproducible in staging?: Yes

Reproducible in production?: Yes

If this was caught during regression testing, add the test name, ID and link from TestRail:

Email or phone of affected tester (no customers):

Logs: https://stackoverflow.com/c/expensify/questions/4856

Notes/Photos/Videos: Any additional supporting documentation

ios-safari_d.mp4
macos-desktop.mov
mweb_d.mp4
android-native_d.mp4
ios-native_d.mp4
macos-web_d.mp4
Recording.124.mp4

Expensify/Expensify Issue URL:

Issue reported by: @tsa321

Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1696219306455269


Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~0181685d1a16ff4d31
  • Upwork Job ID: 1708873970110574592
  • Last Price Increase: 2023-10-16
  • Automatic offers:
    • eh2077 | Contributor | 27335685
    • tsa321 | Reporter | 27335687
@lanitochka17 lanitochka17 added External Added to denote the issue can be worked on by a contributor Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Oct 2, 2023
@melvin-bot melvin-bot bot changed the title Chat - Email pattern is not recognized when sending a comment [$500] Chat - Email pattern is not recognized when sending a comment Oct 2, 2023
@melvin-bot

melvin-bot bot commented Oct 2, 2023

Job added to Upwork: https://www.upwork.com/jobs/~0181685d1a16ff4d31

@melvin-bot

melvin-bot bot commented Oct 2, 2023

Triggered auto assignment to @Christinadobrzyn (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Oct 2, 2023
@melvin-bot

melvin-bot bot commented Oct 2, 2023

Bug0 Triage Checklist (Main S/O)

  • This "bug" occurs on a supported platform (ensure Platforms in OP are ✅)
  • This bug is not a duplicate report (check E/App issues and #expensify-bugs)
    • If it is, comment with a link to the original report, close the issue and add any novel details to the original issue instead
  • This bug is reproducible using the reproduction steps in the OP. S/O
    • If the reproduction steps are clear and you're unable to reproduce the bug, check with the reporter and QA first, then close the issue.
    • If the reproduction steps aren't clear and you determine the correct steps, please update the OP.
  • This issue is filled out as thoroughly and clearly as possible
    • Pay special attention to the title, results, platforms where the bug occurs, and if the bug happens on staging/production.
  • I have reviewed and subscribed to the linked Slack conversation to ensure Slack/Github stay in sync

@melvin-bot

melvin-bot bot commented Oct 2, 2023

Triggered auto assignment to Contributor-plus team member for initial proposal review - @jjcoffee (External)

@Christinadobrzyn
Contributor

I can reproduce this. Keeping External label.

@ZhenjaHorbach
Contributor

ZhenjaHorbach commented Oct 2, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue

Email pattern is not recognized and the email is not always displayed correctly after the tags (<code> in this case)

What is the root cause of that problem?

The autoEmail regex that we use from the common library is imperfect:

https://github.com/Expensify/expensify-common/blob/ab4895807dd9a26f64bfaee80db15ee2c48a5124/lib/ExpensiMark.js#L127-L135

What changes do you think we should make in order to solve the problem?

We can update the regex for autoEmail. For example:

https://github.com/Expensify/expensify-common/blob/ab4895807dd9a26f64bfaee80db15ee2c48a5124/lib/ExpensiMark.js#L130-L133

(?![^<]>|[^<>]<\/(?!em)) ([^\\w'#%+-\`]|||)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\/a>|[^<]*(<\/pre>|<\/code>))

Screen.Recording.2023-10-02.at.19.23.32.mov

@akamefi202
Contributor

akamefi202 commented Oct 2, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue.

Email pattern is not recognized when sending a comment with inline code & email.

What is the root cause of that problem?

When a comment is sent, the ExpensiMark class in the expensify-common repo translates it into HTML.

`code`test@test.com

If the user sends the above comment, the inlineCodeBlock regex processes the text first:

<code>code</code>test@test.com

Then the autoEmail regex blocks the replacement if any tag other than <em> appears just before the email.

(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))

https://github.com/Expensify/expensify-common/blob/ab4895807dd9a26f64bfaee80db15ee2c48a5124/lib/ExpensiMark.js#L130-L134

What changes do you think we should make in order to solve the problem?

We need to update the autoEmail regex to allow a </code> or </pre> tag just before the email.

(?![^<]*>|[^<>]*<\\/(?!em|code|pre))([^\\w'#%+-]|^|<em>|<\\/code>|<\\/pre>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))

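I suggest replacing <\\/(?!em) with <\\/(?!em|code|pre) and adding <\\/code> and <\\/pre> as alternatives in the first capturing group.
In other words, replace (?![^<]*>|[^<>]*<\\/(?!em)) with (?![^<]*>|[^<>]*<\\/(?!em|code|pre)), and ([^\\w'#%+-]|^|<em>) with ([^\\w'#%+-]|^|<em>|<\\/code>|<\\/pre>).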

The previous autoEmail regex blocks the replacement if any closing tag other than </em> appears just before the email.
The updated regex allows the replacement when a </em>, </pre>, or </code> tag appears just before the email.

`code`test@test.com => <code>code</code>test@test.com

The previous regex produces the output above, so the email pattern is not recognized.

`code`test@test.com => <code>code</code><a href="mailto:test@test.com">test@test.com</a>

The updated regex produces the output above, so the email pattern is recognized.

I've confirmed with Jest tests that this fix works without any regressions.
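The rule ordering described in this proposal can be sketched roughly as follows. This is only an illustration under stated assumptions: EMAIL is a simplified stand-in for CONST.REG_EXP.MARKDOWN_EMAIL, the inlineCodeBlock pattern is a simplified guess at the real rule, and the replacement string is modelled on the expected output quoted above. None of these are copied from expensify-common.

```javascript
// Simplified stand-ins (assumptions): the real inlineCodeBlock rule and
// CONST.REG_EXP.MARKDOWN_EMAIL in expensify-common are more involved.
const EMAIL = "([\\w.'#%+-]+@[a-z\\d-]+(?:\\.[a-z\\d-]+)+)";
const TAIL = '(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))';
const inlineCodeBlock = {regex: /`([^`\n]+)`/g, replacement: '<code>$1</code>'};

// Current autoEmail pattern, as quoted in the proposal.
const autoEmailCurrent = {
    regex: new RegExp(`(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${EMAIL}${TAIL}`, 'gim'),
    replacement: '$1<a href="mailto:$2">$2</a>',
};

// Proposed pattern: </code> and </pre> are allowed directly before the email.
const autoEmailProposed = {
    regex: new RegExp(
        `(?![^<]*>|[^<>]*<\\/(?!em|code|pre))([^\\w'#%+-]|^|<em>|<\\/code>|<\\/pre>)${EMAIL}${TAIL}`,
        'gim',
    ),
    replacement: '$1<a href="mailto:$2">$2</a>',
};

// Apply the rules in order: inline code first, then autoEmail.
const render = (text, autoEmail) =>
    [inlineCodeBlock, autoEmail].reduce((html, rule) => html.replace(rule.regex, rule.replacement), text);

console.log(render('`code`test@test.com', autoEmailCurrent));
// <code>code</code>test@test.com
console.log(render('`code`test@test.com', autoEmailProposed));
// <code>code</code><a href="mailto:test@test.com">test@test.com</a>
console.log(render('`test@test.com`', autoEmailProposed));
// <code>test@test.com</code>
```

The last call checks that an email inside inline code is still left alone under the proposed pattern.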

What alternative solutions did you explore? (Optional)

N/A

@melvin-bot melvin-bot bot added the Overdue label Oct 5, 2023
@Christinadobrzyn
Contributor

reviewing proposals

@melvin-bot melvin-bot bot added Overdue and removed Overdue labels Oct 5, 2023
@melvin-bot

melvin-bot bot commented Oct 9, 2023

@jjcoffee, @Christinadobrzyn Uh oh! This issue is overdue by 2 days. Don't forget to update your issues!

@jjcoffee
Contributor

jjcoffee commented Oct 9, 2023

@ZhenjaHorbach @akamefi202 Could you add extra detail to your proposals explaining what you're proposing to add to the regex and why it works?

@melvin-bot melvin-bot bot removed the Overdue label Oct 9, 2023
@melvin-bot

melvin-bot bot commented Oct 9, 2023

📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸

@akamefi202
Contributor

@jjcoffee I updated the proposal with more details. Please review again.
#28629 (comment)

@melvin-bot melvin-bot bot added the Overdue label Oct 12, 2023
@Christinadobrzyn
Contributor

hey @jjcoffee could you check out the revised proposal #28629 (comment)

Or let me know if you'd like to see more proposals.

@melvin-bot melvin-bot bot removed the Overdue label Oct 12, 2023
@jjcoffee
Contributor

@Christinadobrzyn Sorry I haven't had a chance to test the updated proposal properly yet. I will add it to my list for Monday!

@eh2077
Contributor

eh2077 commented Oct 15, 2023

Proposal

Please re-state the problem that we are trying to solve in this issue.

Email pattern is not recognized when sending a comment.

What is the root cause of that problem?


The root cause of this issue is that the regex of autoEmail

`(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))`,

can't match the following intermediate text (after inlineCodeBlock has been applied)

<code>code</code>test@gmail.com

In the starting negative lookahead group, (?![^<]*>|[^<>]*<\/(?!em)), the first alternative, [^<]*>, rejects the candidate match >test@gmail.com because it matches the leading > character. As a result, the email is not rendered as an anchor tag.
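A minimal sketch of this check, with the leading negative lookahead pulled out of the autoEmail regex and anchored with ^ so it can be tested at a single position. The variable names are illustrative and not taken from expensify-common.

```javascript
// The leading negative lookahead from the current autoEmail regex, tested in isolation.
const leadingLookahead = /^(?![^<]*>|[^<>]*<\/(?!em))/;

const html = '<code>code</code>test@gmail.com';
// The character just before the email is the ">" that closes </code>.
const fromClosingBracket = html.slice(html.indexOf('>test@'));

console.log(fromClosingBracket);                        // >test@gmail.com
console.log(leadingLookahead.test(fromClosingBracket)); // false: "[^<]*>" matches the bare ">"
console.log(leadingLookahead.test(' test@gmail.com'));  // true: no ">" or closing tag ahead
```

Because a lookahead only inspects text to the right of the current position, slicing the string at that position gives the same answer the full regex would get there.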

What changes do you think we should make in order to solve the problem?

Main solution with minimum change:

Based on the RCA above, the regex piece [^<]*> in the starting negative lookahead group exists to skip text inside a tag, such as the opening tag of an anchor:

<a href="https://staging.new.expensify.com/details/test@expensify.com" target="_blank" rel="noreferrer noopener">https://staging.new.expensify.com/details/test@expensify.com</a>

The match we want to skip should contain at least one character matching [^<] before the closing >.

To fix this issue, we can change the regex piece [^<]*> to [^<]+>, so the new regex will be

`(?![^<]+>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))`,

With this change, all test cases of the expensify-common lib pass.
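To see the effect of changing * to +, here is a hedged sketch that builds both variants of the regex. EMAIL is a deliberately simplified stand-in for CONST.REG_EXP.MARKDOWN_EMAIL, and the replacement string is an assumption modelled on the anchor output expected in this issue. Neither is copied from the library, and linkify is a hypothetical helper.

```javascript
// Simplified stand-in for CONST.REG_EXP.MARKDOWN_EMAIL (assumption, not the library's pattern).
const EMAIL = "([\\w.'#%+-]+@[a-z\\d-]+(?:\\.[a-z\\d-]+)+)";
const TAIL = '(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))';

// The only difference between the two patterns is "[^<]*>" versus "[^<]+>".
const current = new RegExp(`(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${EMAIL}${TAIL}`, 'gim');
const proposed = new RegExp(`(?![^<]+>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${EMAIL}${TAIL}`, 'gim');

// Hypothetical helper: wrap every matched email in a mailto anchor.
const linkify = (regex, html) => html.replace(regex, '$1<a href="mailto:$2">$2</a>');

const afterCode = '<code>code</code>test@gmail.com';
console.log(linkify(current, afterCode));
// <code>code</code>test@gmail.com
console.log(linkify(proposed, afterCode));
// <code>code</code><a href="mailto:test@gmail.com">test@gmail.com</a>
console.log(linkify(proposed, '<code>test@expensify.com</code>'));
// <code>test@expensify.com</code>
```

With +, a bare > directly before the email no longer satisfies the first lookahead alternative, while the trailing lookahead still keeps emails inside code tags from being linked.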

What alternative solutions did you explore? (Optional)

If we dive deeper into the regex

`(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))`,

we find that the starting negative lookahead group is redundant because the ending negative lookahead group is already sufficient to skip invalid matches. For example, its alternative ((?:(?!<a).)+)?<\\/a> skips emails in both the href attribute and the text of an anchor.

Let's split the regex

`(?![^<]*>|[^<>]*<\\/(?!em))([^\\w'#%+-]|^|<em>)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>))`,

into 4 parts and discuss them one by one.

  1. The starting negative lookahead group, (?![^<]*>|[^<>]*<\\/(?!em)), which includes two alternative groups

    1. The first alternative, [^<]*>, checks whether the match is inside angle brackets and, if so, skips it. Currently, only an <a> tag can include an email inside its opening tag. The 4th part already handles this case (explained below), so we can remove this alternative to simplify the regex.

    2. The second alternative, [^<>]*<\\/(?!em), checks whether the match is inside a tag whose name is not em and, if so, skips it, which means an email inside an <em> tag is accepted. Note that the italic rule is applied before the autoEmail rule. This piece serves a similar purpose to [^<]*(<\\/pre>|<\\/code>) from the ending negative lookahead group. Although it is less straightforward, it can skip emails inside <pre>, <code> and <mention-user> tags, so we can use it to simplify the regex.
  2. The first capturing group, ([^\\w'#%+-]|^|<em>), which serves as an anchor point for the subsequent email capturing group so that the length of the email can be limited. The group has three alternatives:

    1. [^\\w'#%+-] is an illegal character of email
    2. ^ is the position at the start of a line
    3. <em> matches the opening em tag. This alternative is redundant because the closing angle bracket already matches [^\\w'#%+-], so we can remove it to simplify the regex.
  3. The second capturing group to capture email, ${CONST.REG_EXP.MARKDOWN_EMAIL}. The email regex is (?=((?=[\w'#%+-]+(?:\.[\w'#%+-]+)*@)[\w\.'#%+-]{1,64}@(?:(?=[a-z\d]+(?:-+[a-z\d]+)*\.)(?:[a-z\d-]{1,63}\.)+[a-z]{2,63})(?= |_|\b))(?<end>.*))\S{3,254}(?=\k<end>$).

  4. The ending negative lookahead group, (?!((?:(?!<a).)+)?<\\/a>|[^<]*(<\\/pre>|<\\/code>)), which includes two alternative groups

    1. The first alternative, ((?:(?!<a).)+)?<\\/a>, checks whether the match is inside an <a> tag and, if so, skips it. Why not use a similar regex, like [^<]*(<\\/a>|<\\/pre>|<\\/code>), to handle the <a> tag? Because <a> can contain child tags, such as <em>, <strong> and <del>. This alternative skips emails in both the href attribute and the text of the anchor.

    2. The second alternative, [^<]*(<\\/pre>|<\\/code>), checks whether the match is inside a <pre> or <code> tag. We can replace it with the more concise piece, [^<>]*<\\/(?!em), from part 1.

Combining the above discussion, we arrive at the simplified regex

`([^\\w'#%+-]|^)${CONST.REG_EXP.MARKDOWN_EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<>]*<\\/(?!em))`

The regex test below compares the optimised regex with the current buggy regex on the same test input.

Optimised regex

/([^\w'#%+-]|^)(?=((?=[\w'#%+-]+(?:\.[\w'#%+-]+)*@)[\w\.'#%+-]{1,64}@(?:(?=[a-z\d]+(?:-+[a-z\d]+)*\.)(?:[a-z\d-]{1,63}\.)+[a-z]{2,63})(?= |_|\b))(?<end>.*))\S{3,254}(?=\k<end>$)(?!((?:(?!<a).)+)?<\/a>|[^<>]*<\/(?!em))/gim

Current regex

/(?![^<]*>|[^<>]*<\/(?!em))([^\w'#%+-]|^|<em>)(?=((?=[\w'#%+-]+(?:\.[\w'#%+-]+)*@)[\w\.'#%+-]{1,64}@(?:(?=[a-z\d]+(?:-+[a-z\d]+)*\.)(?:[a-z\d-]{1,63}\.)+[a-z]{2,63})(?= |_|\b))(?<end>.*))\S{3,254}(?=\k<end>$)(?!((?:(?!<a).)+)?<\/a>|[^<]*(<\/pre>|<\/code>))/gim

Text input

Test match email after tags
<code>code</code>test@gmail.com
<pre>code block</pre>test@gmail.com
<a href="https://google.com" target="_blank" rel="noreferrer noopener">Google</a>test@gmail.com

Test match email inside em tag
<em>test@gmail.com</em>
<em>test

test@gmail.com</em>

Test skip match email inside tags
<code>test@expensify.com</code>
<pre>test@expensify.com</pre>
<mention-user>@test@expensify.com</mention-user>
<em><mention-user>@username@expensify.com</mention-user></em>

Test skip match email inside anchor tag
<a href="https://staging.new.expensify.com/details/test@expensify.com" target="_blank" rel="noreferrer noopener">https://staging.new.expensify.com/details/test@expensify.com</a>
<a href="https://google.com" target="_blank" rel="noreferrer noopener">test italic style wrap email <em>test@gmail.com</em> inside a link</a>

Markdown input

Test match email after tags
`code`test@gmail.com
```code block```test@gmail.com
[Google](https://google.com)test@gmail.com

Test match email inside em tag
_test@gmail.com_
_test

test@gmail.com_

Test skip match email inside tags
`test@expensify.com`
```test@expensify.com```
@test@expensify.com
_@username@expensify.com_

Test skip match email inside anchor tag
[https://staging.new.expensify.com/details/test@expensify.com](https://staging.new.expensify.com/details/test@expensify.com)
[test italic style wrap email _test@gmail.com_ inside a link](https://google.com)

The optimised regex fixes the issue and passes all tests of Expensify-common.
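A few of the cases above can be replayed against the shape of the simplified regex with a short script. This is a sketch under assumptions: EMAIL is a simplified stand-in for the full email pattern, the replacement string is modelled on the anchor output expected in this issue, linkify is a hypothetical helper, and the anchor input is a shortened variant of the one listed above.

```javascript
// Simplified stand-in for CONST.REG_EXP.MARKDOWN_EMAIL (assumption, not the library's pattern).
const EMAIL = "([\\w.'#%+-]+@[a-z\\d-]+(?:\\.[a-z\\d-]+)+)";

// Shape of the simplified autoEmail regex: no leading lookahead, one trailing lookahead.
const simplified = new RegExp(
    `([^\\w'#%+-]|^)${EMAIL}(?!((?:(?!<a).)+)?<\\/a>|[^<>]*<\\/(?!em))`,
    'gim',
);

// Hypothetical helper: wrap every matched email in a mailto anchor.
const linkify = (html) => html.replace(simplified, '$1<a href="mailto:$2">$2</a>');

// Email directly after a closing tag: linked.
console.log(linkify('<code>code</code>test@gmail.com'));
// <code>code</code><a href="mailto:test@gmail.com">test@gmail.com</a>

// Email inside an em tag: linked.
console.log(linkify('<em>test@gmail.com</em>'));
// <em><a href="mailto:test@gmail.com">test@gmail.com</a></em>

// Email inside a code tag: left alone.
console.log(linkify('<code>test@expensify.com</code>'));
// <code>test@expensify.com</code>

// Email inside an anchor's href: left alone.
console.log(linkify('<a href="https://example.com/details/test@expensify.com">profile</a>'));
// <a href="https://example.com/details/test@expensify.com">profile</a>
```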

Demo video:
0-web.mp4

@melvin-bot melvin-bot bot added the Overdue label Oct 16, 2023
@melvin-bot

melvin-bot bot commented Oct 16, 2023

@jjcoffee @Christinadobrzyn this issue was created 2 weeks ago. Are we close to approving a proposal? If not, what's blocking us from getting this issue assigned? Don't hesitate to create a thread in #expensify-open-source to align faster in real time. Thanks!

@melvin-bot

melvin-bot bot commented Oct 23, 2023

📣 @eh2077 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job
Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻
Keep in mind: Code of Conduct | Contributing 📖

@melvin-bot

melvin-bot bot commented Oct 23, 2023

📣 @tsa321 🎉 An offer has been automatically sent to your Upwork account for the Reporter role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job

@eh2077
Contributor

eh2077 commented Oct 24, 2023

@burczu The PR Expensify/expensify-common#590 of expensify-common is ready for review, thanks!

cc @neil-marcellini

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 and removed Daily KSv2 labels Oct 24, 2023
@eh2077
Contributor

eh2077 commented Oct 24, 2023

The PR Expensify/expensify-common#590 has been approved and merged by the default assigned reviewer.

@burczu I raised PR #30260 for App to bump the version of expensify-common. Please help to review it when you're free, thanks!

@melvin-bot

melvin-bot bot commented Oct 25, 2023

🎯 ⚡️ Woah @burczu / @eh2077, great job pushing this forwards! ⚡️

The pull request got merged within 3 working days of assignment, so this job is eligible for a 50% #urgency bonus 🎉

  • when @eh2077 got assigned: 2023-10-23 18:40:28 Z
  • when the PR got merged: 2023-10-25 17:48:03 UTC

On to the next one 🚀

@melvin-bot melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Weekly KSv2 labels Oct 30, 2023
@melvin-bot melvin-bot bot changed the title [$500] Chat - Email pattern is not recognized when sending a comment [HOLD for payment 2023-11-06] [$500] Chat - Email pattern is not recognized when sending a comment Oct 30, 2023
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Oct 30, 2023
@melvin-bot

melvin-bot bot commented Oct 30, 2023

Reviewing label has been removed, please complete the "BugZero Checklist".

@melvin-bot

melvin-bot bot commented Oct 30, 2023

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.3.92-4 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2023-11-06. 🎊

After the hold period is over and BZ checklist items are completed, please complete any of the applicable payments for this issue, and check them off once done.

  • External issue reporter
  • Contributor that fixed the issue
  • Contributor+ that helped on the issue and/or PR

For reference, here are some details about the assignees on this issue:

@melvin-bot

This comment was marked as outdated.

@burczu
Contributor

burczu commented Oct 31, 2023

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@burczu] The PR that introduced the bug has been identified. Link to the PR: Didn't find any.
  • [@burczu] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment: n/a
  • [@burczu] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion: n/a
  • [@burczu] Determine if we should create a regression test for this bug. I don't think we need regression tests here - the issue didn't break any crucial flows of the App.
  • [@burczu] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
  • [@Christinadobrzyn] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

@Christinadobrzyn
Contributor

hey @burczu or @eh2077, do you have a regression test?

@burczu
Contributor

burczu commented Nov 7, 2023

@Christinadobrzyn As I wrote in my previous comment:

I don't think we need regression tests here - the issue didn't break any crucial flows of the App.

@Christinadobrzyn
Contributor

Payouts due:

Issue Reporter: $50 @tsa321 (paid through Upwork)
Contributor: $500 + $250 @eh2077 (paid through Upwork)
Contributor+: $500 + $250 @burczu (no payment since they are a contractor)

Eligible for 50% #urgency bonus? Y - based on #28629 (comment)

Upwork job is here.

@Christinadobrzyn
Contributor

Paid based on this payment structure. Closing this GH as complete


8 participants