[NoQA] Fix reauthentication #52727

zirgulis · 2024-11-18T21:43:51Z

Explanation of Change

Fix reauthentication (@neil-marcellini) Fix re-authentication when it fails to fetch #52228
Add the improved reauth when offline test (original PR [No QA] [HOLD ON PR #52228] Improve simulating online/offline conditions in reauthentication test #52165)
Fix network tests flakiness

We were seeing a problem where the user's Auth token would expire, which triggers an Authenticate request to re-authenticate the user. However, if the user is on a bad connection it's possible this request fails to fetch. In that case we were catching the error and signing the user out, which means they could lose data in queued write requests, and it's annoying to have to sign in again.

To fix this, we add a retry mechanism with exponential backoff, using the same throttling mechanism we already have for the SequentialQueue. The throttle is now a class so we can have separate instances. If re-auth is still failing after the maximum number of retries we'll log the user out, but that should be extremely rare. We verified that if the user is offline when re-authenticating, the retries are paused until they come back online, so "failed to fetch" errors are very unlikely to cause a logout now.
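As a rough illustration of the retry approach described above, here is a minimal TypeScript sketch of a throttle class with exponential backoff. It is not the App's actual RequestThrottle: the class name `BackoffThrottle`, the `retryWithBackoff` helper, the injected `sleep` function, and all default values are assumptions made for this example.

```typescript
// Hypothetical sketch: names and constants are illustrative, not the App's real RequestThrottle.
class BackoffThrottle {
    private readonly minWaitMs: number;

    private readonly maxWaitMs: number;

    private readonly maxRetries: number;

    private retryCount = 0;

    private waitMs: number;

    constructor(minWaitMs = 10, maxWaitMs = 10_000, maxRetries = 10) {
        this.minWaitMs = minWaitMs;
        this.maxWaitMs = maxWaitMs;
        this.maxRetries = maxRetries;
        this.waitMs = minWaitMs;
    }

    /** Reset after a success so the next failure starts from the minimum wait. */
    clear(): void {
        this.retryCount = 0;
        this.waitMs = this.minWaitMs;
    }

    /** Returns the delay before the next attempt, or null once retries are exhausted. */
    nextDelayMs(): number | null {
        if (this.retryCount >= this.maxRetries) {
            return null;
        }
        this.retryCount += 1;
        const delay = this.waitMs;
        // Double the wait for the following attempt, capped at the maximum.
        this.waitMs = Math.min(this.waitMs * 2, this.maxWaitMs);
        return delay;
    }
}

/** Retries `attempt` with backoff; rethrows the last error when the throttle gives up. */
async function retryWithBackoff<T>(
    attempt: () => Promise<T>,
    throttle: BackoffThrottle,
    sleep: (ms: number) => Promise<void>,
): Promise<T> {
    for (;;) {
        try {
            const result = await attempt();
            throttle.clear();
            return result;
        } catch (error) {
            const delay = throttle.nextDelayMs();
            if (delay === null) {
                // Out of retries: the caller decides what happens next (for example, signing out).
                throw error;
            }
            await sleep(delay);
        }
    }
}
```

Because the throttle is a class, the sequential queue and re-authentication can each hold their own instance, so retries in one do not use up the other's retry budget.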

Finally, we fixed the re-auth tests and added one for this exact flow, verifying that they fail on main and only pass with the fix. Previously, the tests were invalid and produced false positives.

Fixed Issues

$ #51707
PROPOSAL: N/A

Tests

Merge in this PR, which has changes to test the exact scenario: Manual test re-auth failing to fetch #52281
Log in with any validated account
Open the JS console
Send a message in any chat
Search the console logs for Ndebug failing to fetch in Authenticate and verify that it's logged about 10 times and that Authenticate requests are made more and more rarely
Verify you are not logged out.

Verify that no errors appear in the JS console

Offline tests

Merge in this PR, which has changes to test the exact scenario: Manual test re-auth failing to fetch #52281
Log in with any validated account
Open the JS console
Send a message in any chat
Go offline immediately after
Search the console logs for Ndebug failing to fetch in Authenticate and verify that it's not logged repeatedly, signaling that the retry is paused while offline
Verify you are not logged out
Go online
Search the console logs for Ndebug failing to fetch in Authenticate and verify that it's logged about 10 times and that Authenticate requests are made more and more rarely
Verify you are not logged out
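The offline steps above rely on retries being held back until the connection returns. One way such a pause could work is sketched below in TypeScript. This is an assumption for illustration only and not how the App necessarily implements it; `OnlineGate`, `setOnline`, and `runWhenOnline` are hypothetical names.

```typescript
// Hypothetical sketch of pausing work while offline; not the App's actual network code.
class OnlineGate {
    private online = true;

    private pending: Array<() => void> = [];

    /** Record the connection state and flush held callbacks when coming back online. */
    setOnline(isOnline: boolean): void {
        this.online = isOnline;
        if (!isOnline) {
            return;
        }
        const toRun = this.pending;
        this.pending = [];
        toRun.forEach((callback) => callback());
    }

    /** Run the callback immediately when online, otherwise hold it until the next reconnect. */
    runWhenOnline(callback: () => void): void {
        if (this.online) {
            callback();
            return;
        }
        this.pending.push(callback);
    }
}

// Usage: a retry scheduled while offline does not fire until the gate reopens.
const gate = new OnlineGate();
let authenticateAttempts = 0;
gate.setOnline(false);
gate.runWhenOnline(() => {
    authenticateAttempts += 1;
});
const attemptsWhileOffline = authenticateAttempts;
gate.setOnline(true);
const attemptsAfterReconnect = authenticateAttempts;
```

With a gate like this, no attempt is made while offline, so offline time does not count against the maximum number of retries.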

QA Steps

No QA, this is a very specific situation that is very hard to reproduce.

Verify that no errors appear in the JS console

PR Author Checklist

Screenshots/Videos

Tests were only run on web since changes are platform independent.

Android: Native

Android: mWeb Chrome

iOS: Native

iOS: mWeb Safari

MacOS: Chrome / Safari

Online

OnlineReauth2024-11-21_18-29-01.mp4

Offline

OfflineReAuth2024-11-21_18-33-46.mp4

MacOS: Desktop

zirgulis · 2024-11-18T21:56:02Z

I see ReportTest.ts is now failing for some reason, I will fix that tomorrow.

allgandalf

thanks for this quick work, please let me know when this one is ready for review

zirgulis · 2024-11-19T13:28:24Z

@allgandalf it seems the ReportTest.ts just timed out and with a typo fix commit it went away. Locally I was not able to make that test fail, all succeed.

allgandalf · 2024-11-19T13:30:07Z

wow, that's awesome, I also noticed that your branch is 1000+ commits behind main, it's better we merge main, can you do that please

zirgulis · 2024-11-19T13:34:26Z

@neil-marcellini Could you please share how to reproduce/test this reauth issue? I tried to reproduce this locally by having a bash script running which would periodically turn off and on my Mac's network. While doing that the local proxy server (npm run web) started to crash. Later I tried to do the same in staging env but without any luck, I didn't get logged out.

neil-marcellini

Good progress so far. Let's make sure this is rock solid before we ship it. I'll start doing some manual testing.

src/libs/Middleware/Reauthentication.ts

src/libs/Network/index.ts

tests/unit/NetworkTest.ts

neil-marcellini · 2024-11-19T15:46:17Z

tests/unit/NetworkTest.ts

Also let's please examine the other re-authentication test in this file, and make sure it's failing before any fixes from this PR. For example, I don't understand why we mock it to expect another call after failed re-authentication using the old authToken. It seems to me re-auth should fail to fetch, then get retried and succeed.

App/tests/unit/NetworkTest.ts

Lines 102 to 110 in c9cb455

    // Fail the call to re-authenticate
    .mockImplementationOnce(actualXhr)

    // The next call should still be using the old authToken
    .mockImplementationOnce(() =>
        Promise.resolve({
            jsonCode: CONST.JSON_CODE.NOT_AUTHENTICATED,
        }),
    )

It's also odd that we mock several responses but then only assert about two.

App/tests/unit/NetworkTest.ts

Lines 145 to 146 in c9cb455

    expect(callsToOpenPublicProfilePage.length).toBe(1);
    expect(callsToAuthenticate.length).toBe(1);

@neil-marcellini I didn't touch this test but this one is quite interesting. If we run this on main it will succeed but if we look at the logs it actually signs out the user, giving us a false positive.

Yes exactly. The test doesn't really make sense. That's why I modified it in my PR. It still wasn't working quite right for me, but maybe with your latest changes it will.

Let's update the test, verify it's passing with the fix, then cherry pick the test to main and make sure it's failing.

@neil-marcellini I can confirm that both added tests are failing on main:

neil-marcellini · 2024-11-19T15:51:45Z

@neil-marcellini Could you please share how to reproduce/test this reauth issue? I tried to reproduce this locally by having a bash script running which would periodically turn off and on my Mac's network. While doing that the local proxy server (npm run web) started to crash. Later I tried to do the same in staging env but without any luck, I didn't get logged out.

Sure, yes it's a bit tricky. I explained in this comment how I was able to manually reproduce the issue. I used a backend change to mark the auth token as expired when I add a comment, but you might be able to modify the app's network logic to mock that on the frontend. I modified the app to fail to fetch once when re-authenticating.

I'll see if I can get that working and write manual test steps into this PR description.

neil-marcellini · 2024-11-19T17:02:32Z

You can merge this PR Manual test re-auth failing to fetch into this one with the fix locally, and verify if it's working. That manual test PR currently fails on main.

neil-marcellini

Your latest changes look good. Still a few more things to update as mentioned in Slack.

neil-marcellini

Looks really great, thanks! I'll finish manually testing and then we should be good to go. Lmk if you want to address the non-blocking comments now, or in a follow up PR.

src/libs/Network/index.ts

src/libs/RequestThrottle.ts

src/libs/Network/SequentialQueue.ts

src/libs/Middleware/Reauthentication.ts

zirgulis · 2024-11-21T16:56:27Z

Looks really great, thanks! I'll finish manually testing and then we should be good to go. Lmk if you want to address the non-blocking comments now, or in a follow up PR.

Yes will do that in this PR

melvin-bot · 2024-11-21T17:05:08Z

@ Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button]

neil-marcellini · 2024-11-21T17:10:57Z

One more thing, maybe we should have a special max retry count for re-authentication? We reach the 10 retry limit pretty quickly, and since it's quite important that we don't sign people out, I think it would be good if the re-auth max throttle retry time was about a couple minutes.

Also, it would be smart to manually test that we don't send reauth retries while offline, only once the app is back online; otherwise it's likely they will keep failing to fetch and hit the max retry count.
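To make the retry-limit question above concrete, the following TypeScript sketch adds up the waits of a doubling backoff and counts how many retries it takes to cover a target duration. The function names and the numbers (10 ms minimum wait, 10 s cap, no jitter) are hypothetical and are not taken from the App's constants.

```typescript
// Hypothetical arithmetic sketch: the values used here are assumptions, not the App's settings.

/** Delays for a doubling backoff that starts at minWaitMs and is capped at maxWaitMs. */
function backoffDelays(minWaitMs: number, maxWaitMs: number, retries: number): number[] {
    const delays: number[] = [];
    let wait = minWaitMs;
    for (let i = 0; i < retries; i += 1) {
        delays.push(wait);
        wait = Math.min(wait * 2, maxWaitMs);
    }
    return delays;
}

/** Total time spent waiting across the given number of retries. */
function totalBackoffMs(minWaitMs: number, maxWaitMs: number, retries: number): number {
    return backoffDelays(minWaitMs, maxWaitMs, retries).reduce((sum, delay) => sum + delay, 0);
}

/** Smallest number of retries whose combined wait is at least targetMs. */
function retriesToCover(minWaitMs: number, maxWaitMs: number, targetMs: number): number {
    if (minWaitMs <= 0 || maxWaitMs <= 0) {
        throw new Error('waits must be positive');
    }
    let total = 0;
    let wait = minWaitMs;
    let retries = 0;
    while (total < targetMs) {
        total += wait;
        wait = Math.min(wait * 2, maxWaitMs);
        retries += 1;
    }
    return retries;
}

const tenRetriesMs = totalBackoffMs(10, 10_000, 10);
const retriesForTwoMinutes = retriesToCover(10, 10_000, 120_000);
```

With these hypothetical values, 10 retries add up to 10,230 ms of waiting, while covering two minutes would take 21 retries. Different minimum and maximum waits change the picture considerably, which is why a separate limit for re-authentication may be worth considering.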

tgolen

I agree with all of Neil's NAB comments, and I actually don't think they are NABs. I would like to have them be done. If not in this PR, then in a separate PR that comes right after this.

I also think we should have the separate retry limit as Neil suggests.

src/libs/Middleware/Reauthentication.ts

src/libs/Network/index.ts

zirgulis · 2024-11-21T19:36:43Z

I also think we should have the separate retry limit as Neil suggests.

@tgolen I think this is out of scope for this PR, but I'm happy to tackle this in the next PR

allgandalf · 2024-11-21T20:08:32Z

@zirgulis when i merged https://github.com/Expensify/App/pull/52281/files and tested it as mentioned in testing steps, I do get logged out:

Screen.Recording.2024-11-22.at.1.36.17.AM.mov

I even got logged out when i sent a message:

Screen.Recording.2024-11-22.at.1.38.06.AM.mov

tgolen

Thank you!

neil-marcellini

Thanks for your latest changes, it looks really solid to me now. 10 retries ends up with a pretty long delay, and I also tested it while offline and verified that the retries are paused which is great because I think it would be pretty unlikely for this to actually fail now. I updated the PR description for you and I think this is good to go.

@allgandalf it looks like maybe you didn't pull the latest changes from remote before testing, because the log lines are different now.

allgandalf · 2024-11-22T13:21:55Z

@allgandalf it looks like maybe you didn't pull the latest changes from remote before testing, because the log lines are different now.

Trying again now

allgandalf · 2024-11-22T13:35:05Z

Reviewer Checklist

I have verified the author checklist is complete (all boxes are checked off).
I verified the correct issue is linked in the ### Fixed Issues section above
I verified testing steps are clear and they cover the changes made in this PR
- I verified the steps for local testing are in the Tests section
- I verified the steps for Staging and/or Production testing are in the QA steps section
- I verified the steps cover any possible failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
- I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
I checked that screenshots or videos are included for tests on all platforms
I included screenshots or videos for tests on all platforms
I verified tests pass on all platforms & I tested again on:
- Android: Native
- Android: mWeb Chrome
- iOS: Native
- iOS: mWeb Safari
- MacOS: Chrome / Safari
- MacOS: Desktop
If there are any errors in the console that are unrelated to this PR, I either fixed them (preferred) or linked to where I reported them in Slack
I verified proper code patterns were followed (see Reviewing the code)
- I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick).
- I verified that comments were added to code that is not self explanatory
- I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
- I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
- I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
- I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
- I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
- I verified the JSDocs style guidelines (in STYLE.md) were followed
If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
I verified that this PR follows the guidelines as stated in the Review Guidelines
I verified other components that can be impacted by these changes have been tested, and I retested again (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar have been tested & I retested again)
I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
I verified any variables that can be defined as constants (ie. in CONST.js or at the top of the file that uses the constant) are defined as such
If a new component is created I verified that:
- A similar component doesn't exist in the codebase
- All props are defined accurately and each prop has a /** comment above it */
- The file is named correctly
- The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
- The only data being stored in the state is data necessary for rendering and nothing else
- For Class Components, any internal methods passed to components event handlers are bound to this properly so there are no scoping issues (i.e. for onClick={this.submit} the method this.submit should be bound to this in the constructor)
- Any internal methods bound to this are necessary to be bound (i.e. avoid this.submit = this.submit.bind(this); if this.submit is never passed to a component event handler like onClick)
- All JSX used for rendering exists in the render method
- The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
If any new file was added I verified that:
- The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
If a new CSS style is added I verified that:
- A similar style doesn't already exist
- The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
- I verified that all the inputs inside a form are aligned with each other.
- I added Design label and/or tagged @Expensify/design so the design team can review the changes.
If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.
I have checked off every checkbox in the PR reviewer checklist, including those that don't apply to this PR.

Screenshots/Videos

MacOS: Chrome / Safari

Online:

Screen.Recording.2024-11-22.at.7.03.04.PM.mov

Offline:

Screen.Recording.2024-11-22.at.7.23.25.PM.mov

MacOS: Desktop

Online:

Screen.Recording.2024-11-22.at.7.26.38.PM.mov

Offline:

Screen.Recording.2024-11-22.at.7.38.52.PM.mov

Android: Native

Screen.Recording.2024-11-22.at.7.43.50.PM.mov

Android: mWeb Chrome

Screen.Recording.2024-11-22.at.7.47.48.PM.mov

iOS: Native

Screen.Recording.2024-11-22.at.7.50.39.PM.mov

iOS: mWeb Safari

Screen.Recording.2024-11-22.at.7.52.00.PM.mov

allgandalf

Tests well in both online and offline mode:

Verified that the re-authentication retry count stops at 10 and that the retries slow down until it reaches 10.
Verified that in offline mode, there is no call made, which means that the calls are paused until the user is online
Verified that we are not logged out in both the cases

neil-marcellini

Good to go! Thanks guys

OSBotify · 2024-11-22T14:59:22Z

✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release.

github-actions · 2024-11-22T20:32:00Z

🚀 Deployed to staging by https://github.com/neil-marcellini in version: 9.0.66-0 🚀

platform	result
🤖 android 🤖	success ✅
🖥 desktop 🖥	success ✅
🍎 iOS 🍎	success ✅
🕸 web 🕸	success ✅
🤖🔄 android HybridApp 🤖🔄	success ✅
🍎🔄 iOS HybridApp 🍎🔄	success ✅

github-actions · 2024-11-26T16:13:25Z

🚀 Deployed to production by https://github.com/mountiny in version: 9.0.66-8 🚀

platform	result
🤖 android 🤖	success ✅
🖥 desktop 🖥	success ✅
🍎 iOS 🍎	success ✅
🕸 web 🕸	success ✅
🤖🔄 android HybridApp 🤖🔄	failure ❌
🍎🔄 iOS HybridApp 🍎🔄	failure ❌

neil-marcellini and others added 10 commits November 7, 2024 15:10

Fix re-authentication test that should be failing

51db850

WIP retry authentication with throttle

f275b47

Fix types

7ffdbbe

Try advancing timers past throttle

f5da095

Comment out shit causing a circular dependency

225ad10

revert reauth test

876ff5a

revert RequestThrottle to be function based

80d634b

fix flaky network tests

ff16617

add improved reauth while offline test

c9cb455

cleanup code

db95071

zirgulis mentioned this pull request Nov 18, 2024

[HOLD for payment 2024-11-14] HOLD ON PR #52228 [$250] ND app hangs at the splash screen when on a bad connection and opening after a while #51185

Open

8 tasks

allgandalf reviewed Nov 19, 2024

fix typo

2a656a8

neil-marcellini requested changes Nov 19, 2024

zirgulis added 3 commits November 20, 2024 11:58

remove redundant code

d5ace1f

Revert "revert RequestThrottle to be function based"

c17d32e

This reverts commit 80d634b.

use sequentialQueueRequestThrottle in APITest

008ec56

neil-marcellini requested changes Nov 20, 2024

neil-marcellini mentioned this pull request Nov 20, 2024

[$250] [HOLD for payment 2024-12-03] Got logged out of the app in web staging #51707

Closed

8 tasks

zirgulis added 2 commits November 21, 2024 17:45

fix reauth online test, fix RequestThrottle.clear

146fc09

fix race condition in reauth offline test

dfb863c

zirgulis requested a review from neil-marcellini November 21, 2024 15:51

neil-marcellini previously approved these changes Nov 21, 2024

melvin-bot bot removed the request for review from a team November 21, 2024 17:05

neil-marcellini requested a review from tgolen November 21, 2024 17:12

tgolen requested changes Nov 21, 2024

src/libs/Middleware/Reauthentication.ts (outdated)

src/libs/Network/index.ts (outdated)

zirgulis added 4 commits November 21, 2024 21:00

add comments to cleanup functions, fix interval type

1196f62

add name param to RequestThrottle

9ce4753

remove reauth onyx key

bf399e6

improve comments, remove unused import

463fbe3

zirgulis dismissed neil-marcellini’s stale review via 463fbe3 November 21, 2024 19:32

zirgulis requested review from tgolen and neil-marcellini November 21, 2024 19:40

Merge branch 'Expensify:main' into fix-reauthentication

96f986d

tgolen approved these changes Nov 21, 2024

neil-marcellini approved these changes Nov 21, 2024

allgandalf approved these changes Nov 22, 2024

melvin-bot bot requested a review from neil-marcellini November 22, 2024 14:24

neil-marcellini approved these changes Nov 22, 2024

neil-marcellini merged commit 8d69d60 into Expensify:main Nov 22, 2024
22 of 24 checks passed

github-actions bot mentioned this pull request Nov 22, 2024

Deploy Checklist: New Expensify 2024-11-22 #52978

Closed

75 tasks

neil-marcellini mentioned this pull request Nov 22, 2024

Fix re-authentication when it fails to fetch #52228

Merged

49 tasks

zirgulis mentioned this pull request Nov 25, 2024

[No QA] [HOLD ON PR #52228] Improve simulating online/offline conditions in reauthentication test #52165

Closed

49 tasks

muttmuure mentioned this pull request Nov 25, 2024

[Hold #51707] Logged out on HybridApp after not using the app for ~12 hours #50358

Closed

6 tasks
