[$250] Chat - Group is created with duplicate user when chat is started quickly on a slow network #49931

lanitochka17 · 2024-09-30T13:39:50Z

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!

Version Number: 9.0.41-2
Reproducible in staging?: Y
Reproducible in production?: Y
If this was caught during regression testing, add the test name, ID and link from TestRail: N/A
Email or phone of affected tester (no customers): biruknew45+1195@gmail.com
Issue reported by: Applause - Internal Team

Action Performed:

Go to https://staging.new.expensify.com/.
Open the Developer Tools and go to the Network tab, then set the network speed to 3G
Click on the FAB > Start Chat
Enter a user's email, then quickly click on it before the fallback avatar changes to the correct user avatar
Send a message

Expected Result:

The chat with the selected user should remain a 1-on-1 chat. The chat should not change into a group with duplicate users

Actual Result:

After sending the message, the chat turns into a group with duplicate users. It only corrects itself back to a single chat after switching to another chat and then returning to it

Workaround:

Unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

Screenshots/Videos

Add any screenshot/video evidence

Bug6619285_1727614560806.1.mp4

View all open jobs on GitHub

Upwork Automation - Do Not Edit

Upwork Job URL: https://www.upwork.com/jobs/~021843049310841478139
Upwork Job ID: 1843049310841478139
Last Price Increase: 2024-10-13
Automatic offers:

DylanDylann | Contributor | 104420901
nkdengineer | Contributor | 104651259

Issue Owner

Current Issue Owner: @RachCHopkins

melvin-bot · 2024-09-30T13:40:00Z

Triggered auto assignment to @RachCHopkins (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

RachCHopkins · 2024-10-01T01:30:16Z

I don't understand what's happening in that video. I can't replicate this.

I'm on 2 bars of 4G.

Is the person reporting it saying that the two users other than themselves are, in fact, the same user?

RachCHopkins · 2024-10-01T01:30:33Z

Please clarify the "actual result"

RachCHopkins · 2024-10-04T05:45:46Z

Waiting on a response/clarification from QA team.

mvtglobally · 2024-10-04T13:33:22Z

team is still able to reproduce

group.chat.mp4

RachCHopkins · 2024-10-06T22:02:30Z

Ok, to sum up, it can be reproduced and we don't know what/who the third user is.

melvin-bot · 2024-10-06T22:02:54Z

Job added to Upwork: https://www.upwork.com/jobs/~021843049310841478139

melvin-bot · 2024-10-06T22:02:59Z

Triggered auto assignment to Contributor-plus team member for initial proposal review - @parasharrajat (External)

nkdengineer · 2024-10-07T05:06:20Z

Edited by proposal-police: This proposal was edited at 2024-10-07T12:12:00Z.

Proposal

Please re-state the problem that we are trying to solve in this issue.

After sending the message, the chat turns into a group with duplicate users. It only corrects itself back to a single chat after switching to another chat and then returning to it

What is the root cause of that problem?

When OpenReport API is called, the success data is added to queuedOnyxUpdates and waited for all write requests to be complete before merging it to Onyx

App/src/libs/actions/QueuedOnyxUpdates.ts

Lines 11 to 15 in 04214cd

    
           function queueOnyxUpdates(updates: OnyxUpdate[]): Promise<void> { 
        
               queuedOnyxUpdates = queuedOnyxUpdates.concat(updates); 
        
               return Promise.resolve(); 
        
           }

We've had the logic to flush the queue after all requests are complete here

App/src/libs/Network/SequentialQueue.ts

Lines 172 to 174 in 04214cd

    
           // The queue can be paused when we sync the data with backend so we should only update the Onyx data when the queue is empty 
        
           if (PersistedRequests.getAll().length === 0) { 
        
               flushOnyxUpdatesQueue();

But in this case after AddComment API is called, the queue is paused

App/src/libs/actions/OnyxUpdates.ts

Lines 147 to 153 in 04214cd

    
           function saveUpdateInformation(updateParams: OnyxUpdatesFromServer) { 
        
               // If we got here, that means we are missing some updates on our local storage. To 
        
               // guarantee that we're not fetching more updates before our local data is up to date, 
        
               // let's stop the sequential queue from running until we're done catching up. 
        
               SequentialQueue.pause(); 
        
               // Always use set() here so that the updateParams are never merged and always unique to the request that came in

After GetMissingOnyxMessages API, the queue is active again but the queue is empty then the queuedOnyxUpdates isn't flush which causes the participant of the DM to have both optimistic and current participants of a user.

What changes do you think we should make in order to solve the problem?

In unpause function, we should flush the queuedOnyxUpdates if there is no request . We should update after this line.

I notice that we should only flush the queuedOnyxUpdates if only numberOfPersistedRequests is 0 because if we merge these data to Onyx when there are some pending requests in the queue that can cause the race condition since other requests can update the data for the same key of the previous request.

if (numberOfPersistedRequests === 0) {
    flushOnyxUpdatesQueue();
}

App/src/libs/Network/SequentialQueue.ts

Line 192 in 04214cd

isQueuePaused = false;

For the reason why GetMissingOnyxMessages API is called I think maybe it's expected from BE. But after all, the frontend problem can happen in other cases and we need to fix this and it also fixes this issue.

What alternative solutions did you explore? (Optional)

c3024 · 2024-10-07T16:06:57Z

Proposal

Please re-state the problem that we are trying to solve in this issue.

On slow networks, if we create a report and comment on it, a group with a duplicate chat is created.

What is the root cause of that problem?

For write requests, we do not write the response onyxData and the corresponding successData/failureData/finallyData to Onyx directly. Instead, we write them to queuedOnyxUpdates to prevent the replay effect.

App/src/libs/actions/OnyxUpdates.ts

Lines 30 to 32 in 78e484f

    
           // For most requests we can immediately update Onyx. For write requests we queue the updates and apply them after the sequential queue has flushed to prevent a replay effect in 
        
           // the UI. See https://github.com/Expensify/App/issues/12775 for more info. 
        
           const updateHandler: (updates: OnyxUpdate[]) => Promise<unknown> = request?.data?.apiRequestType === CONST.API_REQUEST_TYPE.WRITE ? QueuedOnyxUpdates.queueOnyxUpdates : Onyx.update;

These updates are flushed only after all the PersistedRequests are cleared.

App/src/libs/Network/SequentialQueue.ts

Lines 173 to 175 in 78e484f

    
           if (PersistedRequests.getAll().length === 0) { 
        
               flushOnyxUpdatesQueue(); 
        
           }

We remove a request from persistent requests only after the response is received and processed with all middlewares.

App/src/libs/Network/SequentialQueue.ts

Lines 90 to 99 in 78e484f

    
           currentRequestPromise = Request.processWithMiddleware(requestToProcess, true) 
        
               .then((response) => { 
        
                   // A response might indicate that the queue should be paused. This happens when a gap in onyx updates is detected between the client and the server and 
        
                   // that gap needs resolved before the queue can continue. 
        
                   if (response?.shouldPauseQueue) { 
        
                       Log.info("[SequentialQueue] Handled 'shouldPauseQueue' in response. Pausing the queue."); 
        
                       pause(); 
        
                   } 
        
                   PersistedRequests.remove(requestToProcess);

So, on slow networks, when we call OpenReport, its response and corresponding request’s successData, etc., are added to queuedOnyxUpdates. This cannot be flushed immediately because there was already an AddComment request added to the PersistedRequests.
For some reason that needs to be fixed, the backend sends a previousUpdateID for AddComment that is larger than the previous lastUpdateID/lastClientUpdateID. This makes doesClientNeedToBeUpdated true.

App/src/libs/Middleware/SaveResponseInOnyx.ts

Line 34 in 78e484f

    
           if (requestsToIgnoreLastUpdateID.includes(request.command) || !OnyxUpdates.doesClientNeedToBeUpdated(Number(response?.previousUpdateID ?? 0))) {

As a result, the AddComment response is not processed directly with OnyxUpdates.apply. It sets this AddComment response to saveUpdateInformation, and the queue is paused. When the queue is paused, the updates waiting in queuedUpdates are not flushed, so OpenReport’s successData waits in the queuedUpdates, and the optimistic user detail is not removed, making the report appear like a group report.

What changes do you think we should make in order to solve the problem?

First, it needs to be checked why the previousUpdateID for the AddComment response is sent as a higher value than the OpenReport call’s lastUpdatedID/lastClientUpdateID. Fixing this issue should resolve the problem.
However, even with an accidental call to GetMissingOnyxMessages, the previous updates should not be left in limbo without being updated to Onyx until a new write request is made. The updates in the queue need to be written to Onyx after all the missing updates are obtained with GetMissingOnyxMessages. Currently, this is not handled anywhere. When the next write request gets completed and the persistent requests array is empty, all queuedOnyxMessages get updated.
We can see this in the present bug as well: if we send another message, everything updates correctly (the backend also does not send an incorrect previousUpdateID) because the queue is unpaused. After processing this request, when the persisted requests are empty, the queuedOnyxUpdates (which include the previous OpenReport and AddComment responses’ onyxData, successData, etc.) are flushed.
To achieve this, after fetching the missing messages, we should flush the queue finally by changing this:

App/src/libs/actions/OnyxUpdateManager/index.ts

Line 144 in 78e484f

DeferredOnyxUpdates.getMissingOnyxUpdatesQueryPromise()?.finally(finalizeUpdatesAndResumeQueue);

to:

DeferredOnyxUpdates.getMissingOnyxUpdatesQueryPromise()?.finally(finalizeUpdatesAndResumeQueue).then(() => QueuedOnyxUpdates.flushQueue());

What alternative solutions did you explore? (Optional)

melvin-bot · 2024-10-10T18:11:28Z

@parasharrajat, @RachCHopkins Whoops! This issue is 2 days overdue. Let's get this updated quick!

RachCHopkins · 2024-10-11T00:46:05Z

@parasharrajat do you like any of the proposals here?

melvin-bot · 2024-10-13T16:00:04Z

📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸

parasharrajat · 2024-10-14T15:32:59Z

@RachCHopkins I won't be available from 16 Oct for a few days, please reassign.

DylanDylann · 2024-10-14T15:41:24Z

I can take over this issue as C+

melvin-bot · 2024-10-14T18:02:23Z

@parasharrajat @RachCHopkins this issue was created 2 weeks ago. Are we close to approving a proposal? If not, what's blocking us from getting this issue assigned? Don't hesitate to create a thread in #expensify-open-source to align faster in real time. Thanks!

aldo-expensify · 2024-11-25T18:49:26Z

The latest conversation is here: https://github.com/Expensify/App/pull/51712/files#r1842905572

The backend work is still not done, it ended up being more work to fix than expected. I expect that the PR should be ready for review this week. Considering the number of cases we have to update, I'm planning to fix the cases that are known to be causing problems first (i.e. OpenReport), and then keep fixing the others with less priority to avoid making a huge PR.

aldo-expensify · 2024-11-30T01:54:48Z

I'm making the backend PR ready for review: https://github.com/Expensify/Auth/pull/13168

DylanDylann · 2024-12-09T07:43:57Z

@aldo-expensify @RachCHopkins This issue has already been fixed

Screen.Recording.2024-12-09.at.14.42.48.mov

cc @nkdengineer

DylanDylann · 2024-12-09T07:45:17Z

Other issues that have the same RCA have also been fixed

#51801 (comment)
#51045 (comment)

DylanDylann · 2024-12-09T07:46:43Z

@nkdengineer This PR seems unnecessary. I think we can close it

tgolen · 2024-12-09T15:50:58Z

I closed the PR this morning. I think this issue can be resolved, but @nkdengineer did a good amount of work on this, so I think the bounty should be reduced by 50% to $125 and still pay out for this @RachCHopkins

RachCHopkins · 2024-12-09T20:24:07Z

Sounds good to me @tgolen !

RachCHopkins · 2024-12-18T04:24:10Z

Payment Summary:

Contributor: @nkdengineer paid $125 via Upwork
Contributor+: @DylanDylann paid $125 via Upwork

Upwork job here

RachCHopkins · 2024-12-18T04:26:59Z

Contributor has been paid, the contract has been completed, and the Upwork post has been closed.

DylanDylann · 2024-12-26T04:16:23Z

@RachCHopkins As from our guideline, the contributor should get full compensation in this case. Could you please evaluate it again?

aldo-expensify · 2025-01-06T12:59:58Z

Reopening so this gets an answer: #49931 (comment)

RachCHopkins · 2025-01-07T03:20:17Z

Sorry @aldo-expensify @DylanDylann @tgolen I'm not understanding who should be paid what here.

@tgolen notes above that it should be a 50% payment because some work was done, but never implemented.

@DylanDylann has linked to a post that says that when someone has a solution which is implemented but later reverted and then not used moving forward due to internal Expensify BE work not being complete, they should get a full payment.

I see in this case that @nkdengineer's proposal was accepted, but not implemented. Which is different to the case discussed above (but please correct me if I'm wrong), and this whole issue was deemed unnecessary because everything was fixed elsewhere.

DylanDylann · 2025-01-07T03:41:03Z

@RachCHopkins From https://expensify.slack.com/archives/C02NK2DQWUX/p1723055151068559?thread_ts=1723015261.405329&cid=C02NK2DQWUX, @mallenexpensify said that:

If the deliverable has changed because the bug doesn't exist anymore, another PR has fixed it, or we've decided it's no longer a priority to fix, pay the contributor and C+ 100% of the job price (they were hired to do a job and did the work)

In this issue, contributors are already hired and the PR is also ready (almost work has already been done). But the deliverable has changed because we decided to fix it on the BE. So I think the contributors should get fully paid

Anyway, I will only note that in case we overlook the BZ SO. If the down price is intentional, let's close this one.

RachCHopkins · 2025-01-07T03:47:30Z

I'm just the girl with the checkbook here, so will wait on @tgolen and @aldo-expensify

tgolen · 2025-01-07T15:58:04Z

OK, I didn't know about that SO. The policy is clear in the SO, so let's just follow it and pay out 100%.

mallenexpensify · 2025-01-08T02:37:40Z

@DylanDylann thanks for the link, tag and quote text. That might be a lil outdated. Or, the below, in the main payment SO, might be more updated.

If a contributor has been hired for a job and we decide to close the job before it is successfully completed
50% payment to contributor and C+ if the contributor was 🎀👀🎀 by a C+ or hired/assigned.

100% due if contributor drafted a PR and requested a review.

100% due to C+ if they reviewed the PR.

One caveat here might be if the hired contributor wasn't working on their PR with urgency.

Reasoning.. I don't like the idea that someone could be hired, do no work on the PR then get full payment because something happened between the time they were hired and they submitted their PR, we don't want to encourage people to move slowly with a possible payout. (that def doesn't seem to be what happened here, just providing an example for the process).

Based on the above process, it appears 100% compensation is due based on @nkdengineer 's PR.

DylanDylann · 2025-01-08T02:54:50Z

This is a struggle with many challenges while working on it. In this issue, I also worked on reviewing the PR when it was created and discovered another problem that started a long discussion and end up with the fix on the BE side. After the fix is deployed I also help to verify all related issues are fixed

I raised this concern because I don't know the exact process in this case. @mallenexpensify Thanks for the official information.

mallenexpensify · 2025-01-08T03:26:40Z

@DylanDylann please always reference past posts and details. We appreciate the help and the all the invested time. My goal is to keep processes up to date, to make them fair and to discuss with contributors if they don't agree.

RachCHopkins · 2025-01-08T04:08:07Z

Ok, so I paid out the Contributor and C+ $125 each here. So it looks like I need to go back and do some remedial work as far as payments go. I shall work out the best way to do so!

RachCHopkins · 2025-01-10T01:16:04Z

Sorry for all the confusion @DylanDylann and @nkdengineer - I have done a bonus payment for each of you on the original contracts, to shore up the balance!

lanitochka17 added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Sep 30, 2024

melvin-bot bot assigned RachCHopkins Sep 30, 2024

melvin-bot bot added the Overdue label Oct 3, 2024

melvin-bot bot removed the Overdue label Oct 4, 2024

RachCHopkins added the External Added to denote the issue can be worked on by a contributor label Oct 6, 2024

melvin-bot bot changed the title ~~Chat - Group is created with duplicate user when chat is started quickly on a slow network~~ [$250] Chat - Group is created with duplicate user when chat is started quickly on a slow network Oct 6, 2024

melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Oct 6, 2024

RachCHopkins added this to [#whatsnext] #quality Oct 6, 2024

melvin-bot bot assigned parasharrajat Oct 6, 2024

melvin-bot bot added the Overdue label Oct 10, 2024

melvin-bot bot removed the Overdue label Oct 14, 2024

RachCHopkins assigned DylanDylann Oct 15, 2024

melvin-bot bot removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Oct 15, 2024

c3024 mentioned this issue Dec 9, 2024

[$250] Chat - RBR displayed on the wrong expense preview component for a few seconds #51989

Closed

8 tasks

RachCHopkins closed this as completed Dec 18, 2024

github-project-automation bot moved this from Bugs and Follow Up Issues to Done in [#whatsnext] #expense Dec 18, 2024

aldo-expensify reopened this Jan 6, 2025

aldo-expensify added Daily KSv2 and removed Monthly KSv2 Reviewing Has a PR in review labels Jan 6, 2025

RachCHopkins closed this as completed Jan 10, 2025

DylanDylann mentioned this issue Jan 10, 2025

[$250] IOU - GBR is missing from the partially approved expense #52650

Closed

8 tasks

DylanDylann mentioned this issue Jan 27, 2025

[$250] Expense-App shows not here page instead of auto refresh report when exchanging expense offline #54512

Open

8 tasks

[$250] Chat - Group is created with duplicate user when chat is started quickly on a slow network #49931

[$250] Chat - Group is created with duplicate user when chat is started quickly on a slow network #49931

Comments

lanitochka17 commented Sep 30, 2024 • edited by aldo-expensify Loading

Action Performed:

Expected Result:

Actual Result:

Workaround:

Platforms:

Screenshots/Videos

melvin-bot bot commented Sep 30, 2024

RachCHopkins commented Oct 1, 2024

RachCHopkins commented Oct 1, 2024

RachCHopkins commented Oct 4, 2024

mvtglobally commented Oct 4, 2024

RachCHopkins commented Oct 6, 2024

melvin-bot bot commented Oct 6, 2024

melvin-bot bot commented Oct 6, 2024

nkdengineer commented Oct 7, 2024 • edited Loading

Proposal

Please re-state the problem that we are trying to solve in this issue.

What is the root cause of that problem?

What changes do you think we should make in order to solve the problem?

What alternative solutions did you explore? (Optional)

c3024 commented Oct 7, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

What is the root cause of that problem?

What changes do you think we should make in order to solve the problem?

What alternative solutions did you explore? (Optional)

melvin-bot bot commented Oct 10, 2024

RachCHopkins commented Oct 11, 2024

melvin-bot bot commented Oct 13, 2024

parasharrajat commented Oct 14, 2024

DylanDylann commented Oct 14, 2024

melvin-bot bot commented Oct 14, 2024

aldo-expensify commented Nov 25, 2024

aldo-expensify commented Nov 30, 2024

DylanDylann commented Dec 9, 2024

DylanDylann commented Dec 9, 2024

DylanDylann commented Dec 9, 2024

tgolen commented Dec 9, 2024

RachCHopkins commented Dec 9, 2024

RachCHopkins commented Dec 18, 2024

RachCHopkins commented Dec 18, 2024

DylanDylann commented Dec 26, 2024

aldo-expensify commented Jan 6, 2025

RachCHopkins commented Jan 7, 2025

DylanDylann commented Jan 7, 2025

RachCHopkins commented Jan 7, 2025

tgolen commented Jan 7, 2025

mallenexpensify commented Jan 8, 2025

DylanDylann commented Jan 8, 2025

mallenexpensify commented Jan 8, 2025

RachCHopkins commented Jan 8, 2025

RachCHopkins commented Jan 10, 2025

lanitochka17 commented Sep 30, 2024 •

edited by aldo-expensify

Loading

nkdengineer commented Oct 7, 2024 •

edited

Loading