
[$500] [HOLD for payment 2023-12-20] The whole queue of offline requests are processed when you come back online, delaying online requests #28172

Closed
muttmuure opened this issue Sep 25, 2023 · 33 comments
Assignees
Labels
  • Awaiting Payment: Auto-added when associated PR is deployed to production
  • Daily KSv2
  • External: Added to denote the issue can be worked on by a contributor
  • Help Wanted: Apply this label when an issue is open to proposals by contributors
  • NewFeature: Something to build that is a new item.

Comments

@muttmuure
Contributor

muttmuure commented Sep 25, 2023

https://expensify.slack.com/archives/C05LX9D6E07/p1695105894373289

Problem

When we're offline, we'll build up a ton of requests if we keep navigating. OpenReport is probably the biggest offender. That's bad because we'll process the whole queue once we're back online, so requests that I actually need to execute while online have to wait until the whole queue is processed.

Let's say you have navigated to 5 reports while offline.

When you go online:

We'll start processing the queue.
OpenReport usually takes a couple of seconds per request, so we'll wait roughly 10 seconds for all of the requests to finish.
Because we also queue the onyxUpdates for write requests, we'll need the write operations for all of them to finish before we actually see anything in the report we just navigated to.

Solution

Only call OpenReport exactly once per report.

Use a report metadata key to store whether we have ever called OpenReport for a specific report.
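
A minimal sketch of how that could work, assuming a hypothetical reportMetadata_ Onyx collection key and a hasOnceCalledOpenReport flag (both names are illustrative, not the actual implementation):

```ts
// Hypothetical sketch: skip queueing OpenReport if we've already called it for this report.
import Onyx from 'react-native-onyx';

// Illustrative collection key; the real app defines its keys centrally in ONYXKEYS.
const REPORT_METADATA_PREFIX = 'reportMetadata_';

let reportMetadata: Record<string, {hasOnceCalledOpenReport?: boolean}> = {};
Onyx.connect({
    key: REPORT_METADATA_PREFIX,
    waitForCollectionCallback: true,
    callback: (value) => (reportMetadata = value ?? {}),
});

function openReportIfNeeded(reportID: string) {
    if (reportMetadata[`${REPORT_METADATA_PREFIX}${reportID}`]?.hasOnceCalledOpenReport) {
        // We've already fetched this report once; don't queue another OpenReport.
        return;
    }

    // Record that OpenReport has been called so repeat visits while offline don't re-queue it.
    Onyx.merge(`${REPORT_METADATA_PREFIX}${reportID}`, {hasOnceCalledOpenReport: true});
    // API.write('OpenReport', {reportID}, ...); // actual call elided
}
```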

Current Issue Owner: @sakluger
Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01697b67b9aa3280da
  • Upwork Job ID: 1742376039165644800
  • Last Price Increase: 2024-01-03
@melvin-bot melvin-bot bot added the Monthly KSv2 label Sep 29, 2023
@muttmuure muttmuure changed the title [HOLD on #reliable-updates] The whole queue of offline requests are processed when you come back online, delaying online requests The whole queue of offline requests are processed when you come back online, delaying online requests Oct 5, 2023
@roryabraham
Contributor

roryabraham commented Oct 19, 2023

One idea for a solution to help with the scenario where there are many essentially duplicate requests in the queue would be to add an idempotencyKey to the request object. In the case of OpenReport, I think this could be some combination of the command name and reportID: OpenReport?reportID=1234. Then whenever we're adding a request to the sequentialQueue, we look for any requests with a matching idempotencyKey and only keep the newest one, replacing any older duplicate requests in the queue with the newest one.

Using this technique we would de-dupe requests in the queue to help reduce the size of the queue when we come back online

[Before / After screenshots of the request queue]

Edit: You wouldn't just keep the newest instance of an idempotent request; I think you'd want to replace the previous instance with the newest one, so that the order of operations is preserved.
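
A rough sketch of that de-dupe-and-replace behavior (the Request shape and queue here are simplified stand-ins for the app's real persisted request queue):

```ts
type Request = {
    command: string;
    data: Record<string, unknown>;
    idempotencyKey?: string; // e.g. 'OpenReport?reportID=1234'
};

const queue: Request[] = [];

function push(request: Request) {
    if (request.idempotencyKey) {
        const existingIndex = queue.findIndex((queued) => queued.idempotencyKey === request.idempotencyKey);
        if (existingIndex !== -1) {
            // Replace the older duplicate in place so the original ordering is preserved.
            queue[existingIndex] = request;
            return;
        }
    }
    queue.push(request);
}
```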

@roryabraham
Contributor

roryabraham commented Oct 19, 2023

If we wanted to take it a step further, I have another idea building off an idea I've proposed before in a different context...

Solution:

Replace the sequential queue with a directed acyclic graph (DAG) of requests, which we can call the RequestGraph.

  • Each request is an object with an ID and an arbitrary set of data. Each request object represents a node in the graph.
  • Each request can have zero or more optional parentRequests. The presence of a parentRequest represents a directed edge in the graph.
  • API.write will add a node (and potentially an edge) to the graph. It will throw an error if the edge introduces a cycle in the graph. 🚫 ♻️
  • The RequestGraph will continuously perform a parallel breadth-first search from every root node in the graph. i.e: whenever a request is added to the graph, find all root nodes in the graph, and for each:
    • Process the request, then process all child requests, then process all their children, etc...
    • If any parent request fails, then all its descendant requests are cancelled
    • Only once there are no more children originating from this root, delete the parent request and all descendants
  • Requests with a parentRequest can reference data from the parent request. Depending on the server response, a parent request can update its data and dependent children will be able to reference that updated data (eliminating the need for the HandleUnusedOptimisticID middleware).
  • Just like with the sequentialQueue, requests can have an idempotencyKey, and if a node with the matching idempotencyKey is found in the graph we replace the old request with the new one.

This means that instead of processing write requests one-by-one in the sequential queue, we can process non-conflicting requests in parallel and only have to process requests that depend on other parent requests to run in sequence. This could massively improve the speed at which we process requests made offline, especially on a slower network.
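
A very rough sketch of what such a RequestGraph could look like (all names and shapes here are illustrative, not a concrete design):

```ts
type GraphRequest = {
    id: string;
    command: string;
    data: Record<string, unknown>;
    parentRequestIDs: string[]; // empty array = root node, can run immediately
    idempotencyKey?: string;
};

class RequestGraph {
    private nodes = new Map<string, GraphRequest>();

    add(request: GraphRequest): string {
        if (this.introducesCycle(request)) {
            throw new Error(`Request ${request.id} would introduce a cycle`);
        }
        // De-dupe: replace an existing node that shares the idempotencyKey,
        // reusing its ID and position so edges from its children stay valid.
        for (const [id, node] of this.nodes) {
            if (request.idempotencyKey && node.idempotencyKey === request.idempotencyKey) {
                this.nodes.set(id, {...request, id});
                return id;
            }
        }
        this.nodes.set(request.id, request);
        return request.id;
    }

    // Roots are nodes whose parents are either absent (already processed) or never specified.
    getRootRequests(): GraphRequest[] {
        return [...this.nodes.values()].filter((node) => node.parentRequestIDs.every((id) => !this.nodes.has(id)));
    }

    // Walk up from the new request's parents; if we ever reach its own ID, adding it would create a cycle.
    private introducesCycle(request: GraphRequest): boolean {
        const stack = [...request.parentRequestIDs];
        const seen = new Set<string>();
        while (stack.length > 0) {
            const current = stack.pop() as string;
            if (current === request.id) {
                return true;
            }
            if (seen.has(current)) {
                continue;
            }
            seen.add(current);
            stack.push(...(this.nodes.get(current)?.parentRequestIDs ?? []));
        }
        return false;
    }
}
```

Processing would then repeatedly take getRootRequests(), send those requests in parallel, remove completed nodes (or cancel the descendants of failed ones), and repeat until the graph is empty.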

[Before / After diagrams of sequential vs. parallel request processing]

It's worth noting that this would be a pretty big change, so if we were going to roll this out I think the prudent thing to do would be to make the default behavior of the RequestGraph the same as the SequentialQueue. i.e: any time we add a request to the RequestGraph, we assume that its parentRequest is the last request added to the graph. Thus at first the graph ends up just being a linear queue as it is today.

flowchart LR
    subgraph first requests
        A(Request A)
    end

    subgraph second requests
        B(Request B)
    end

    subgraph third requests
        C(Request C)
    end

    subgraph fourth requests
        D(Request D)
    end

    subgraph fifth requests
        E(Request E)
    end

    subgraph sixth requests
        F(Request F)
    end

    A --> B --> C --> D --> E --> F
    

Only when we're certain which parentRequest a new request depends on would the developer call API.write and provide the ID of that parentRequest. This means the request can run as soon as its parent request is done, instead of having to wait for unrelated requests to finish.

flowchart LR
    subgraph first requests
        A(Request A)
    end

    subgraph second requests
        B(Request B)
        D(Request D)
    end

    subgraph third requests
        C(Request C)
    end

    subgraph fourth requests
        E(Request E)
    end

    subgraph fifth requests
        F(Request F)
    end

    A --> B --> C
    A --> D
    C --> E --> F
    

If we know that a request is independent of any other requests in the graph, we can provide an empty array for parentRequests, and that would allow it to run immediately the next time the RequestGraph processes.

flowchart LR
    subgraph first requests
        A(Request A)
        F(Request F)
    end

    subgraph second requests
        B(Request B)
        D(Request D)
    end

    subgraph third requests
        C(Request C)
    end

    subgraph fourth requests
        E(Request E)
    end

    A --> B --> C
    A --> D
    C --> E
    

@WojtekBoman
Contributor

@muttmuure Can you assign me and @kosmydel to this issue?

@melvin-bot

melvin-bot bot commented Oct 23, 2023

📣 @WojtekBoman! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@kosmydel
Contributor

Comment to enable the assignment :)

@WojtekBoman
Contributor

@roryabraham I have one question about the first proposed approach to solving this problem. Is the reportId parameter sufficient to make requests idempotent? I checked the params of this request and, besides the reportId, it also accepts params such as emailList, accountIDList, etc. In the first approach, if we send the OpenReport request twice with the same reportID but different additional parameters, we will get the response only from the last sent request. Is this okay? I'd like to be sure that this is how it should work.

@roryabraham
Contributor

Is the reportId parameter sufficient to make requests idempotent?

It depends on the request. Each request may have its own idempotencyKey, and it's up to us as developers to determine what data is appropriate to use for the idempotencyKey.

In the first approach, if we send the OpenReport request twice with the same reportID but different additional parameters, we will get the response only from the last sent request. Is this okay?

I think that is indeed ok for OpenReport. The only data that's written in OpenReport is when the user last read the report, so we only care about the last call.

@roryabraham
Contributor

if other params than reportID are included that means we are creating a new report

@mountiny made a great point here that if we make a report optimistically, OpenReport will have more params that we don't want to just throw away.

So maybe, instead of just throwing away the earlier request and only keeping the later one, we should:

  • merge the params, optimisticData, successData, failureData of the later request into the earlier one
  • Keep the request in the same position in the queue that the earlier request had

@roryabraham roryabraham self-assigned this Oct 25, 2023
@roryabraham roryabraham added Weekly KSv2 and removed Monthly KSv2 labels Oct 25, 2023
@WojtekBoman
Contributor

WojtekBoman commented Oct 25, 2023

@roryabraham Okay, so I would like to define a strategy for merging data from the old and new OpenReport requests. For optimisticData, failureData and successData it's easy: we can merge the data by the key property, and objects from the old request that are not included in the new one get added to it. I have a couple of questions about how it should work when we want to merge params from two requests.

  1. How do we determine which parameters should be replaced between the old and new request? Take the OpenReport request: if we send it with a value defined for the emailList param and the next request is sent without it, what should the result of merging the two requests be? Should the merged request still have a value for that param?
  2. Is there a risk that merging request params will cause side effects?

@kosmydel
Contributor

Hey @roryabraham. Together with Wojciech Boman, we have another set of questions about the GraphQueue implementation.

General questions:

  • Is it possible that the request will have more than one parent? If so:
    • Is the following approach correct? We would wait until all of the parent requests are resolved and then proceed with the children. We are thinking about a counter that tracks how many parent requests a request is still waiting for.
    • In which cases can it happen?
  • How should we enable children to reference data from parent requests? In which cases can it be helpful?
  • How can a parent request update dependent children's data? In which cases can it happen?
  • From where would a developer obtain the requests’ IDs to pass to the parentRequests?

Idempotency questions:
What if an offline user makes two actions: post and then delete? Do we need to send these requests at all when we know that the second request cancels the first? For example, when we add and then remove an emoji from a text message.

@roryabraham
Contributor

How do we determine which parameters should be replaced between the old and new request? Take the OpenReport request: if we send it with a value defined for the emailList param and the next request is sent without it, what should the result of merging the two requests be? Should the merged request still have a value for that param?

Yes, I think the merged request should have the value for that param defined. Basically just use lodashMerge I think

Is there a risk that merging request params will cause side effects?

Possibly; we have to think about it and test it carefully. Of course, we should make this feature opt-in by providing an idempotencyKey only for requests that can safely be de-duped. So we take it one request at a time and migrate each carefully.
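
To make the merge strategy concrete, here is a hedged sketch along the lines discussed above: params are deep-merged with lodash's merge so the newer request's values win, Onyx update arrays are merged by key, and the result keeps the older request's queue position (all shapes and names are illustrative):

```ts
import lodashMerge from 'lodash/merge';

type OnyxUpdate = {onyxMethod: string; key: string; value: unknown};

type QueuedRequest = {
    command: string;
    data: Record<string, unknown>;
    optimisticData?: OnyxUpdate[];
    successData?: OnyxUpdate[];
    failureData?: OnyxUpdate[];
};

// Merge Onyx updates by key: updates from the newer request win, and updates
// only present in the older request are kept.
function mergeOnyxUpdates(older: OnyxUpdate[] = [], newer: OnyxUpdate[] = []): OnyxUpdate[] {
    const newerKeys = new Set(newer.map((update) => update.key));
    return [...older.filter((update) => !newerKeys.has(update.key)), ...newer];
}

// The merged request replaces the older one in place, so it keeps the older
// request's queue position while picking up the newer request's params.
function mergeDuplicateRequests(older: QueuedRequest, newer: QueuedRequest): QueuedRequest {
    return {
        ...older,
        data: lodashMerge({}, older.data, newer.data),
        optimisticData: mergeOnyxUpdates(older.optimisticData, newer.optimisticData),
        successData: mergeOnyxUpdates(older.successData, newer.successData),
        failureData: mergeOnyxUpdates(older.failureData, newer.failureData),
    };
}
```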

@roryabraham
Contributor

Is it possible that the request will have more than one parent?

I can't think of any cases when this would happen, but it seems likely that there may be such a case. I don't think there's much complexity that will be added by having parentRequests be an array of requests instead of just a single request. So instead of just parentRequest.then(...) it would be Promise.allSettled(parentRequests).then(...)
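
As a small sketch, waiting on multiple parents could look like this (assuming each in-flight parent request exposes a promise):

```ts
// Hypothetical: a child request waits for all of its parents before it is sent.
// allSettled is used so one failed parent doesn't reject the whole wait; the
// caller can then decide to cancel the descendants of any failed parent.
function waitForParents(parentPromises: Array<Promise<unknown>>): Promise<PromiseSettledResult<unknown>[]> {
    return Promise.allSettled(parentPromises);
}
```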

@roryabraham
Contributor

How should we enable children to reference data from parent requests? In which cases can it be helpful?

One case when it would be helpful is already solved in another way by the src/libs/Middleware/HandleUnusedOptimisticID middleware. If you disable the HandleUnusedOptimisticID middleware, you can reproduce the following issue:

  1. Alice and Bob have never chatted before.
  2. Alice goes offline
  3. Bob (online) sends a DM to Alice
  4. Alice sends several messages to Bob
  5. Alice comes back online, then:
    1. Due to some back-end code we have, OpenReport will succeed, but the optimistic reportID passed as a param is not used because the report already exists
    2. The queued AddComment requests from Alice will all fail, because they are referencing the unused optimistic reportID.

This could be a case when AddComment can reference data from the parent, eliminating the need for the slightly hacky HandleUnusedOptimisticID middleware
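
A rough illustration of how a child request might reference its parent's data at send time instead of a stale optimistic ID (everything here is hypothetical, not the actual middleware replacement):

```ts
// Hypothetical: the parent OpenReport node updates its data with the real
// reportID from the server response, and queued AddComment children read the
// reportID lazily when they are about to be sent.
type ParentRequest = {
    data: {reportID: string};
};

function buildAddCommentRequest(parent: ParentRequest, comment: string) {
    return {
        command: 'AddComment',
        // Resolved only after the parent has been processed, so we pick up the
        // real reportID even if the optimistic one went unused.
        getData: () => ({reportID: parent.data.reportID, reportComment: comment}),
        parentRequests: [parent],
    };
}
```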

@roryabraham
Contributor

roryabraham commented Oct 30, 2023

From where would a developer obtain the requests’ IDs to pass to the parentRequests?

I suppose it would be returned from API.write. Due to the design of our network code, that function should not return a promise, but could be updated to synchronously return the requestID
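
A tiny sketch of that (signatures and names are purely illustrative):

```ts
// Hypothetical: API.write adds the request to the graph and synchronously
// returns the new request's ID so callers can chain dependent requests.
type GraphRequest = {id: string; command: string; data: Record<string, unknown>; parentRequestIDs: string[]};

const requestGraph: GraphRequest[] = []; // stand-in for the real RequestGraph

let nextID = 0;
function write(command: string, data: Record<string, unknown>, parentRequestIDs: string[] = []): string {
    const requestID = `request_${nextID++}`;
    requestGraph.push({id: requestID, command, data, parentRequestIDs});
    return requestID; // returned synchronously; no promise involved
}

// Usage: AddComment depends on the OpenReport request that (optimistically) created the chat.
const openReportRequestID = write('OpenReport', {reportID: 'optimistic123'});
write('AddComment', {reportID: 'optimistic123', reportComment: 'Hi!'}, [openReportRequestID]);
```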

@melvin-bot melvin-bot bot added Reviewing Has a PR in review Weekly KSv2 and removed Weekly KSv2 labels Oct 31, 2023
@muttmuure
Contributor Author

#30425


melvin-bot bot commented Nov 22, 2023

⚠️ Looks like this issue was linked to a Deploy Blocker here

If you are the assigned CME please investigate whether the linked PR caused a regression and leave a comment with the results.

If a regression has occurred and you are the assigned CM follow the instructions here.

If this regression could have been avoided please consider also proposing a recommendation to the PR checklist so that we can avoid it in the future.


@melvin-bot melvin-bot bot changed the title [HOLD for payment 2023-11-30] The whole queue of offline requests are processed when you come back online, delaying online requests [HOLD for payment 2023-12-20] [HOLD for payment 2023-11-30] The whole queue of offline requests are processed when you come back online, delaying online requests Dec 13, 2023
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Dec 13, 2023

melvin-bot bot commented Dec 13, 2023

Reviewing label has been removed, please complete the "BugZero Checklist".


melvin-bot bot commented Dec 13, 2023

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.11-25 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2023-12-20. 🎊

After the hold period is over and BZ checklist items are completed, please complete any of the applicable payments for this issue, and check them off once done.

  • External issue reporter
  • Contributor that fixed the issue
  • Contributor+ that helped on the issue and/or PR

For reference, here are some details about the assignees on this issue:

@melvin-bot melvin-bot bot added the Overdue label Dec 21, 2023
@roryabraham roryabraham changed the title [HOLD for payment 2023-12-20] [HOLD for payment 2023-11-30] The whole queue of offline requests are processed when you come back online, delaying online requests [HOLD for payment 2023-12-20] The whole queue of offline requests are processed when you come back online, delaying online requests Dec 22, 2023
@roryabraham
Contributor

New PR is on prod: #32246

@melvin-bot melvin-bot bot removed the Overdue label Dec 22, 2023
@roryabraham
Contributor

So I believe C+ payment is due here to @alitoshmatov

@roryabraham
Contributor

My mistake @alitoshmatov, just realized that we don't have anyone assigned to this issue to help issue payment for the C+ review of #32246. Let's get that sorted...

@melvin-bot melvin-bot bot removed the Overdue label Jan 1, 2024
@roryabraham roryabraham added NewFeature Something to build that is a new item. Overdue labels Jan 1, 2024

melvin-bot bot commented Jan 1, 2024

@melvin-bot melvin-bot bot removed the Overdue label Jan 1, 2024
@roryabraham
Contributor

@sakluger only action-item for you here is to issue a standard C+ review payment to @alitoshmatov for #32246. Thanks!

@sakluger sakluger added the External Added to denote the issue can be worked on by a contributor label Jan 3, 2024

melvin-bot bot commented Jan 3, 2024

Job added to Upwork: https://www.upwork.com/jobs/~01697b67b9aa3280da

@melvin-bot melvin-bot bot changed the title [HOLD for payment 2023-12-20] The whole queue of offline requests are processed when you come back online, delaying online requests [$500] [HOLD for payment 2023-12-20] The whole queue of offline requests are processed when you come back online, delaying online requests Jan 3, 2024
@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Jan 3, 2024

melvin-bot bot commented Jan 3, 2024

Current assignee @alitoshmatov is eligible for the External assigner, not assigning anyone new.

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Jan 3, 2024
@sakluger
Contributor

sakluger commented Jan 3, 2024

@alitoshmatov I sent you an offer on Upwork, please let me know once you've accepted. Thanks!

@alitoshmatov
Contributor

@sakluger Accepted the offer

@sakluger
Contributor

sakluger commented Jan 4, 2024

Completed payment, thanks!

@sakluger sakluger closed this as completed Jan 4, 2024