Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Android - User is logged out randomly #5619

Closed
isagoico opened this issue Sep 30, 2021 · 44 comments
Closed

Android - User is logged out randomly #5619

isagoico opened this issue Sep 30, 2021 · 44 comments
Assignees
Labels
Daily KSv2 Engineering Reviewing Has a PR in review

Comments

@isagoico
Copy link

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Action Performed:

  1. Use the Android app daily and don't log out from the account.

Expected Result:

User should not be logged out randomly.

Actual Result:

User is being logged out randomly.

Workaround:

Unknown

Platform:

Where is this issue occurring?

  • Android

Version Number: 1.1.2-0

Reproducible in staging?: Yes
Reproducible in production?: Yes

Logs: https://stackoverflow.com/c/expensify/questions/4856

Notes/Photos/Videos: Any additional supporting documentation
We haven't found reproducible steps yet. Seem like a random scenario. @quinthar has been able to reproduce twice.

Expensify/Expensify Issue URL:

Issue reported by: @quinthar
Slack conversation: https://expensify.slack.com/archives/C01GTK53T8Q/p1632379894048100

View all open jobs on GitHub

@MelvinBot
Copy link

Triggered auto assignment to @marcochavezf (Engineering), see https://stackoverflow.com/c/expensify/questions/4319 for more details.

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 1, 2021

I wasn't able to reproduce this issue yet, but some of the ideas to try are (after reading some of the comments of the Slack thread):

  • Force storage eviction, which could be wiping out the authToken.
  • Look into the code when/if the authToken is renewed/deleted and then dive deeper into the logs with some insight taken after checking out the code.
  • Use a debuggable apk to look into the local logs when the error happens.

@MelvinBot
Copy link

@marcochavezf Uh oh! This issue is overdue by 2 days. Don't forget to update your issues!

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 5, 2021

Seems an invalid authToken is not part of the issue. I tested out using an invalid authToken with command=Authenticate and the server is responding with a renewed authToken which keeps the user logged in.


The log out happens when the partnerPassword is invalid or the session value (from Local Storage) is deleted (which could be deleted by a bug in react-native-onyx on Android):

@MelvinBot MelvinBot removed the Overdue label Oct 5, 2021
@kidroca
Copy link
Contributor

kidroca commented Oct 6, 2021

@marcochavezf
Regarding the bug being in Onyx - it's possible but if it's caused by the AsyncStorage library that we use internally there (I don't think it's likely, we can check their issue board for similar reports)
I've traced how Onyx eviction works - it will never even attempt to clear the full storage, it will only remove old chat messages we're not currently accessing

Onyx would never call Onyx.clear to clear storage on its own - only through an external call (from Expensify.App)
The only such call is here:

function clearStorageAndRedirect(errorMessage) {
const activeClients = currentActiveClients;
const preferredLocale = currentPreferredLocale;
// Clearing storage discards the authToken. This causes a redirect to the SignIn screen
Onyx.clear()

This is triggered on regular user log outs or cases like the one you've posted above where partnerPassword is invalid

@kidroca
Copy link
Contributor

kidroca commented Oct 6, 2021

A very likely cause I think is credentials not being read from storage fast enough.

See how credentials don't have an initial value here - it will come from storage at some point

App/src/libs/API.js

Lines 11 to 15 in 04083b1

let credentials;
Onyx.connect({
key: ONYXKEYS.CREDENTIALS,
callback: val => credentials = val,
});

If we don't have it by the time reauthenticate is called the request will probably fail (in a similar way it fails for invalid partnerPassword here: #5619 (comment)):

App/src/libs/API.js

Lines 267 to 268 in 04083b1

partnerUserID: credentials.autoGeneratedLogin,
partnerUserSecret: credentials.autoGeneratedPassword,

On my physical Android device I've seen delays of up to 3 sec. for retrieving a value from storage during app launch

Can you trace server logs to try and confirm this - Authenticate being called with no (or empty) partnerUserID and partnerUserSecret?


IMO we need to ensure that credentials are first read from storage and only then start the request.
Exposing Onyx.get would be beneficial for cases like this e.g.

Onxy.get(ONYXKEYS.CREDENTIALS)
    .then(({ autoGeneratedLogin, autoGeneratedPassword }) => Authenticate({
        useExpensifyLogin: false,
        partnerName: CONFIG.EXPENSIFY.PARTNER_NAME,
        partnerPassword: CONFIG.EXPENSIFY.PARTNER_PASSWORD,
        autoGeneratedLogin,
        autoGeneratedPassword,
        authToken,
    }));

cc @marcaaron and @tgolen regarding Onyx.get suggestion


There are 3 instances where we write session data to Onyx and could be overwriting something by mistake, I've traced them further and it doesn't seem likely

setSuccessfulSignInData

function setSuccessfulSignInData(data) {
PushNotification.register(data.accountID);
Onyx.merge(ONYXKEYS.SESSION, {
shouldShowComposeInput: true,
..._.pick(data, 'authToken', 'accountID', 'email', 'encryptedAuthToken'),
});
}

signInWithShortLivedToken

function signInWithShortLivedToken(accountID, email, shortLivedToken, encryptedAuthToken) {
Onyx.merge(ONYXKEYS.ACCOUNT, {...CONST.DEFAULT_ACCOUNT_DATA, loading: true});
createTemporaryLogin(shortLivedToken, encryptedAuthToken, email).then((response) => {
Onyx.merge(ONYXKEYS.SESSION, {
authToken: shortLivedToken,
accountID,
email,
});

reauthenticate

App/src/libs/API.js

Lines 278 to 283 in 04083b1

// Update authToken in Onyx and in our local variables so that API requests will use the
// new authToken
Onyx.merge(ONYXKEYS.SESSION, {
authToken: response.authToken,
encryptedAuthToken: response.encryptedAuthToken,
});

@marcochavezf
Copy link
Contributor

A very likely cause I think is credentials not being read from storage fast enough.

@kidroca you're right about this ^ but with authToken. There's a race condition issue here, I added a setTimeout to 3 secs to get authToken, refreshed the browser and I was logged out:

Screen Shot 2021-10-06 at 14 13 14

IMO we need to ensure that credentials are first read from storage and only then start the request.
Exposing Onyx.get would be beneficial for cases like this e.g.

I think we'll need to ensure that both credentials and authToken are retrieved first from local storage.

@isagoico
Copy link
Author

isagoico commented Oct 8, 2021

Looks like @johnmlee101 Experienced this issue (or a extremely similar one) in iOS

Image.from.iOS.2.MOV

@marcochavezf
Copy link
Contributor

Looks like @johnmlee101 Experienced this issue (or a extremely similar one) in iOS

Image.from.iOS.2.MOV

Hmm, probably it's related but I'm not really sure. Today I will implement and test the proposed solution to avoid the race condition. Hopefully, that will also fix the issue in the video.

@MelvinBot MelvinBot removed the Overdue label Oct 11, 2021
@marcochavezf
Copy link
Contributor

Created a function to retrieve the authToken using Promise to ensure it's available right before it's used, but the app ends up calling the server several times (I'm still investigating why):

Screen Shot 2021-10-11 at 19 12 58

Screen Shot 2021-10-11 at 19 16 15

@tgolen
Copy link
Contributor

tgolen commented Oct 12, 2021

Unfortunately, I am going OOO for the next few days, but I'll be back next week and I would like to be involved with helping come up with a solution for this.

As it stands right now, I have some concerns with the suggested solution.

  1. This is touching some very core code so changes must be done extremely carefully, and this is maybe something that we should begin writing unit tests around so that we protect ourselves from basic regressions. It would be a huge bonus to have a test that demonstrates the race condition.
  2. I think the current solution is mixing sync and async code in a dangerous way (eg. getAuthToken() will run async while all the code outside of it still continues running on the normal sync thread).

I'm definitely still not on board the Onyx.get() train yet, and I think doing something more like implementing a retry if Onyx hasn't been initialized yet (ie. if authToken is undefined then retry). AFAIK undefined should exist until Onyx is initialized, at which point the value would either be null or something else.

@kidroca
Copy link
Contributor

kidroca commented Oct 12, 2021

Yeah that change won't work reliably. The promise from getAuthToken is not returned and the caller would not be able to wait for it. The request that needs an auth token might still start without one

The call for addDefaultValues happens here:

App/src/libs/Network.js

Lines 199 to 201 in 04083b1

const finalParameters = _.isFunction(enhanceParameters)
? enhanceParameters(queuedRequest.command, requestData)
: requestData;

And we see the request starts soon after the call. We either need to wait for finalParameters to be set, or we need to have a different approach


Retrying would introduce a 1sec. delay due to how the network queue works
Waiting a promise would start the request at the exact time we have a value
And it might be cleaner:

enhanceParameters(queuedRequest.command, requestData)
  .then(params => HttpUtils.xhr(req.command, params, req.type, req.shouldUseSecure)
  .then(response => onResponse(req, response))
  .catch(error => onError(req, error))

vs

App/src/libs/Network.js

Lines 199 to 213 in 04083b1

const finalParameters = _.isFunction(enhanceParameters)
? enhanceParameters(queuedRequest.command, requestData)
: requestData;
// Check to see if the queue has paused again. It's possible that a call to enhanceParameters()
// has paused the queue and if this is the case we must return. We don't retry these requests
// since if a request is made without an authToken we sign out the user.
if (!canMakeRequest(queuedRequest)) {
return;
}
HttpUtils.xhr(queuedRequest.command, finalParameters, queuedRequest.type, queuedRequest.shouldUseSecure)
.then(response => onResponse(queuedRequest, response))
.catch(error => onError(queuedRequest, error));
});

If for whatever reason enhanceParameters decides that we can't make the request it can throw an error to reject the Promise and prevent a request from being executed


Not sure if it's possible, but. I would prefer to throw an error here

App/src/libs/Network.js

Lines 199 to 201 in 04083b1

const finalParameters = _.isFunction(enhanceParameters)
? enhanceParameters(queuedRequest.command, requestData)
: requestData;

If we rely on enhancer we should not have a fallback - falling back exposes another place where we can fail to add an auth token to a request that needs one, and endup with the user in a logged out state

@kidroca
Copy link
Contributor

kidroca commented Oct 12, 2021

I'm also not certain about the authToken being the culprit - the user specific portion of the app mounts only when we have an AuthToken

<NavigationRoot authenticated={Boolean(this.getAuthToken())} />

Then this component gets mounted and starts a bunch of requests:

componentDidMount() {
NetworkConnection.listenForReconnect();
PusherConnectionManager.init();
Pusher.init({

But at that point we already have the token...

The case where we start a request without a token is possible if we have some persisted requests that are initiated before the token is ready, but since they also have to be read from storage it's seems unlikely

I still think it's the credentials not being available

  • we've retrieved a token from storage but it's expired
  • we started a request to re-authenticate but stored credentials weren't yet read

A test that demonstrates how this should simulate an app launch where we have a logged in user, and storage being read slow enough to allow for the above to happen
And probably another test regarding the persisted request being made without a token


On slack we also talked about logging the issue to the server, so that we know what happened exactly. This should also review whether the issue occurred after a persisted request and any other circumstances

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 12, 2021

I think the current solution is mixing sync and async code in a dangerous way (eg. getAuthToken() will run async while all the code outside of it still continues running on the normal sync thread).

@tgolen yeah agreed, after testing with getAuthToken() I realized the solution mixes async and sync code (found out Network.registerParameterEnhancer(addDefaultValuesToParameters) is a synchronou operation). I like the idea of using a retry, I'm going to implement it and test it this week.

It would be a huge bonus to have a test that demonstrates the race condition.

👍🏽 I will check how to add a unit test to demonstrate this, I think we can use a timeout like here to reproduce it in unit tests: #5619 (comment)

Btw, is there a diagram of the login in the app initilization? (I can create a new one if it doesn't exist, I think it can help us to understand better the complexity of this flow).

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 12, 2021

I'm also not certain about the authToken being the culprit - the user specific portion of the app mounts only when we have an AuthToken

<NavigationRoot authenticated={Boolean(this.getAuthToken())} />

Then this component gets mounted and starts a bunch of requests:

componentDidMount() {
NetworkConnection.listenForReconnect();
PusherConnectionManager.init();
Pusher.init({

But at that point we already have the token...

The case where we start a request without a token is possible if we have some persisted requests that are initiated before the token is ready, but since they also have to be read from storage it's seems unlikely

I still think it's the credentials not being available

  • we've retrieved a token from storage but it's expired
  • we started a request to re-authenticate but stored credentials weren't yet read

A test that demonstrates how this should simulate an app launch where we have a logged in user, and storage being read slow enough to allow for the above to happen And probably another test regarding the persisted request being made without a token

On slack we also talked about logging the issue to the server, so that we know what happened exactly. This should also review whether the issue occurred after a persisted request and any other circumstances

Thanks @kidroca I will look at AuthScreens component code and check if there's an issue there, I will begin to create a diagram to have the whole picture of the related components / functions and get better insights where the 🐛 is.

My hypothesis of the authToken is based on a timeout that I added in the Onyx.connect which mimics a little delay that could exist in the AsyncStorage on a physical Android device:

Screen.Recording.2021-10-12.at.18.32.38.mov

In the video I'm already logged in (I refreshed the browser just to verify everything is working as expected), then I toggle a piece of code that adds the setTimeout in Onyx.connect and after compiling and refreshing the app I'm logged out.

The app returns the error Cannot read properties of undefined (reading 'returnValueList') which happens here:

App/src/libs/Network.js

Lines 199 to 210 in ddb3cbd

const finalParameters = _.isFunction(enhanceParameters)
? enhanceParameters(queuedRequest.command, requestData)
: requestData;
// Check to see if the queue has paused again. It's possible that a call to enhanceParameters()
// has paused the queue and if this is the case we must return. We don't retry these requests
// since if a request is made without an authToken we sign out the user.
if (!canMakeRequest(queuedRequest)) {
return;
}
HttpUtils.xhr(queuedRequest.command, finalParameters, queuedRequest.type, queuedRequest.shouldUseSecure)

where finalParameters is undefined because of this return (which is expected because the authToken is undefined):

App/src/libs/API.js

Lines 51 to 65 in 50beee9

function addDefaultValuesToParameters(command, parameters) {
const finalParameters = {...parameters};
if (isAuthTokenRequired(command) && !parameters.authToken) {
// If we end up here with no authToken it means we are trying to make an API request before we are signed in.
// In this case, we should cancel the current request by pausing the queue and clearing the remaining requests.
if (!authToken) {
redirectToSignIn();
console.debug('A request was made without an authToken', {command, parameters});
Network.pauseRequestQueue();
Network.clearRequestQueue();
Network.unpauseRequestQueue();
return;
}

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 15, 2021

I was thinking about using a flag to check if Onyx is ready (which doesn't mix async and sync operations and could be a little bit simpler) so I wanted to give it a try and test it.

I created a PR (WIP) where I added the flag to check if Onyx is ready when the callback of Network.registerParameterEnhancer is called. I added just for testing a setTimeout in the Onyx.connect callback that retrieves the authToken and seems to be working on the async scenario (the user isn't logged out when the Onyx callback is called after the app makes a network request, but the error Cannot read properties of undefined (reading 'returnValueList') is still thrown).

I will be testing on an Android device/simulator and check if I can fix the returnValueList error, but meanwhile, the PR is open to feedback.

@MelvinBot MelvinBot removed the Overdue label Oct 15, 2021
@MelvinBot
Copy link

@marcochavezf Whoops! This issue is 2 days overdue. Let's get this updated quick!

@marcaaron
Copy link
Contributor

marcaaron commented Oct 25, 2021

any ideas on how to fix the unit tests?

I have all the ideas about how to fix them 🙃 What problem are you running into specifically?
Also maybe it would be best to 1:1 on this?

@marcaaron
Copy link
Contributor

@marcaaron what do you think about this solution? -> #6055 :)

Sorry I have only half been following this conversation and there's a lot of context I'd need to catch up on.

If I reviewed the code now I'd probably say the Network should not have any awareness of the authToken. Network ideally should be a simple queue and handler of requests passing them back to the API module so we can reauthenticate, etc.

So, I think we should separate the concern of whether requests are "ready" to be processed from the things that might dictate that readiness.

Just from looking at the code it looks like we are trying to avoid making any request before Onyx is initialized. But we are using the authToken either being a string or null. That feels like a possible smell because we are assigning special significance to the authToken and Network does not need to know about it.

I would much rather do something like this...

Onyx.connect({
    key: 'session',
    callback: (val) => {
        authToken = val ? val.authToken : null,
        Network.setIsReady(true);
    },
});

However, this kind of begs the question of whether it matters whether we are "ready" or not?

Can you maybe walk us through step by step how the unexpected log out happens? And why your PR will fix it?

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 26, 2021

@marcaaron what do you think about this solution? -> #6055 :)

Sorry I have only half been following this conversation and there's a lot of context I'd need to catch up on.

If I reviewed the code now I'd probably say the Network should not have any awareness of the authToken. Network ideally should be a simple queue and handler of requests passing them back to the API module so we can reauthenticate, etc.

So, I think we should separate the concern of whether requests are "ready" to be processed from the things that might dictate that readiness.

Just from looking at the code it looks like we are trying to avoid making any request before Onyx is initialized. But we are using the authToken either being a string or null. That feels like a possible smell because we are assigning special significance to the authToken and Network does not need to know about it.

I would much rather do something like this...

Onyx.connect({
    key: 'session',
    callback: (val) => {
        authToken = val ? val.authToken : null,
        Network.setIsReady(true);
    },
});

@marcaaron Oh good call, Network.setIsReady(true); is a better solution in order to have separation of concerns, I will update the PR with that if we agree this is the way to go.

However, this kind of begs the question of whether it matters whether we are "ready" or not?

Can you maybe walk us through step by step how the unexpected log out happens? And why your PR will fix it?

Sure, I reproduced the issue forcing the Onyx.connect callback (that retrieves the authToken) to take longer to resolve:

// in API.js: 
let authToken;
Onyx.connect({
    key: ONYXKEYS.SESSION,
    callback: val => setTimeout(() => authToken = val ? val.authToken : null, 100),
});

(the video attached to this comment shows how the log out happens; first I'm already logged in, then I add the setTimeout in the callback and then refresh the browser. And the log out happens because processNetworkRequestQueue, from Network.js, is called before the authToken is ready).

So I think this scenario could happen on a handset, where the AsyncStorage could take a bit longer to resolve the authToken (probably because of an excess of processes competing for limited hardware resources) but while that happens the Network.processNetworkRequestQueue (or API.Get) could be called causing the log out because there's not authToken yet.

Also, I'm not sure if this is a definitive fix because we still don't know 100% the root of the issue, but what I'm proposing is to have a sanity check of the authToken from Onyx when we make a network request and log something to the server if that scenario happens. If the issue is still occurring after implementing this, then we'll know 100% this was not the root cause of the issue 😄

@marcaaron
Copy link
Contributor

(the video attached to this comment shows how the log out happens; first I'm already logged in, then I add the setTimeout in the callback and then refresh the browser. And the log out happens because processNetworkRequestQueue, from Network.js, is called before the authToken is ready).

Ok, I think it makes sense now. The way we are testing this might be better if we delayed everything and not just the authToken subscription? Maybe there's some way to simulate the theory that storage values are being read too slow like add some optional "throttle" feature within Onyx? And we can write a test to simulate this happening.

Looking at the code and video I can see how we are getting logged out via the enhanceParameters call.

This is a pretty deep topic, but I have a feeling the right thing to do is:

  1. Wait until we have loaded both the authToken and credentials then set the Network module to "ready"
  2. Stop logging the user out in enhanceParameters - we should just let the http request happen since it should in theory
    • return a 407 jsonCode if we are online
    • attempt "authentication with stored credentials"
    • if we have no credentials then we will get logged out (but we were logged in and waited for onyx so we should have those even if we have an expired, bad, or non-existent authToken.

It would be good to do a historical review of why we started logging out in the enhanceParameters call (in case I am missing something). But I think the above would be a better way to handle things longer term. It also eliminates to a single event number of paths to a "forced log out".

@kidroca
Copy link
Contributor

kidroca commented Oct 27, 2021

Wait until we have loaded both the authToken and credentials
Stop logging the user out in enhanceParameters - we should just let the http request happen

I agree with this solution. By the time a request fails due to an auth token, we might already have read everything needed from storage, to attempt to reauthenticate

This seems to be the easiest thing that can be applied to the network queue to fix the problem

What I don't like about it though

  • it solves the problem by adding another if
  • this check is valid only at startup, but stays forever - each if makes the code harder and harder to follow
  • it solves the problem specifically for Networking and sets an example on how to deal with such issues - "What to do when I need an Onyx key that might not be read from storage yet - use flags"

I'm going to propose and idea in my next comment, we can discuss it further if it's sound to you

@kidroca
Copy link
Contributor

kidroca commented Oct 27, 2021

One of the widely accepted patterns for chain based handling is piping e.g. the middleware approach where you split the logic in stages and only go to the next stage when you call next() - Express is one of the earliest frameworks to popularize the pattern: https://expressjs.com/en/guide/using-middleware.html

A pattern like that allows us to compose stages and make them run as desired - sync or async - the next stage logic would only run when the previous stages has called next(). You can also abort the pipeline and start over

I think we can make our network queue a lot more predictable and easier to understand if we structure it in a pipe fashion. This would allow for better separation of concern, and would allow for an easier decoupling mechanism - instead of registering handlers that are put on a specific place of the existing control flow, we'll be appending or prepending stages to our pipeline

Here's how it would look like to solve this particular issue with the middleware approach

// Sample stages based on the current queue
const sateges = [
  // Connectivity check -> next()
  // Persist retriable request -> next()
  // Wait Storage to be ready -> next()
  // Enachance Params -> next()
  // Invoke Request -> next()
  // Happy Handler -> finish() 
  // Error Handler -> decide whether we should re-run stages or finish() 
];

// Promise based solution
function waitStorage(req, next) {
  return Onyx.get('session')
    .then(() => next(req))
    .then(() => stages.remove(waitStorage)) // we no longer need this stage for future pipeline runs
}

// Non promise based solution
function waitStorage(req, res)  {
  return next(
      new Error('Storage parameters are not yet initialized')
  )
}

Onyx.connect({
  key: 'session',
  callback: () => stages.remove(waitStorage);
 })
}

// Solution more similar to our typical handling:
function waitStorage(req, res)  {
  const isNetworkReady = authToken != undefined && credentials != undefined;

  if (isNetworkReady) {
      return next(req);
  }

  return next(
      new Error('Storage parameters are not yet initialized')
  )
}

If a stage should abort pipeline execution it can return a rejected promise (throw) or call next with an error
stages resembles a promise chain, but we can modify the chain at run time if we find it useful


Compare our existing queue with the above suggestion we can see some of our weaknesses

  • we try to decouple code by exposing registration methods, but this is just making the code harder to follow. Registering an isolated stage where you can trace exactly how the stage works is clearer and easy to follow
  • we try to do whatever is possible in sync, and finally do the request, we're dealing with race conditions by raising various flags to block and restart the queue instead of awaiting or work in a event driven way. Adding flags would work to an extent

Whenever we need to deal with a problem we introduce a direct change like adding an if inside the queue code which is a violation of a few SOLID principles

@kidroca
Copy link
Contributor

kidroca commented Oct 27, 2021

And we won't even need a waitForStorage stage if we go with something like this for the enhance stage

function enchanceParameters(req, next) {
  return Onyx.get('session')
    .then((session = {})=> next({
        ...req,
       params: {
          ...req.params,
          token: session.token,
          credentials: session.credentials,
       }
    }))
}

@marcaaron
Copy link
Contributor

That looks pretty neat @kidroca. I agree that the interval style of network queue could be improved or replaced with something better. It feels the relevant suggestion we could apply to this issue is "use Onyx.get()" and have all requests check for the session (instead of a "ready" flag).

I love the idea to get rid of the processQueue interval logic. But want to give the original idea a chance first. Also, would be more comfortable with a gradual evolution vs. a big re-design and share @tgolen's thoughts about writing tests and not introducing the Onyx.get() just yet.

we try to do whatever is possible in sync, and finally do the request, we're dealing with race conditions by raising various flags to block and restart the queue

I do think we are close to some improvements to the various flags here if we:

  • eliminate "pausing" and logging out when there is no authToken that will make the concept of "paused" less ambiguous and leave us with a singular concept of "authenticating"
  • wait for credentials + session before allowing the queue to process would set "readiness"

In both "ready" and "paused" case we should re-queue the requests. So we can just add the check here I think

App/src/libs/Network.js

Lines 112 to 113 in 66dc079

// If the queue is paused we will not make the request right now
return !isQueuePaused;

What if we:

  • Write tests to simulate the issue
  • Unblock this issue by introducing a concept of isReady and tell the Network module that it should not process until it's ready
  • Only sign out when we are unable to re-authenticate
  • Refactor the existing Network code so it is easier to understand
  • See if we still feel like the design is confusing or needs to be improved at that point

Thoughts?

@tgolen
Copy link
Contributor

tgolen commented Oct 27, 2021

@kidroca I like that proposal when we're at the point of refactoring the entire network lib! That could be really intriguing.

I think we might be there sooner rather than later :D but not sure we are there yet.

@marcochavezf
Copy link
Contributor

marcochavezf commented Oct 28, 2021

I already have a 1:1 with Marc, I'm going to add a unit test to simulate the issue first and then add the isReady flag on top of that.

It would be good to do a historical review of why we started logging out in the enhanceParameters call

  • Only sign out when we are unable to re-authenticate

I agree with these points ^^ and I will check that also too. Probably the log out in the enhanceParameters was introduced to fix a bug in the past but it could also be the culprit of the random logout.

@MelvinBot MelvinBot removed the Overdue label Oct 28, 2021
@kidroca
Copy link
Contributor

kidroca commented Oct 28, 2021

One other thing that might be playing a role in this issue/bug is when the app is launched from a background process

App/src/setup/index.js

Lines 8 to 22 in f97aebc

export default function () {
/*
* Initialize the Onyx store when the app loads for the first time.
*
* Note: This Onyx initialization has been very intentionally placed completely outside of the React lifecycle of the main App component.
*
* To understand why we must do this, you must first understand that a typical React Native Android application consists of an Application and an Activity.
* The project root's index.js runs in the Application, but the main RN `App` component + UI runs in a separate Activity, spawned when you call AppRegistry.registerComponent.
* When an application launches in a headless JS context (i.e: when woken from a killed state by a push notification), only the Application is available, but not the UI Activity.
* This means that in a headless context NO REACT CODE IS EXECUTED, and none of your components will mount.
*
* However, we still need to use Onyx to update the underlying app data from the headless JS context.
* Therefore it must be initialized completely outside the React component lifecycle.
*/
Onyx.init({

Could there be something that is triggering a request when the app is launched "behind the curtains" and it failing for some reason and logging out the user?
Or maybe the reason would be the same - storage keys weren't read at the time of the request

@marcaaron
Copy link
Contributor

Could there be something that is triggering a request when the app is launched "behind the curtains" and it failing for some reason and logging out the user?

@roryabraham might know the answer to that or have ideas on if it's something we can debug

@roryabraham
Copy link
Contributor

Could there be something that is triggering a request when the app is launched "behind the curtains"

It's possible. AFAICT the only API requests that can happen in this headless context are:

  • Log requests, which should always happen when the app is awakened in a headless context
    • Do we need credentials and an auth token for log requests?
  • There's also this, which will call an API.Get, but I don't think that should ever be triggered in a headless context for now, because we only call updateReportWithNewAction for REPORT_COMMENT actions.

@roryabraham
Copy link
Contributor

roryabraham commented Oct 28, 2021

One suggestion I don't think I've seen discussed in this issue...

In many locations in the app, we call Onyx.connect directly and store the result into a local in-memory variable. That works, but sometimes creates race conditions where we unsafely assume that the local variable is initialized. I know @kidroca has suggested we introduce an Onyx.get function that returns a promise so we can wait for those values to be loaded before we access them. But another approach would be to create a global store (or multiple stores), where we:

  1. Use Onyx.connect to load Onyx values from disk into local memory

  2. Then export a bunch of functions to get those values, along with a function that tells us if all the values are done loading from storage.

    let authToken;
    Onyx.connect({
        key: ONYXKEYS.SESSION,
        callback: val => authToken = val ? val.authToken : null,
    });
    const getAuthToken = () => authToken;
    
    let credentials;
    Onyx.connect({
        key: ONYXKEYS.CREDENTIALS,
        callback: val => credentials = val || null,
    });
    const getCredentials = () => credentials;
    
    const isStoreReady = () => (
        !_.isUndefined(authToken)
        && !_.isUndefined(credentials)
    );
    
    export {
        getAuthToken,
        getCredentials,
        isStoreReady,
    };
  3. Then make the global setup function that runs on startup return a promise that only resolves once the store is ready.

  4. Wait for the global setup function (and, by proxy, the Onyx store(s)) to be fully initialized before hiding the boot splash screen and rendering the app.

This solution comes with a downside that it would probably slow our startup, but will ensure that everything we need from Onyx is loaded when we go to use it. And we're only waiting for Onyx to load once, in one location, rather than peppering our codebase with Onyx.get calls that wait for a read-from-disk before proceeding.

@kidroca
Copy link
Contributor

kidroca commented Oct 28, 2021

That sounds similar to the multiGet idea where we read all the finite keys (everything but report actions) during Onyx.init

This solution comes with a downside that it would probably slow our startup

I think it might actually speed it up

Then export a bunch of functions to get those values, along with a function that tells us if all the values are done loading from storage.

Regarding race conditions this puts us at the same spot - we ask a function whether we're done loading - if we're done do one thing if we're not do something else and come again here after 1 sec.
With Onyx.get there's no if -> I want this value, then I want to do this thing

@kidroca
Copy link
Contributor

kidroca commented Oct 28, 2021

If we do something like preload everything on App launch, we should start the Network queue (and other side effect producing code) after Onyx.init is complete

Right now the Network queue starts immediately

App/src/libs/Network.js

Lines 232 to 233 in f97aebc

// Process our write queue very often
setInterval(processNetworkRequestQueue, 1000);

If for some reason we have read persisted request but not yet the token we'll make a request without a token

App/src/libs/Network.js

Lines 83 to 89 in f97aebc

// Subscribe to NETWORK_REQUEST_QUEUE queue as soon as Client is ready
ActiveClientManager.isReady().then(() => {
Onyx.connect({
key: ONYXKEYS.NETWORK_REQUEST_QUEUE,
callback: processOfflineQueue,
});
});

  • the native implementation of isReady resolves immediately
  • this is an example of side effects producing code and in this case it runs as soon as the script is loaded.

But otherwise the App or the part that is making requests AuthScreens is only loaded when the token is read


About side effects in scripts

We might think about identifying such items and only run them on command. As a web developer I would usually be schedule such scripts to execute from window.onload - this gives the app breathing space to load itself and the most critical items, then proceed with the next step.
For react native there are alternatives

  • a certain component can raise the ready event when it mounts
  • Onyx.init can be the event (if we return a promise from it)
  • NavigationContainer has an onReady callback for that purpose as well

@kidroca
Copy link
Contributor

kidroca commented Oct 28, 2021

Another side effect that is set up as soon as the script is parsed

Onyx.connect({
key: ONYXKEYS.MY_PERSONAL_DETAILS,
callback: (val) => {
if (!val) {
return;
}
const timezone = lodashGet(val, 'timezone', {});
const currentTimezone = moment.tz.guess(true);
// If the current timezone is different than the user's timezone, and their timezone is set to automatic
// then update their timezone.
if (_.isObject(timezone) && timezone.automatic && timezone.selected !== currentTimezone) {
timezone.selected = currentTimezone;
PersonalDetails.setPersonalDetails({timezone});
}
},
});

I think code like this is probably running in the "behind the curtains" launch (it's probably not supposed to)
A bit weird to find this inside AuthScreens...

@MelvinBot
Copy link

@marcochavezf Whoops! This issue is 2 days overdue. Let's get this updated quick!

@marcaaron
Copy link
Contributor

This conversation (while interesting) feels a bit tangential and not quite in context for "Android - User is logged out randomly". Does anyone mind if we maybe funnel some of this stuff into another thread and focus on solving the current problem here without triggering a big design change for now?

If I could take a stab at the problem statement I'd say something like:

Problem: Onyx.connect() updates local variables asynchronously wherever it is used. This can lead to incorrect assumptions about when a certain local variable will be "ready" or "not ready" and therefore encourages race conditions (inconsistent results) or confusing temporal dependencies (do x but only if y is "ready").

@marcochavezf
Copy link
Contributor

marcochavezf commented Nov 1, 2021

This conversation (while interesting) feels a bit tangential and not quite in context for "Android - User is logged out randomly". Does anyone mind if we maybe funnel some of this stuff into another thread and focus on solving the current problem here without triggering a big design change for now?

If I could take a stab at the problem statement I'd say something like:

Problem: Onyx.connect() updates local variables asynchronously wherever it is used. This can lead to incorrect assumptions about when a certain local variable will be "ready" or "not ready" and therefore encourages race conditions (inconsistent results) or confusing temporal dependencies (do x but only if y is "ready").

Sounds good to me, today and tomorrow I will continue with what we agreed to tackle the logout issue: unit tests, isReady flag, and stop logging out in enhanceParameters (also I'm going to investigate why the log out was added there).

For the isReady flag, I like the idea that @roryabraham proposed in point 2 (create a store for auth and credentials) because 1. separates concerns, 2. the store can be used as a single source of truth in both API.js and Network.js and 3. it's not a big change (only implementing the store).

Also, I think we could use that flag to re-queue the requests if the store (auth + credentials) is not ready, that will solve the issue that @kidroca commented where we have read persisted requests but the auth token is not ready.

@MelvinBot MelvinBot removed the Overdue label Nov 1, 2021
@marcaaron
Copy link
Contributor

For the isReady flag, I like the idea that @roryabraham proposed in point 2 (create a store for auth and credentials) because 1. separates concerns, 2. the store can be used as a single source of truth in both API.js and Network.js and 3. it's not a big change (only implementing the store).

Just going to quickly restate something I said earlier:

the Network should not have any awareness of the authToken. Network ideally should be a simple queue and handler of requests passing them back to the API module so we can reauthenticate, etc.

I'm curious how the store solution addresses this? Will we be calling getAuthToken() inside the Network code?

I am admittedly getting distracted by the suggestion to create such a store for all the various Onyx keys and then export getters. I'd prefer to discuss the problem here a bit more before establishing this pattern.

@marcochavezf
Copy link
Contributor

marcochavezf commented Nov 2, 2021

the Network should not have any awareness of the authToken. Network ideally should be a simple queue and handler of requests passing them back to the API module so we can reauthenticate, etc.

I'm curious how the store solution addresses this? Will we be calling getAuthToken() inside the Network code?

Ah, I was thinking to use instead the isStoreReady in processNetworkRequestQueue (Network) like this:

if (!isStoreReady()) {
    requestsToProcessOnNextRun.push(queuedRequest);
    return;
}

Where the requests are retried (even persisted requests) if both auth token and credentials are not ready.

I am admittedly getting distracted by the suggestion to create such a store for all the various Onyx keys and then export getters. I'd prefer to discuss the problem here a bit more before establishing this pattern.

Yeah, I will stick to the Network.setIsReady(true) solution atm (which will behave similarly to isStoreReady()) to avoid more noise here and then discuss later the store pattern in the other GH issue 👍🏽

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Daily KSv2 Engineering Reviewing Has a PR in review
Projects
None yet
Development

No branches or pull requests

8 participants