This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Remove synchronous blocking frontend backend interactions #180

Merged
gene1wood merged 2 commits into mozilla-iam:master from remove_mixed_sync_async
Jan 11, 2020

Conversation

gene1wood
Contributor

@gene1wood gene1wood commented Jan 10, 2020

  • Removed synchronous blocking frontend/backend interactions; the workflow now relies entirely on the pollState frontend polling loop
    • Previously the frontend/backend interactions were a mix: most were async, but one blocked and waited for other async calls to finish
    • The blocking interaction happened when the frontend called "/redirect_callback". handle_oidc_redirect_callback
      would call login.exchange_token_for_credentials, which would loop and wait for a separate POST from the frontend to /api/roles to
      complete. Only then would login.exchange_token_for_credentials continue and allow /redirect_callback to return a response to the frontend.
    • This commit removes this behavior so that all frontend-to-backend interactions are async and the frontend's movement through the workflow
      is governed entirely by changes to the login.state value
    • This new functionality is achieved with the following changes
      • login.exchange_token_for_credentials
        • The method no longer loops, sleeping and waiting for self.role_arn to be set
        • The method now calls self.print_output itself when it succeeds at getting STS credentials; previously handle_oidc_redirect_callback
          or login.login triggered this
        • The method now handles all the error cases it could encounter on its own and returns "error" if there was a problem, "finished"
          if it succeeded and is done, or "restart_auth" if AWS rejected the ID token and the tool should redirect the user back to the IdP
          to get a new ID token.
      • In cases where the user doesn't provide a role_arn, the set_role function (which is called when the user clicks the role they want and a POST
        is made to /api/roles) now initiates the login.exchange_token_for_credentials call. This way the user selecting a role directly kicks
        off the sequence that ends with credentials being printed on the CLI
  • Previously the long-running synchronous call to /redirect_callback also kept an eye on how long it had been since the frontend had last called
    /api/state. While the backend looped, waiting for login.role_arn to be set, it checked on each iteration when /api/state
    had last been called. Since this loop no longer exists, this commit adds a new "heartbeat" backend endpoint
    • The "/api/heartbeat" endpoint starts an async long-running call that loops and watches to make sure the frontend is awake and polling
    • /api/heartbeat responds every 30 seconds, at which point a new call to /api/heartbeat is started. This is just to ensure the browser doesn't give up
      on waiting for the response from the endpoint.
    • This heartbeat call runs in parallel to everything else, watching how long it's been since the last call to /api/state; if it's been too long,
      the listener kills itself
    • To keep the CLI's prompt detection (within 2 seconds) of the user closing the frontend window prematurely, while still
      accommodating calls to external data sources (like Auth0 and the ID token for role endpoint) that could take longer than 2 seconds,
      this commit makes the max_sleep_no_state_check value dynamic. Previously it was not an issue that external data sources sometimes took longer
      than 2 seconds, because multiple pollStates could be running at the same time. Now, with the state.backendInProgress described below, only
      a single pollState runs at any given time, so slow responses need to be accounted for.
      • This commit solves the problem by checking login.state at each call to /api/state: if the login.state about to be returned precedes
        an external data source call, max_sleep_no_state_check is increased from 2 to 10 seconds.
      • Later, when login.state changes to something not requiring external calls, max_sleep_no_state_check is reduced back to 2 seconds.
  • This also adds a new frontend behavior to accommodate the fully async model: tracking when each run of pollState begins and ends. This
    ensures that if one run of pollState takes longer than "sleepTime", a second concurrent execution of pollState won't start.
    This is achieved with the new global state value, state.backendInProgress. When a pollState run starts, this value is set to prevent a later
    pollState from starting before the first one has completed.
  • Instead of killing the Flask process in response to error conditions, the tool now tells the frontend to call the /shutdown endpoint,
    and that endpoint is responsible for killing the listener. This removes the potentially scary scenario where, after an error condition,
    continued execution of logic in the tool is prevented only by the process being killed, rather than by returning cleanly and then killing the process.
    To achieve this, some methods that previously returned nothing now return values, either to finish execution early in the case of an error or
    to communicate success.
  • In this new model, where error cases cause backend calls to complete cleanly and the frontend then calls /shutdown, I've removed the "sleep(3600)"
    statements scattered through the code as they aren't needed anymore
  • Add a click check for the case where a user passes --batch but does not provide a --role-arn as well, and exit with a usage complaint
  • Move the display of the "select a role" message in the frontend to after the role list has been fetched, to avoid asking the user to select
    a role when there are no roles to select from
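
For reviewers, the three return values of login.exchange_token_for_credentials described above can be pictured with a rough sketch like the one below. It is not the code in this PR: TokenRejected, fetch_sts_credentials and the function signatures are hypothetical stand-ins, and only the returned strings ("finished", "restart_auth", "error") come from the description.

```python
class TokenRejected(Exception):
    """Hypothetical: raised when AWS rejects the ID token."""


def exchange_token_for_credentials(fetch_sts_credentials, print_output):
    """Make a single, non-looping attempt and report the outcome as a string."""
    try:
        credentials = fetch_sts_credentials()
    except TokenRejected:
        # The user should be sent back to the IdP for a new ID token
        return "restart_auth"
    except Exception:
        # Any other failure returns cleanly instead of killing the process
        return "error"
    # On success the method prints the credentials itself
    print_output(credentials)
    return "finished"


def set_role(role_arn, fetch_sts_credentials, print_output):
    """Stand-in for the POST /api/roles handler: picking a role starts the exchange."""
    return exchange_token_for_credentials(
        lambda: fetch_sts_credentials(role_arn), print_output
    )
```

Because the function returns instead of sleeping or exiting, the caller can record the outcome in login.state and let the polling loop pick it up.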

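The heartbeat and the dynamic max_sleep_no_state_check described above could look roughly like the following sketch. It leaves out Flask entirely and makes several assumptions: the PollWatchdog class, the state names in SLOW_STATES, the 0.5 second check interval and the "shutdown"/"continue" return strings are all invented for illustration. Only the 2 second, 10 second and 30 second values come from the description.

```python
import time

# Hypothetical names for login.state values that precede an external call
SLOW_STATES = {"getting_role_map", "getting_credentials"}


class PollWatchdog:
    """Tracks the last /api/state call and how much silence is tolerated."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.max_sleep_no_state_check = 2
        self.last_state_check = clock()

    def record_state_check(self, state):
        # Called on every /api/state request, just before the state is returned
        self.last_state_check = self.clock()
        self.max_sleep_no_state_check = 10 if state in SLOW_STATES else 2

    def frontend_is_gone(self):
        idle = self.clock() - self.last_state_check
        return idle > self.max_sleep_no_state_check


def heartbeat(watchdog, sleep=time.sleep, max_duration=30, interval=0.5):
    """One /api/heartbeat call: watch the frontend for up to max_duration seconds."""
    started = watchdog.clock()
    while watchdog.clock() - started < max_duration:
        if watchdog.frontend_is_gone():
            # The frontend stopped polling, so the listener should stop
            return "shutdown"
        sleep(interval)
    # Respond before the browser gives up; a new heartbeat call follows
    return "continue"
```

Injecting the clock and sleep functions is only there so the timing logic can be exercised without real delays.
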
@gene1wood gene1wood requested a review from april January 10, 2020 00:05
Contributor

@april april left a comment


Initial review.

mozilla_aws_cli/login.py
mozilla_aws_cli/listener.py
mozilla_aws_cli/static/index.js (outdated)
mozilla_aws_cli/static/index.js (outdated)
mozilla_aws_cli/login.py
mozilla_aws_cli/login.py
mozilla_aws_cli/login.py
mozilla_aws_cli/login.py
mozilla_aws_cli/login.py
mozilla_aws_cli/static/index.js (outdated)
@mozilla-iam mozilla-iam deleted a comment from april Jan 10, 2020
@gene1wood gene1wood merged commit 1b974f8 into mozilla-iam:master Jan 11, 2020
@gene1wood gene1wood deleted the remove_mixed_sync_async branch January 11, 2020 00:05