[core-lro] Proposed LRO Engine design #15949

Merged
merged 33 commits into from
Jul 13, 2021

Member

@deyaaeldeen deyaaeldeen commented Jun 24, 2021

The PR description just rehashes the design document.

Related to #15775

Modular Support for Long-Running Operations

Long-running operations (LROs) are operations that the service may take a long time to finish processing. They follow a common convention:

  • the customer first sends an initiation request to the service, which responds with information on how to poll for the status of the operation, if it has not already completed,
  • using that information, the customer polls the status of the operation until it is done,
  • once the status says the operation is done, the customer uses the same information to get the desired result.
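
The three steps above can be sketched as a small loop. This is only an illustration of the convention, not the engine proposed here: StatusResponse, Send, runLro, fakeSend, and the paths are all hypothetical names, and the fake service stands in for a real HTTP client.

```typescript
// Hypothetical shapes for illustration only; not the @azure/core-lro API.
interface StatusResponse<T> {
  status: "InProgress" | "Succeeded" | "Failed";
  pollPath?: string; // where to poll, learned from the initiation response
  resultPath?: string; // where to fetch the final result, if not inlined
  result?: T;
}

type Send<T> = (method: string, path: string) => Promise<StatusResponse<T>>;

async function runLro<T>(
  send: Send<T>,
  method: string,
  path: string,
  intervalInMs = 0
): Promise<T> {
  // Step 1: send the initiation request and learn how to poll.
  let response = await send(method, path);
  const pollPath = response.pollPath ?? path;

  // Step 2: poll until the operation leaves the in-progress state.
  while (response.status === "InProgress") {
    await new Promise<void>((resolve) => setTimeout(resolve, intervalInMs));
    response = await send("GET", pollPath);
  }
  if (response.status === "Failed") {
    throw new Error("The long-running operation failed.");
  }

  // Step 3: get the result, either inlined or from a separate location.
  if (response.result !== undefined) {
    return response.result;
  }
  const final = await send("GET", response.resultPath ?? path);
  return final.result as T;
}

// A fake service that finishes on the third poll.
let polls = 0;
const fakeSend: Send<string> = async (method, path) => {
  if (method === "PUT") {
    return { status: "InProgress", pollPath: "/operations/1" };
  }
  if (path === "/operations/1") {
    polls++;
    return polls < 3
      ? { status: "InProgress" }
      : { status: "Succeeded", resultPath: "/resources/1" };
  }
  return { status: "Succeeded", result: "resource-1" };
};
```

Calling runLro(fakeSend, "PUT", "/resources/1") drives the fake service through all three steps. Real services differ in where each piece of information comes from, which is the variation the rest of this document deals with.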

Ideally, we could write an algorithm that implements this convention once and use it in all Azure clients for LRO APIs. In reality, however, the convention is implemented differently across Azure services. The good news is that the TypeScript Autorest extension is, AFAIK, able to generate code that handles those different implementations, but the generated implementation has a couple of limitations:

  1. it is located in a few files that the code generator copies into every generated package that has LROs, so if a bug fix is ever needed in the LRO logic, the package has to be regenerated to pick up the fix.
  2. it works only with clients that use @azure/core-client, so clients that use @azure-rest/core-client or @azure/core-http cannot use this implementation as-is.

To fix limitation #1, the most straightforward thing to do is to move those files into @azure/core-lro. Without fixing limitation #2 first, however, @azure/core-lro would have to depend on @azure/core-client. That would force clients that depend on @azure/core-lro, but not necessarily on @azure/core-client, to transitively depend on the latter, raising concerns about unnecessarily larger bundle sizes.

This document presents a design that fixes limitation #2 and naturally fixes limitation #1 too.

Things to know before reading

  • Some details not related to the high-level concept are not illustrated; the scope of this document is limited to the high-level shape and paradigms for the feature area.

Terminology

  • Azure Async Operation, Body, and Location are names for the LRO implementations currently supported in the TypeScript Autorest extension. They differ in how the polling path is calculated, how completion of the operation is detected, and where the desired result is retrieved from. Currently, these pieces of information can be calculated from the response received after sending the initiation request.
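
As a rough sketch of that last point, the polling path could be derived from the initiation response alone. Everything here is an assumption made for illustration: the names LroMode, RawResponse, getHeader, and planPolling are hypothetical, and the header names and their precedence are assumed rather than taken from this design.

```typescript
// Hypothetical names throughout; this is not the @azure/core-lro API.
type LroMode = "AzureAsync" | "Location" | "Body";

interface RawResponse {
  statusCode: number;
  headers: Record<string, string>;
  body?: unknown;
}

interface PollingPlan {
  mode: LroMode;
  pollPath: string;
}

// Case-insensitive header lookup.
function getHeader(response: RawResponse, name: string): string | undefined {
  for (const key of Object.keys(response.headers)) {
    if (key.toLowerCase() === name.toLowerCase()) {
      return response.headers[key];
    }
  }
  return undefined;
}

// Decide how to poll using only the response to the initiation request.
function planPolling(initial: RawResponse, requestPath: string): PollingPlan {
  // Assumption: an Azure-AsyncOperation header takes precedence over Location.
  const asyncOperation = getHeader(initial, "azure-asyncoperation");
  if (asyncOperation !== undefined) {
    return { mode: "AzureAsync", pollPath: asyncOperation };
  }
  const location = getHeader(initial, "location");
  if (location !== undefined) {
    return { mode: "Location", pollPath: location };
  }
  // Body: keep polling the original path and inspect the response body.
  return { mode: "Body", pollPath: requestPath };
}
```

For instance, planPolling({ statusCode: 202, headers: { Location: "/res/1/status" } }, "/res/1") selects the Location mode and polls "/res/1/status" under these assumptions.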

Why this is needed

The China team is currently waiting on a fix for limitation #1, which they regard as a blocker for GAing the TypeScript Autorest extension. Furthermore, making this LRO implementation part of @azure/core-lro, and not tied to @azure/core-client, will significantly help streamline the ongoing effort to add convenience helpers for LROs in @azure-rest clients.

Design

This document presents the design of an LRO engine that is part of @azure/core-lro and can be used by any client regardless of how it is implemented. Furthermore, specific implementations for the engine are also provided, to be auto-generated by Autorest.

The design consists of three main pieces:

  • an interface, named LongRunningOperation<T>, which groups the primitives needed to implement LROs
  • a class, named LroEngine, that implements the LRO engine and whose constructor takes as input an object that implements LongRunningOperation<T>
  • a class that implements LongRunningOperation<T> and works with clients that use either @azure/core-http or @azure/core-client. @joheredi also created one for @azure-rest/core-client in [REST Clients] Add lro poller helper #15898

LongRunningOperation<T>

This interface contains two methods: sendInitialRequest and sendPollRequest.

sendInitialRequest

This method should be implemented to send the initial request to start the operation and it has the following signature:

sendInitialRequest: () => Promise<LroResponse<T>>

The method does not take the path or the HTTP request method as parameters because they are members of the interface; they are needed to control many aspects of subsequent polling. This is how this method can be implemented:

public async sendInitialRequest(): Promise<LroResponse<T>> {
  return this.sendOperation(this.args, this.spec); // the class will have sendOperation, args, and spec as private fields
}

sendPollRequest

This method should be implemented to send a polling (GET) request, a request the service should respond to with the current status of the operation, and it has the following signature:

sendPollRequest: (path: string) => Promise<LroResponse<T>>;

This method takes the polling path as input and here is what a simplified implementation would look like:

  public async sendPollRequest(path: string): Promise<LroResponse<T>> {
    // the class will have sendOperation, args, and spec as private fields
    return this.sendOperation(this.args, {
      ...this.spec,
      path,
      httpMethod: "GET"
    });
  }

LroEngine

This class implements the PollerLike interface, does the heavy lifting for LROs, and has the following type signature:

class LroEngine<TResult, TState extends PollOperationState<TResult>> extends Poller<TState, TResult>

The class also has the following constructor:

constructor(lro: LongRunningOperation<TResult>, options?: LroEngineOptions);

Currently, the options have intervalInMs to control the polling interval, resumeFrom to enable resuming from a serialized state, and lroResourceLocationConfig, which can determine where to find the results of the LRO after the operation is finished. Typically, Autorest figures out the value for lroResourceLocationConfig from the x-ms-long-running-operation-options swagger extension. If new arguments need to be added to the class, they can be added to the options type.
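
To make the role of lroResourceLocationConfig concrete, here is a sketch of how such an option could select where the final result is read from. The option names come from the paragraph above; the three string values mirror the final-state-via values of x-ms-long-running-operation-options. The types LroEngineOptionsSketch and InitialInfo, the function resolveResultPath, and the fallback behavior are assumptions, not the engine's actual logic.

```typescript
// Values mirror `final-state-via` in x-ms-long-running-operation-options.
type LroResourceLocationConfig =
  | "azure-async-operation"
  | "location"
  | "original-uri";

// Hypothetical local stand-in for the options type described above.
interface LroEngineOptionsSketch {
  intervalInMs?: number;
  resumeFrom?: string;
  lroResourceLocationConfig?: LroResourceLocationConfig;
}

// Hypothetical summary of what was learned from the initiation request.
interface InitialInfo {
  requestPath: string;
  locationHeader?: string;
}

// Returns the path to GET the final result from, or undefined when the
// result is assumed to be in the body of the last polling response.
function resolveResultPath(
  info: InitialInfo,
  options: LroEngineOptionsSketch = {}
): string | undefined {
  switch (options.lroResourceLocationConfig) {
    case "original-uri":
      return info.requestPath;
    case "azure-async-operation":
      return undefined;
    case "location":
    default:
      // Assumption: fall back to the original path if no Location was sent.
      return info.locationHeader ?? info.requestPath;
  }
}
```

Keeping this choice in the options type, rather than as a constructor parameter, matches the stated intent that new arguments go into the options.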

LroImpl

This class implements the LongRunningOperation<T> interface and is auto-generated by Autorest. LroImpl needs access to a few pieces: the operation specification, the operation arguments, and a primitive function that takes them as input, sends a request, and converts the received response into one of type LroResponse<T>, which has both the flattened and the raw responses.
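
Putting the pieces described so far together, a class of this shape might look roughly like the sketch below. It is self-contained, so the types are local stand-ins: RawResponse, OperationArgs, OperationSpecSketch, SendOperation, and LroImplSketch are hypothetical, the member names requestPath and requestMethod are assumed, and the generated LroImpl may differ.

```typescript
// Local stand-ins for illustration; names and shapes are assumptions.
interface RawResponse {
  statusCode: number;
  body?: unknown;
  headers: Record<string, string>;
}

interface LroResponse<T> {
  flatResponse: T;
  rawResponse: RawResponse;
}

interface LongRunningOperation<T> {
  requestPath: string;
  requestMethod: string;
  sendInitialRequest: () => Promise<LroResponse<T>>;
  sendPollRequest: (path: string) => Promise<LroResponse<T>>;
}

type OperationArgs = Record<string, unknown>;

interface OperationSpecSketch {
  path: string;
  httpMethod: string;
}

type SendOperation<T> = (
  args: OperationArgs,
  spec: OperationSpecSketch
) => Promise<LroResponse<T>>;

class LroImplSketch<T> implements LongRunningOperation<T> {
  public readonly requestPath: string;
  public readonly requestMethod: string;
  private readonly sendOperation: SendOperation<T>;
  private readonly args: OperationArgs;
  private readonly spec: OperationSpecSketch;

  constructor(
    sendOperation: SendOperation<T>,
    args: OperationArgs,
    spec: OperationSpecSketch
  ) {
    this.sendOperation = sendOperation;
    this.args = args;
    this.spec = spec;
    this.requestPath = spec.path;
    this.requestMethod = spec.httpMethod;
  }

  // Starts the operation using the original spec.
  public async sendInitialRequest(): Promise<LroResponse<T>> {
    return this.sendOperation(this.args, this.spec);
  }

  // Polls with a GET against the given path, reusing the rest of the spec.
  public async sendPollRequest(path: string): Promise<LroResponse<T>> {
    return this.sendOperation(this.args, {
      ...this.spec,
      path,
      httpMethod: "GET"
    });
  }
}
```

The only client-specific part is the sendOperation function passed to the constructor, which is what lets the same shape serve different core packages.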

Usage examples

Create an object of LroImpl

const directSendOperation = async (
  args: OperationArguments,
  spec: OperationSpec
): Promise<unknown> => {
  return this.client.sendOperationRequest(args, spec);
};
const sendOperation = async (
  args: OperationArguments,
  spec: OperationSpec
) => {
  let currentRawResponse: FullOperationResponse | undefined = undefined;
  const providedCallback = args.options?.onResponse;
  const callback: RawResponseCallback = (
    rawResponse: FullOperationResponse,
    flatResponse: unknown
  ) => {
    currentRawResponse = rawResponse;
    providedCallback?.(rawResponse, flatResponse);
  };
  const updatedArgs = {
    ...args,
    options: {
      ...args.options,
      onResponse: callback
    }
  };
  const flatResponse = await directSendOperation(updatedArgs, spec);
  return {
    flatResponse,
    rawResponse: {
      statusCode: currentRawResponse!.status,
      body: currentRawResponse!.parsedBody,
      headers: currentRawResponse!.headers.toJSON()
    }
  };
};

const lro = new LroImpl(
  sendOperation,
  { options }, // arguments are just the operation options
  spec
);

Using LroEngine

const pollerEngine = new LroEngine(lro, { intervalInMs: 2000 }); // lro was instantiated in the previous section
const result = await pollerEngine.pollUntilDone(); // pollUntilDone returns a promise

Testing

We have an extensive test suite for LROs in the TypeScript code generator repo. I both added those tests here and re-implemented the LRO routes in the Autorest test server. For this to work, I created a fairly low-level instantiation of LongRunningOperation<T> with just @azure/core-rest-pipeline.

@ghost ghost added the Azure.Core label Jun 24, 2021
@deyaaeldeen deyaaeldeen requested a review from xirzec June 24, 2021 16:27
Member

@xirzec xirzec left a comment

Great work on this!

Overall, I think this is a good strategy, though I am curious how many packages are using LROs and core-http still - will some of this abstraction cause unnecessary complexity once we're migrated fully off of core-http?

Most of my comments are clarifying or style nits.

@deyaaeldeen deyaaeldeen requested a review from xirzec June 25, 2021 00:34
@deyaaeldeen
Member Author

@xirzec

though I am curious how many packages are using LROs and core-http still

AFAIK, none. core-http is used in Track 2 packages and we write LROs there by hand. Hopefully the recent improvements to auto-generated LROs will change this.

will some of this abstraction cause unnecessary complexity once we're migrated fully off of core-http?

We already see the abstraction in action for core-client and core-http in Azure/autorest.typescript#1043, and I do not think there is any complexity due to core-http or its phasing out. In fact, the slight complexity that exists comes instead from dancing around the callback mechanism in core-client (e.g. the need for the initializeState param to sendInitialRequest that returns whether the LRO has finished, so that the customer-provided callback can be called at this point).

@deyaaeldeen deyaaeldeen marked this pull request as ready for review June 30, 2021 03:01
@Azure Azure deleted a comment from check-enforcer bot Jul 1, 2021
@sadasant
Contributor

sadasant commented Jul 6, 2021

Please don’t merge this until @xirzec , @chradek or @bterlson have approved this.

@deyaaeldeen
Member Author

deyaaeldeen commented Jul 10, 2021

@bterlson and @xirzec I simplified the design a lot in 3fb9120. This time the client code has to implement only two methods instead of three, sendInitialRequest and sendPollRequest; I ditched retrieveAzureAsyncResource by replacing it with sendPollRequest. Furthermore, sendPollRequest was simplified to take the request path and an isDone predicate as input. The isDone predicate is useful for @azure/core-client clients, as explained in the design document. I removed the LroMode parameter after I realized I could build a sendPollRequest without it and still pass all the tests. You can see the design in action in code gen here: Azure/autorest.typescript#1099.

This change dramatically decreases the public surface because many types and helper functions are no longer needed: 3fb9120#diff-6f92516f198f704d6b14b9017fa3f9acdb4d30dc2753cc1fcb51eee9704f30dc. Please give it another look.

Member

@xirzec xirzec left a comment

I'm very excited about this PR! There are just a couple things that I would like to get some clarity on before signing off.

@deyaaeldeen deyaaeldeen requested a review from xirzec July 12, 2021 22:14
@deyaaeldeen deyaaeldeen requested a review from xirzec July 13, 2021 00:58
Member

@xirzec xirzec left a comment

I feel great about the new design! Nice work.

if (isUnexpectedPollingResponse(rawResponse) || failureStates.includes(state)) {
  throw new Error(`The long running operation has failed. The provisioning state: ${state}.`);
}
return successStates.includes(state);
Member

I'm a little curious about why failure is exceptional instead of being viewed as a final state, like success. Is failure never expected to happen normally, so this method is really just disambiguating between success and in progress?

Member Author

The code gen crew agreed on throwing if the operation reaches a failure state, and this function just implements that expectation. I think it is mostly because other languages started this way, so JS has to be consistent with them. Personally, I think this is not the best thing to do because all the customer gets is an exception message, which I think makes rehydration harder.
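
For readers without the diff at hand, the check under discussion can be sketched in isolation as follows. The identifiers isUnexpectedPollingResponse, failureStates, and successStates appear in the snippet above, but their definitions here are assumptions, as are the RawResponse shape, the isDone name, and the default state used when the body carries no status.

```typescript
// Assumed shape of a polling response; not the actual core-lro type.
interface RawResponse {
  statusCode: number;
  body?: { status?: string };
}

// Assumed state lists, compared case-insensitively below.
const successStates = ["succeeded"];
const failureStates = ["failed", "canceled", "cancelled"];

// Assumption: any non-2xx status code is unexpected for a polling request.
function isUnexpectedPollingResponse(rawResponse: RawResponse): boolean {
  return rawResponse.statusCode < 200 || rawResponse.statusCode >= 300;
}

// Returns true on success and false while in progress; failure throws.
function isDone(rawResponse: RawResponse): boolean {
  // Assumption: a missing status is treated as success.
  const state = (rawResponse.body?.status ?? "succeeded").toLowerCase();
  if (isUnexpectedPollingResponse(rawResponse) || failureStates.includes(state)) {
    throw new Error(
      `The long running operation has failed. The provisioning state: ${state}.`
    );
  }
  return successStates.includes(state);
}
```

With this shape the boolean only separates success from in progress, which is the behavior the question above is about.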

@deyaaeldeen deyaaeldeen merged commit a616f67 into Azure:main Jul 13, 2021
@deyaaeldeen deyaaeldeen deleted the lro-design branch July 13, 2021 22:40
4 participants