[API Proposal]: System.ClientModel long-running operation base type `OperationResult` #106197

annelo-msft · 2024-08-09T16:48:38Z

Background and motivation

The System.ClientModel library provides building blocks for .NET clients that call cloud services. This addition supports client APIs for long-running operations, i.e. service operations where the service returns a response before the requested operation has completed. For such operations, SCM-based clients return a subclient derived from SCM's abstract OperationResult class, which allows users to monitor the operation's status and obtain any resulting value.

The new OperationResult type is similar to the Operation and Operation<T> types in Azure.Core, but has fewer public APIs in ordre to support greater variation of implementation patterns in the third-party cloud service space. It serves as a base type for public operation subclients such as the CreateVectorStoreOperation type in the .NET OpenAI client library. It provides APIs to wait for the operation to complete processing on the service, and a rehydration token that can be to "rehydrate" an operation in progress, e.g. to obtain the value computed by a long-running operation from a different process than the one that started the operation. Client subtypes add public properties such as Value and Status as applicable to the operation implementation.

API Proposal

namespace System.ClientModel.Primitives
{
    public abstract partial class OperationResult : System.ClientModel.ClientResult
    {
        protected OperationResult(System.ClientModel.Primitives.PipelineResponse response) { }
        public abstract bool IsCompleted { get; protected set; }
        public abstract System.ClientModel.ContinuationToken? RehydrationToken { get; protected set; }
        public abstract void WaitForCompletion(System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken));
        public abstract System.Threading.Tasks.Task WaitForCompletionAsync(System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken));
    }
}

API Usage

Example usage of derived CreateVectorStoreOperation type in OpenAI client:

VectorStoreClient client = new VectorStoreClient(new OpenAIClientOptions());
CreateVectorStoreOperation operation = await client.CreateVectorStoreAsync(waitUntilCompleted: false);
await operation.WaitForCompletionAsync();
VectorStore vectorStore = operation.Value;

Alternative Designs

No response

Risks

No response

The text was updated successfully, but these errors were encountered:

stephentoub · 2024-08-09T17:42:15Z

Generally looks reasonably, but I'm wondering about the IsCompleted property. What's the use case for that, and does the value change after the result has been constructed? e.g. is that actually going to make networking requests to determine whether the operation has completed? If it doesn't, then I worry it'll be confusing that IsCompleted isn't up-to-date. And if it does, then I worry that there's expensive work happening behind a property, possibly expensive asynchronous work.

stephentoub · 2024-08-09T17:45:49Z

Also, for the usage example, what happens if instead of:

VectorStoreClient client = new VectorStoreClient(new OpenAIClientOptions());
CreateVectorStoreOperation operation = await client.CreateVectorStoreAsync(waitUntilCompleted: false);
await operation.WaitForCompletionAsync();
VectorStore vectorStore = operation.Value;

I end up writing:

VectorStoreClient client = new VectorStoreClient(new OpenAIClientOptions());
CreateVectorStoreOperation operation = await client.CreateVectorStoreAsync(waitUntilCompleted: false);
VectorStore vectorStore = operation.Value;

which is what I'd naturally be inclined to do. Does Value throw an exception if it's not yet completed? Does it block until the operation completes? I assume waitUntilCompleted defaults to true in general, so by default developers don't need to do the wait explicitly and the "right things" just happen?

terrajobst · 2024-08-09T18:03:18Z

How does the operation know that it's completed? Is it push based (i.e. does the server tell the client "I'm done") or is it polling based?

annelo-msft · 2024-08-09T22:31:20Z

I'm wondering about the IsCompleted property. What's the use case for that, and does the value change after the result has been constructed?

Yes, the value of IsCompleted will typically change after the result has been constructed. The property itself doesn't make any service calls (we avoid this from properties because of the difficulty making async calls there), but if a caller passes waitUntilCompleted: false, the intended use of the returned operation subclient is to call the WaitForCompletion method prior to accessing a Value property, if the intention is to access the final output value from the operation. To answer @terrajobst's question, many LROs are polling-based, so WaitForCompletion will poll a service endpoint to obtain the operation status until the operation has completed, and then it will set IsCompleted to true before returning.

IsCompleted is available to check whether the operation has completed, e.g. in a rehydration scenario. Another use case for it is that, in cases where the public subtype adds an UpdateStatus method (or similar), a user can implement their own polling loop (e.g. to use a custom polling interval) by doing:

while (!operation.IsCompleted)
{
    // delay custom interval

    operation.UpdateStatus();    
}

Does Value throw an exception if it's not yet completed? Does it block until the operation completes?

It does not. Value should be a nullable property when it appears on an OperationResult subtype, so if the service operation doesn't provide a response containing a value until the operation has completed, a correct implementation would return null from the property until IsCompleted is true. Some services return responses containing values that represent intermediate progress -- OpenAI does this, for example -- and for these, Value is populated with the intermediate result because it may contain information of value to the end user.

I assume waitUntilCompleted defaults to true in general, so by default developers don't need to do the wait explicitly and the "right things" just happen?

It actually doesn't. We made the design decision in Azure clients to require a user to pass a value for this parameter because it was hard to pick a default that would work across all LRO cases, and we are planning to follow the same design for SCM-based clients. For example, some LROs can take up to 24 hours to complete, and we didn't want user code to block for that long without explicit opt-in to that. Other LROs can return in a few seconds, so we also didn't want to force callers to write extra lines of code to WaitForCompletion in those cases. At the end of the day, we felt it was better to require the end-user to pass a parameter opting-in to their preference, since they should have a better sense of the characteristics of a given operation, and would need to write different code subsequent to the start-operation call depending on the value they passed. (As a side-note, in Azure.Core, we have a separate WaitUntil enum we use for this parameter, but we felt we didn't need an extra type for it in SCM.)

stephentoub · 2024-08-10T12:03:29Z

Thanks. IsCompleted is still troublesome to me. I'd expect to be able to poll it in a loop like:

while (!IsCompleted)
{
   DoOtherThings();
}

but it sounds like that'll likely hang because I'm not calling am explicit update that doesn't exist on the base where IsCompleted is. Does IsCompleted need to exist on the base? Or should it be a method that also has an async counterpart that can actually poll the service if not completed?

terrajobst · 2024-08-12T23:28:46Z

@stephentoub

Or should it be a method that also has an async counterpart that can actually poll the service if not completed?

So you're proposing something like this:

while (!(await client.IsCompletedAsync())
{
    DoOtherThings();
}

That feels awkward. I'd probably invert the Boolean to avoid the parentheses:

while (await client.IsPendingAync())
{
    DoOtherThings();
}

But I guess if your primary concern is that it's easy to write the sync loop and dead lock, then maybe we should detect that pattern in an analyzer.

annelo-msft · 2024-08-12T23:59:03Z

I agree that we may not want to make IsCompleted a method.

@stephentoub, is your concern primarily that the IsCompleted property doesn't make sense without having an UpdateStatus method to set it on the base type? If so, would it be addressed by adding UpdateStatus to the base type? That feels like a reasonable addition, and lets us layer WaitForCompletion over UpdateStatus as we do today in Azure.Core.

stephentoub · 2024-08-13T20:01:50Z

is your concern primarily that the IsCompleted property doesn't make sense without having an UpdateStatus method to set it on the base type?

That's a key part of it. IsCompleted can't be anything other than a snapshot answer unless we're going to do sync-over-async (or there's some kind of polling happening asynchronously in the background that's updating it), but that's not obvious to a consumer, to whom it looks like this should always be up-to-date, just like Task. In fact, I'd guess that Task will be the thing folks use to reason about this, but that mental model is wrong. So as proposed a0 it's not clear to me you can do anything meaningful with IsCompleted (if you were checking it to know whether you need to call Wait, you could just call Wait), b) it seems like it's a pit of failure for someone who thinks it's always up-to-date, and c) without a corresponding way to update it on the abstraction it seems like a leaky abstraction.

If so, would it be addressed by adding UpdateStatus to the base type?

Either moving IsCompleted to the derived types or moving UpdateStatus to the base type. If we believe that IsCompleted+UpdateStatus+Wait are sufficient for the 99% use cases, then moving UpdateStatus down seems reasonable, along with thoroughly documenting IsCompleted as being a snapshot answer that's only ever updated by a call to UpdateStatus/Wait. I'd also wonder at that point why IsCompleted needs to be abstract, rather than just having a protected setter; I think that also helps a bit to communicate that there's no meaningful work being done in it, whereas having it be abstract might hint that it's actually being implemented by the derived type to call out to the service. The other approach would be fine, too; if you need the derived type to be successful in performing updates in many cases, then just move IsCompleted to that derived type as well.

annelo-msft added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Aug 9, 2024

dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Aug 9, 2024

dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Aug 9, 2024

teo-tsirpanis added area-System.Net and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Aug 11, 2024

This comment was marked as off-topic.

Sign in to view

terrajobst added area-System.ClientModel and removed area-System.Net untriaged New issue has not been triaged by the area owner labels Aug 12, 2024

This comment was marked as off-topic.

Sign in to view

terrajobst mentioned this issue Aug 12, 2024

Add area ownership for System.ClientModel #106281

Merged

annelo-msft mentioned this issue Aug 13, 2024

Add abstract UpdateStatus methods to OperationResult Azure/azure-sdk-for-net#45475

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[API Proposal]: System.ClientModel long-running operation base type `OperationResult` #106197

[API Proposal]: System.ClientModel long-running operation base type `OperationResult` #106197

annelo-msft commented Aug 9, 2024

stephentoub commented Aug 9, 2024

stephentoub commented Aug 9, 2024

terrajobst commented Aug 9, 2024

annelo-msft commented Aug 9, 2024

stephentoub commented Aug 10, 2024 •

edited by karelz

Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

terrajobst commented Aug 12, 2024 •

edited

Loading

annelo-msft commented Aug 12, 2024

stephentoub commented Aug 13, 2024

[API Proposal]: System.ClientModel long-running operation base type OperationResult #106197

[API Proposal]: System.ClientModel long-running operation base type OperationResult #106197

Comments

annelo-msft commented Aug 9, 2024

Background and motivation

API Proposal

API Usage

Alternative Designs

Risks

stephentoub commented Aug 9, 2024

stephentoub commented Aug 9, 2024

terrajobst commented Aug 9, 2024

annelo-msft commented Aug 9, 2024

stephentoub commented Aug 10, 2024 • edited by karelz Loading

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

terrajobst commented Aug 12, 2024 • edited Loading

annelo-msft commented Aug 12, 2024

stephentoub commented Aug 13, 2024

[API Proposal]: System.ClientModel long-running operation base type `OperationResult` #106197

[API Proposal]: System.ClientModel long-running operation base type `OperationResult` #106197

stephentoub commented Aug 10, 2024 •

edited by karelz

Loading

terrajobst commented Aug 12, 2024 •

edited

Loading