Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add text around what brokers should do upon failures #354

Merged
merged 1 commit into from
Jan 30, 2018

Conversation

duglin
Copy link
Contributor

@duglin duglin commented Oct 27, 2017

Provide more guidance for what should happen when things go wrong
so we can ensure a more consistent semantics and interop.

This should only be reviewed after #353 - so only look at the 2nd commit.

Signed-off-by: Doug Davis dug@us.ibm.com

@cfdreddbot
Copy link

Hey duglin!

Thanks for submitting this pull request! I'm here to inform the recipients of the pull request that you and the commit authors have already signed the CLA.

@duglin
Copy link
Contributor Author

duglin commented Nov 7, 2017

On hold until we resolve #353

spec.md Outdated
@@ -718,6 +718,12 @@ For success responses, the following fields are valid.
}
```

If the successful response includes a `state` of `failed` then the Platform
MUST send a deprovision request to the Service Broker to prevent an orphan
being created on the Service Broker. However, while the platform will attempt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd capitalise Platform (multiple occurences in this PR) now to save conflicts when #356 is merged

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

spec.md Outdated
binding will remain in the marketplace database. Service brokers can include a
user-facing message in the `description` field; for details see [Service
Broker Errors](#service-broker-errors).
binding MUST remain in the marketplace database.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the first time we've used "marketplace database"? Can we rephrase this to something like:

Responses with any other status code will be interpreted as a failure and the Platform MUST continue to remember the Service Binding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not the only time we talk about DBs for the marketplace. Let me see if I can sneak in that change....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, let me do that in a new PR since its more than just a one word change and might require some thinking by the reviewers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See: #362

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! :)

@mattmcneeney
Copy link
Contributor

mattmcneeney commented Nov 7, 2017

LGTM

Approved with PullApprove

@angarg12
Copy link
Contributor

angarg12 commented Nov 7, 2017

@duglin this PR needs rebase now. I will give my LG after it :)

@duglin
Copy link
Contributor Author

duglin commented Nov 8, 2017

rebased

@angarg12
Copy link
Contributor

angarg12 commented Nov 8, 2017

LGTM

Approved with PullApprove

@duglin
Copy link
Contributor Author

duglin commented Nov 14, 2017

I know there was some concern about this text:

However, while the Platform will attempt
to send a deprovision request, Service Brokers MAY automatically delete
any resources associated with the failed bind request on their own.

so, if those concerns are still there would using this text (someplace in the spec) help?

When the creation of a resource fails, or when a resource is deleted, the Service Broker
SHOULD eventually clean up all artifacts related to that resource. The exact timing of when
that occurs is an implementation detail. However, from the perspective of the OSB APIs
the clean up SHOULD be immediate.

IMO this leaves it open for the resource to live on for a while if the broker has a good reason
too - but per the OSB API it really needs to appear like it was deleted, otherwise the API is
just lying to the platform.

Thoughts?

@mattmcneeney
Copy link
Contributor

Regarding this text:

However, while the Platform will attempt
to send a deprovision request, Service Brokers MAY automatically delete
any resources associated with the failed bind request on their own.

As a platform author, my interpretation of that is:

  • I should send a deprovision request when something fails
  • But in reality, that deprovision request may well fail as the broker may have already done some cleanup
  • I'm fairly confident that this means that the cleanup has completed, but if that deprovision request I make fails, I should probably check that it failed because the cleanup has already happened, not because something has gone wrong

@duglin
Copy link
Contributor Author

duglin commented Nov 14, 2017

I'm ok with that interpretation :-)
I would think most platforms would treat a 404/410 from a DELETE as a good thing and move on.

@duglin duglin force-pushed the cleanOnFail branch 2 times, most recently from c9b8178 to 53193b6 Compare November 21, 2017 14:22
spec.md Outdated
@@ -962,7 +966,12 @@ $ curl http://username:password@service-broker-url/v2/service_instances/:instanc
| 400 Bad Request | MUST be returned if the request is malformed or missing mandatory data. |
| 422 Unprocessable entity | MUST be returned if the requested change is not supported or if the request cannot currently be fulfilled due to the state of the Service Instance (e.g. Service Instance utilization is over the quota of the requested plan). Service Brokers SHOULD include a user-facing message in the body; for details see [Service Broker Errors](#service-broker-errors). Additionally, a `422 Unprocessable Entity` can also be returned if the Service Broker only supports asynchronous update for the requested plan and the request did not include `?accepts_incomplete=true`; in this case the expected response body is: `{ "error": "AsyncRequired", "description": "This Service Plan requires client support for asynchronous service operations." }` (see [Service Broker Errors](#service-broker-errors). |

Responses with any other status code will be interpreted as a failure. Service Brokers can include a user-facing message in the `description` field; for details see [Service Broker Errors](#service-broker-errors).
Responses with any other status code will be interpreted as a failure and
the Service Broker MUST NOT apply any of the requested changes to the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the service broker deal with a long running, multi step update that fails near the end? It had to modify things to keep track of state in the migration, but was unable to complete the update. Do we require that the broker rolls all the steps back?

Perhaps this MUST NOT should be a SHOULD NOT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with SHOULD NOT is that it then becomes indeterminate and we can't know what state the broker left things. Which, sort of, means we can't use the instance upon failure and will be forced to delete it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a concern about the case where the broker returns a 500, as the resource could have been put into an unknown state. The same goes for timeout scenarios.

For 4xx I agree that we should state MUST NOT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per our last call, I made the MUST NOT linked to a 4xx status code. I'll open a new issue to discuss if we can say anything about 5xx's.

@n3wscott
Copy link
Contributor

n3wscott commented Jan 11, 2018

LGTM

Note this PR will conflict with #409.

Approved with PullApprove

@angarg12
Copy link
Contributor

Which one do you think will be easier to merge? This one last?

@duglin
Copy link
Contributor Author

duglin commented Jan 16, 2018

I'm ok with #409 going first since it has more edits

@duglin
Copy link
Contributor Author

duglin commented Jan 19, 2018

rebased

@angarg12
Copy link
Contributor

Travis is failing

tools/../spec.md - 726: should only have a single space after a period

@duglin the check is working as intended 😄

Provide more guidance for what should happen when things go wrong
so we can ensure a more consistent semantics and interop.

This should only be reviewed after openservicebrokerapi#353

Signed-off-by: Doug Davis <dug@us.ibm.com>
@duglin
Copy link
Contributor Author

duglin commented Jan 22, 2018

:-)
fixed

@duglin
Copy link
Contributor Author

duglin commented Jan 23, 2018

reviews needed

@mattmcneeney
Copy link
Contributor

mattmcneeney commented Jan 23, 2018

LGTM

Approved with PullApprove

@avade
Copy link
Contributor

avade commented Jan 24, 2018

Happy with these changes.

LGTM

Approved with PullApprove

@duglin
Copy link
Contributor Author

duglin commented Jan 25, 2018

2 more needed. ping @fmui @angarg12

@duglin
Copy link
Contributor Author

duglin commented Jan 29, 2018

2 more needed

@fmui
Copy link
Member

fmui commented Jan 30, 2018

LGTM

Approved with PullApprove

1 similar comment
@vaikas
Copy link
Contributor

vaikas commented Jan 30, 2018

LGTM

Approved with PullApprove

@vaikas vaikas merged commit 076f4d3 into openservicebrokerapi:master Jan 30, 2018
@duglin duglin deleted the cleanOnFail branch March 2, 2018 03:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants