Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Board Review: Service Bus #1884

Closed
ramya-rao-a opened this issue Oct 2, 2020 · 8 comments
Closed

Board Review: Service Bus #1884

ramya-rao-a opened this issue Oct 2, 2020 · 8 comments
Assignees
Labels
architecture board-review Request for an Architectural Board Review

Comments

@ramya-rao-a
Copy link
Contributor

ramya-rao-a commented Oct 2, 2020

Note: This is the request for API review for Service Bus before we go GA in November, 2020.
Previous architecture board discussions on the subject:

Discussions with the service team after the above meetings have brought in a few more design updates that we will review as well.

The Basics

Artifacts required (per language)

.Net

https://apiview.dev/Assemblies/Review/b5d7e3304000449c859c2016f718db52

Java

Python

TypeScript

Champion Scenarios

Those from previous discussions still apply.
See #1046

Agenda for the review

A board review is generally split into two parts, with additional meetings as required

Part 1 - Introducing the board to the service:

  • Review of the service (no more than 10 minutes).
  • Review of the champion scenarios.
  • Get feedback on the API patterns used in the champion scenarios.

After part 1, you may schedule additional meetings with architects to refine the API and work on implementation.

Part 2 - the "GA" meeting

  • Scheduled at least one week after the APIs have been uploaded for review.
  • Will go over controversial feedback from the line-by-line API review.
  • Exit meeting with concrete changes necessary to meet quality bar.

This request is for the Part 2 i.e. the "GA" meeting

@ramya-rao-a ramya-rao-a added architecture board-review Request for an Architectural Board Review labels Oct 2, 2020
@ramya-rao-a
Copy link
Contributor Author

@lilyjma Please schedule this meeting in 2 parts.

  • First to be an hour long on Oct 7, 11am - 12pm PST in order to accommodate the Germany time zone. We will focus on the Java API here. The service team has already reviewed the .Net one.
  • The other one should be 2 hours long to accommodate the other 3 languages. As discussed offline this can be in the second week of October

@lilyjma
Copy link
Contributor

lilyjma commented Oct 2, 2020

scheduled for 10/7 11am-12pm PST and 10/15 2-4pm PST

@ramya-rao-a
Copy link
Contributor Author

ramya-rao-a commented Oct 7, 2020

Notes from the Java review:

  • Jonathan:
    • API View does not have BinaryData for the message body, this will be added soon
    • It is the first time we introduce the concept of sub-builder
  • Shankar says that in Track 1, if you try to send same message object again, the client throws.
    • Feature crew will follow up on this to understand the reason behind it and see if the same should apply to message batch as well
  • Jonathan on batch:
    • getCount is not Javaesque, it is not clear that this indicates the number of messages in the batch.
    • getMessageCount as suggested by Shankar is more preferable
    • isEmpty() would be useful
  • Jonathan on sender
    • When getting started, the TransactionQueueOrTopicName conflates with the queueName and topicName setters.
      • This setter will soon be deprecated. Ramya to sync with Shankar on timings and see if there we can potentially remove this from Track 2
    • sendMessages(): batch vs iterable: why not have a convenience that get rids of one of the two
      • Ramya to share investigations done for Event Hubs in the same area

Receiver async sample in migration guide:

  • Jonathan: Can be simpler, why flatmap?
  • Shankar: will "subscribe" be confused with "subscription"?
  • Shankar: "receiveMessages()" does not really receive messages, but only adds message handlers, should we rename?
    • We expect that the fact that the method exists on async client and that async clients use reactor should be enough to resolve the confusion here
  • When does the stream end?
    • Hemant: After we exhaust the retry attempts as per the retry policy set by the user.
    • Ramya: User code error, message lock lost error should not stop the stream. All retryable errors should result in following the retry policy set by the user
    • Clemens: Nothing other than the receiver being closed should stop the stream, regardless of the retry policy set by the user. This is what Track 1 does as well
    • Ramya: Feature crew will take another look at what Track 1 does here and see how we can reconcile with this across all the 4 languages.
    • Shankar: Does stopping the subscription gracefully close things? On stopping, all in flight messages should be processable and settlable by the user, no new messages should be received.
    • Shankar: We should not allow calling receiveMessages() when another call is in progress.

Receiver sync sample in migration guide:

  • Clemens:

    • Feature in Track 1 which is most important is that you dont have to write receive loops because we gave them registerMessageHandler(). The loop mechanism is actually in the advanced class MessageReceiver so that people first find the registerMessageHandler on the QueueClient instead.
    • Non reactor users should not be forced to having loops as the only way to receive messages. They should have a callback option as well. Jonathan agrees. Need to follow up.
  • Srikanta: Need to check event hubs, errors thrown by user code should be surfaced in error handler

@lilyjma
Copy link
Contributor

lilyjma commented Oct 7, 2020

Java

Video [MS internal only)

@lilyjma
Copy link
Contributor

lilyjma commented Oct 15, 2020

.Net, Python, TS:

Video - .Net[MS internal only]

Video - Python, TS[MS internal only]

@ramya-rao-a
Copy link
Contributor Author

Notes for .Net API review on 10/15

Samples in Readme:

  • TryAdd() returns boolean, but sample does not make use of it. Have sample that would show how we expect users to use it. For example, assume an input array of messages and loop through it to safely batch and send
  • Showcase the batch method first as we want to encourage people using this over the other overload to ensure safe batching. Consider having a line telling the users of the availability of the other overload instead of a code snippet for it
  • The first few code snippets show the use of receivers to receive messages. If the processor is meant to be the getting started scenario, are we shooting ourselves in the foot by not showing that first?
  • Processor: Any alternatives for tsc.SetResult? Consider simplifying the sample or show how one would use it in a real application
  • Processor: For it to be used to continuously receive messages, it seems like we need a loop on the outside. Doesnt this defeat the purpose of the processor handling the looping for the user? Does this mean that starting the processor is not fire and forget?

API view

  • CloseAsync() on client: If we were to add this, it would do the exact same thing as DisposeAsync(). Since this can be added later without breaking changes, we will see if there is any feedback on the confusion around it missing.
  • IsClosed: Should this reflect anything other than user explicitly closing/disposing the client/sender/receiver/processor? How do we expect the user to use this? What is the best practice around it?
  • Auto Complete: This is not just about auto complete on success, but auto abandon on failure. Shouldnt the name reflect that? AutoResolveMessages, AutoSettleMessages
  • MaxConcurrentCalls: Why is the default 1?
  • On Auto complete/abandon failure, does the user need to know which message did it fail for?
  • Naming guidelines for options passed to methods: If the method is doing a service operation, then the guideline of using the method name + "Option" holds. If all we are doing is a create a client side construct, then not so much.
  • Custom text prepended to User Agent String: Follow up issue for SB & EH should be created. KC has concerns around if this solves anything for the user when they use multiple services and nest the usage. We will wait for resolution on this before adding the feature to SB & EH
  • JS & Java to ensure that the error handler gets namespace, entityPath and error source passed to it
  • Message suffix on the batch as a reaction to Ux studies where users were not clear that tryAdd should be what they should use. Since we are using the term "message" in all the apis on the sender and receiver, it is not a stretch to do the same for the batch related apis. JS & Java to follow suit. Python to consider which of the below they would like to take as they generally prefer terseness
    • TryAdd() vs TryAddMessage()
    • CreateBatchAsync() vs CreateMessageBatchAsync()
    • CreateBatchOptions vs CreateMessageBatchOptions
  • Regarding consistency between properties on CreateQueueOptions, we are not too keen on being consistent between the "Enable" and "Requires" prefixes as matching with what goes over the wire trumps the small win of being consistent with the prefixes. This would not hold true for cases where the name is way too bad in the first place, but we do not find any such instances for Service Bus at the moment.

@lilyjma
Copy link
Contributor

lilyjma commented Oct 16, 2020

Python and TS review scheduled for 10/22 2-4PM PST

@ramya-rao-a
Copy link
Contributor Author

Notes for Python API review:

Gist used for code snippets: https://gist.github.com/annatisch/506b98c3fcc4fc84ee3328d5a43fab1e

Recent changes to the API

  • The methods to get a receiver and session enabled receiver are unified, so is the resulting receiver class
  • The methods to settle messages and renew their locks are moved from the message to the receiver
  • Option to automatically renew locks (messages & sessions) is now available at the receiver level

Notes:

  • session_id takes NEXT_AVAILABLE_SESSION. This should be an object so that it is differentiable from a string which would mean that we are asking for a specific session
  • The workflow for get_queue_receiver() or get_subscription_receiver() is different when sessions are used. Since not able to get a lock on given session or timing out due to service having no unlocked sessions is normal use case, the error handling for these should be shown in samples
  • The need for end to end samples and extensive troubleshooting guide applies to all languages. Feature crew to follow up to ensure we are covered here
  • JS & Python do not have APIs for working with multiple sessions like .Net & Java do. They have the building blocks and a sample that shows how one can build a solution to work with multiple sessions
  • In case of sessions, the receiver.session.renew_lock() renews the lock on the sessions. The receiver.renew_lock(message) method was proposed to work the same way. This has push back from Richard, Harsha and Ramya as summarized below. Python team to revisit this decision
    • Having renew_lock(message) renew the session lock is misleading. Losing lock on message is a minor inconvenience as Service Bus is expected to provide a atleast once guarantee not atmost once. But losing on session is much more serious and requires users to recreate the receiver to attempt to get a lock on the session or decide to move on to the next session
    • Having renew_lock(message) renew the session lock can lead people to call it on every message which results in unnecessary renew locks on the session under the hood
  • While we now have auto lock renewal on the receiver level that will automatically renew the locks on the messages/session associated with the receiver, it is still a feature one needs to opt into. Other languages enable this by default. Python to revisit this offline
    • We have news from the service team that they will be moving away from the lock renewal being a client responsibility and will be enabling it by default. So, Python will be left with different defaults post this change
  • Python will stick to the get verb while .Net & JS will stick to the create verb for methods that return the senders and receivers. The main reason for create is to let users know that these method create a new link every time and they need to use them responsibly and doing the due clean up. The with model provides a good clean up story and this is not much of a concern in Python
  • getState() should return bytes, setState() can take the union of string and bytes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
architecture board-review Request for an Architectural Board Review
Projects
None yet
Development

No branches or pull requests

4 participants