Add SLR indicator header #65
Comments
@johnsimons please go ahead with this one |
@dannycohen - what is the associated issue in SP? |
This is currently a bug affecting existing functionality in SI (Particular/ServiceInsight#28). There are plans for a post-V1 SP indicator for SLR requests, but it's a low priority in the backlog. Why? |
I thought we came to the conclusion that we can't supply this information On Wednesday, November 6, 2013, Danny Cohen wrote:
Regards |
I was under the impression the number of SLRs is wiped, but that we can check whether or not SLRs occurred (i.e. a boolean parameter).
The fact that retries occur repeatedly may indicate the system is somewhat unstable. A growing trend of retries is similarly significant to Opie and team. |
Correct, we can provide a boolean parameter. On Wed, Nov 6, 2013 at 2:20 PM, Danny Cohen notifications@github.com wrote:
|
@dannycohen @andreasohlund - After a discussion with @johnsimons and @SimonCropp: rather than add a boolean header that says whether SLR was invoked, it makes more sense to add a header that provides the total number of retries that occurred. This information will be available on messages that were successfully processed, so by looking at the audit message and its number of retries we can identify problems. The count will be the net total (FLR + SLR) of retries that occurred. Will that satisfy the requirement? |
This will allow the users to sort by this column in descending order and see which messages get retried most :) |
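To illustrate the sorting idea, here is a minimal sketch in Python rather than NServiceBus code. It assumes audited messages are available as dictionaries with a `headers` mapping, and that the proposed count lives in a header called `TotalRetries`; both the shape and the header name are hypothetical, since the thread does not settle on either.

```python
# Hypothetical header name: the thread proposes a retry-count header
# but does not name it.
RETRY_COUNT_HEADER = "TotalRetries"


def retry_count(headers):
    """Return the net retry count (FLR + SLR) from a header mapping.

    A missing or unparsable header is treated as zero retries.
    """
    try:
        return int(headers.get(RETRY_COUNT_HEADER, 0))
    except (TypeError, ValueError):
        return 0


def most_retried(messages, top_n=None):
    """Sort audited messages by retry count, most retried first."""
    ranked = sorted(
        messages,
        key=lambda message: retry_count(message["headers"]),
        reverse=True,
    )
    return ranked if top_n is None else ranked[:top_n]


# Illustrative data only, not real audit records.
audited = [
    {"id": "a", "headers": {"TotalRetries": "2"}},
    {"id": "b", "headers": {}},
    {"id": "c", "headers": {"TotalRetries": "7"}},
]
```

Calling `most_retried(audited, top_n=1)` would surface the message that was retried most, which is the view described above.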
@indualagarsamy / @johnsimons / @andreasohlund -
Let me know. |
What we are proposing is for us to have a new header that will contain the number of times that message was retried. |
There is such a header, and it is actively removed in order to fix a bug (see Particular/ServiceInsight#20 (comment)). Are you saying we can un-remove it? |
No, I'm proposing a completely new header :) On 13 November 2013 18:38, Danny Cohen notifications@github.com wrote:
Regards |
@johnsimons - Whatever makes you happy... :-) |
@indualagarsamy / @andreasohlund / @johnsimons - guys, where are we on this ?
// CC @HEskandari |
This turned out to be nontrivial, can we bump this? On Fri, Dec 13, 2013 at 8:26 AM, Danny Cohen notifications@github.com wrote:
|
@indualagarsamy https://github.com/indualagarsamy can you add the details? On Fri, Dec 13, 2013 at 8:42 AM, Andreas Öhlund <
|
NP, but it would be good to record the diagnosis and the prognosis. |
Indu has the details. In short: we need the new pipeline hooks in NServiceBus 4.4.0 to pull this On Fri, Dec 13, 2013 at 8:48 AM, Danny Cohen notifications@github.com wrote:
|
I'm fine with that. |
What was the issue @indualagarsamy ? |
While trying to add the header, we realized that there is no easy way to count the FLRs as things stand now. |
What is wrong with adding it in https://github.com/Particular/NServiceBus/blob/develop/src/NServiceBus.Core/Unicast/Transport/FirstLevelRetries.cs#L49 ? |
Assigning to backlog, pending SC / SP roadmap alignment |
It can be reopened if we decide to do it |
There is another stakeholder here. As a salesperson for Particular Software, when talking to other departments at our larger customers, I would like to be able to tell them how many messages NServiceBus kept from failing by handling their failures automatically, without any manual intervention, thus clarifying the value proposition versus the team using HTTP or rolling their own solution on top of a queue. |
@udidahan - how would you prioritize it ? (post LM3, of course) |
That's a question to be answered via the Trello board, taking into account all the other things we have for LM4. |
I think the way to go here is that, instead of a header, we should create a plugin that hooks into the notifications: http://docs.particular.net/nservicebus/errors/subscribing-to-push-based-error-notifications. This way we can let the users set a threshold for FLR/SLR per second and have the plugin notify SC when it is exceeded. Arguably this could be implemented and distributed as a custom check? We could start by adding this as a sample to gauge interest? |
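A rough sketch of the threshold idea, written in Python for illustration only. It does not use the NServiceBus notifications API or the ServiceControl custom check API; the `RetryRateMonitor` class, its method names, and the sliding-window approach are all assumptions made for this example. The caller passes timestamps in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class RetryRateMonitor:
    """Track retry events in a sliding time window (hypothetical helper)."""

    def __init__(self, threshold_per_second, window_seconds=10.0):
        self.threshold = threshold_per_second
        self.window = window_seconds
        self._timestamps = deque()

    def _evict(self, now):
        # Drop retry events that have fallen out of the window.
        while self._timestamps and now - self._timestamps[0] > self.window:
            self._timestamps.popleft()

    def record_retry(self, now):
        """Register one FLR or SLR event at time `now` (in seconds)."""
        self._timestamps.append(now)
        self._evict(now)

    def rate(self, now):
        """Average retries per second over the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps) / self.window

    def should_alert(self, now):
        """True when the retry rate exceeds the configured threshold."""
        return self.rate(now) > self.threshold
```

In a real plugin, `record_retry` would presumably be called from a retry notification handler and `should_alert` from the periodic check that reports to SC; how those hooks are wired up is outside what this sketch covers.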
@Particular/servicecontrol-maintainers |
@johnsimons if you close it, how would you distinguish it from all the other issues that are closed "as fixed"? |
@SimonCropp no milestone associated |
@johnsimons so you regularly go through the 268 closed issues with no milestone as part of your backlog pruning? https://github.com/Particular/ServiceControl/issues?q=is%3Aissue+no%3Amilestone+is%3Aclosed |
Not sure what you are talking about @SimonCropp. |
I have a demo going that shows how to do this without a header and instead using a custom check to "alert" in SP when SLR rates go up. How about I get that recorded and we close this one as won't fix? |
Still agree with @SimonCropp that we shouldn't just close issues that we still might act on. Perhaps move it somewhere or add a bullet point to some more generic "crazy SC ideas" issue. How about we discuss this in plat dev, since this is relevant for all our repos? |
Agree. |
In my view, closing a feature request does not imply that we will not act on it; it just means that right now the focus is elsewhere. It helps us concentrate on issues/features that are currently important. Having this massive backlog of open issues IMO makes it difficult to figure out where we are going. The way I see it, open feature requests that we do not intend to work on (right now) are just cluttering the system and also sending the wrong message to users out there. The fact is that we have no idea when we are going to tackle any of these, or even if we are ever going to do it at all.
@andreasohlund that would be great, looking forward to it. |
As @andreasohlund said, let's have the discussion about whether or not we close issues in platform dev. |
Yeah, let's discuss it. I don't think we should close issues; a closed issue is a lost one, IMO. If a huge backlog is putting pressure on us, it's a tooling problem (and to be honest GitHub issues suck), since we have no way (other than the waffle boards, which are not that much better) to prioritize and look only at the top N issues. Also, given that we have no milestones set in stone, priorities may change over time, and it is very easy to lose track of issues that were closed because of lack of capacity or because they were out of scope at that time. |
@mauroservienti it is quite simple to keep track of feature requests that were closed. |
This is currently not prioritised to be done any time soon. I will close this one for now, and if we ever go down this path we can reopen it |
As Opie, I would like to be notified when a large or increasing number of SLRs is performed, so I can investigate whether this indicates system instability (and take corrective action before it deteriorates further).
Implementation proposal:
SecondLevelRetriesPerformed : true|{false}
Note:
In order to conserve header space, avoid sending the header when SLRs were not performed (i.e. the very existence of the header is an indication of it being true; the value of false is optional).
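The presence-based semantics in the note can be sketched as follows. This is Python used purely for illustration and is not NServiceBus code; headers are modelled as a plain dictionary, and the helper names `slr_performed` and `stamp_slr_header` are hypothetical. Only the header name `SecondLevelRetriesPerformed` comes from the proposal above.

```python
SLR_HEADER = "SecondLevelRetriesPerformed"


def slr_performed(headers):
    """Interpret the header: absent means false, present means true
    unless the value is explicitly "false"."""
    if SLR_HEADER not in headers:
        return False
    return str(headers[SLR_HEADER]).strip().lower() != "false"


def stamp_slr_header(headers, performed):
    """Return a copy of the headers, adding the SLR header only when needed.

    Omitting the header when no SLR happened is what conserves header space.
    """
    stamped = dict(headers)
    if performed:
        stamped[SLR_HEADER] = "true"
    else:
        stamped.pop(SLR_HEADER, None)
    return stamped
```

A consumer written this way treats a missing header and an explicit `false` identically, so senders are free to omit the header in the common case.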