JoinNode ignores Delete BarrierNode messages. #2436
Comments
Are you using the |
@docmerlin, I am using the .delete(TRUE). However, the join() cardinality keeps increasing nonetheless |
From my testing, it looks like join() is ignoring barrier messages? |
@m4ce |
|
I even had to add a barrier after join() in order to reduce the cardinality on the other nodes. I can see the cardinality decreasing before and after join. But the join itself keeps increasing! |
Ok, I've been able to confirm that join node is ignoring barrier node delete messages. |
The NewMultiConsumerWithStats isn't handling the DeleteGroupMessage. |
A tentative plan is to see whether it is easier to add DeleteGroupMessage handling to MultiConsumerWithStats or to create a NewMultiConsumerWithStatsAndDelete that can handle the DeleteGroupMessage. |
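To illustrate the idea discussed above, here is a minimal sketch in Go of a per-group consumer that drops a group's state when it receives a delete message. This is not Kapacitor's actual code: `GroupID`, `PointMessage`, `DeleteGroupMessage` (as defined here), `GroupedConsumer`, and `Cardinality` are all hypothetical stand-ins invented for the example, and the real `MultiConsumerWithStats` may be structured quite differently.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins, not Kapacitor's real types.
type GroupID string

// Message is anything that belongs to a group.
type Message interface{ Group() GroupID }

// PointMessage carries one data point for a group.
type PointMessage struct {
	ID    GroupID
	Value float64
}

func (p PointMessage) Group() GroupID { return p.ID }

// DeleteGroupMessage signals that a group has expired.
type DeleteGroupMessage struct{ ID GroupID }

func (d DeleteGroupMessage) Group() GroupID { return d.ID }

// groupState is whatever a node keeps per group.
type groupState struct{ points []float64 }

// GroupedConsumer tracks state per group and honors delete messages.
type GroupedConsumer struct {
	groups map[GroupID]*groupState
}

func NewGroupedConsumer() *GroupedConsumer {
	return &GroupedConsumer{groups: make(map[GroupID]*groupState)}
}

// Handle routes a message by type. Without the DeleteGroupMessage case,
// the map would only ever grow, which is the behavior reported here.
func (c *GroupedConsumer) Handle(m Message) {
	switch msg := m.(type) {
	case PointMessage:
		st, ok := c.groups[msg.ID]
		if !ok {
			st = &groupState{}
			c.groups[msg.ID] = st
		}
		st.points = append(st.points, msg.Value)
	case DeleteGroupMessage:
		// Drop all state for the expired group so cardinality can shrink.
		delete(c.groups, msg.ID)
	}
}

// Cardinality is the number of groups currently tracked.
func (c *GroupedConsumer) Cardinality() int { return len(c.groups) }

func main() {
	c := NewGroupedConsumer()
	c.Handle(PointMessage{ID: "host=a", Value: 1})
	c.Handle(PointMessage{ID: "host=b", Value: 2})
	fmt.Println(c.Cardinality()) // 2
	c.Handle(DeleteGroupMessage{ID: "host=a"})
	fmt.Println(c.Cardinality()) // 1
}
```

The point of the sketch is only that the delete message has to reach the code that owns the per-group map; whether that is done by extending the existing consumer or by adding a new one is the open question in the comment above.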
Tentatively adding this to the 1.5.8 milestone, because OOMS. |
@docmerlin, we just had another OOM due to this. Do you think you guys will be able to get it in for this or the next release? Thanks. |
I'm also impacted by this. It looks like the 1.5.8 milestone was removed from this. Any idea when this might be resolved, if not in 1.5.8? |
@docmerlin, we are experiencing frequent OOM due to this. Can you pls let us know the next milestone this will be added to? |
This definitely will be in 1.5.9. Sorry, but this wasn't quite done and we
needed 1.5.8 out to fix problems for other customers.
|
Looks like it's not planned for 1.5.9? Will it manage to make it in 1.6.0..? |
@m4ce We found an honest-to-goodness, old-fashioned memory leak in the join node, so we fixed that instead. That will likely fix most people's problems. |
But didn't you say that the join node ignores barrier delete node messages? How is that going to fix this then? |
@m4ce said:
The old join node kept a copy of every point that passed through it because of a memory leak, even points it was done joining. The fix removes the leak so memory usage should be vastly improved. |
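For readers unfamiliar with this kind of leak, the sketch below shows, under simplifying assumptions, a join buffer that releases each point as soon as it has been paired. It is not the Kapacitor join node: `point`, `joined`, `pairBuffer`, and its methods are hypothetical names, and the example joins exactly two streams on an exact timestamp match with no tolerance.

```go
package main

import "fmt"

// point and joined are hypothetical types for this illustration only.
type point struct {
	Time  int64
	Value float64
}

type joined struct {
	Time        int64
	Left, Right float64
}

// pairBuffer holds points that are still waiting for their counterpart.
type pairBuffer struct {
	left, right map[int64]point
}

func newPairBuffer() *pairBuffer {
	return &pairBuffer{left: map[int64]point{}, right: map[int64]point{}}
}

// AddLeft joins p with a waiting right-hand point if one exists.
// The matched point is deleted from the buffer; keeping it around
// after the join is the kind of leak described above.
func (b *pairBuffer) AddLeft(p point) (joined, bool) {
	if r, ok := b.right[p.Time]; ok {
		delete(b.right, p.Time)
		return joined{Time: p.Time, Left: p.Value, Right: r.Value}, true
	}
	b.left[p.Time] = p
	return joined{}, false
}

// AddRight mirrors AddLeft for the other stream.
func (b *pairBuffer) AddRight(p point) (joined, bool) {
	if l, ok := b.left[p.Time]; ok {
		delete(b.left, p.Time)
		return joined{Time: p.Time, Left: l.Value, Right: p.Value}, true
	}
	b.right[p.Time] = p
	return joined{}, false
}

// Pending is the number of points still buffered.
func (b *pairBuffer) Pending() int { return len(b.left) + len(b.right) }

func main() {
	b := newPairBuffer()
	b.AddLeft(point{Time: 1, Value: 10})
	fmt.Println(b.Pending()) // 1
	j, ok := b.AddRight(point{Time: 1, Value: 20})
	fmt.Println(j, ok, b.Pending()) // {1 10 20} true 0
}
```

Note that a buffer like this still holds unmatched points indefinitely, so freeing joined points does not by itself shrink the set of groups a node tracks.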
@docmerlin - Okay, let's see how the new release performs before closing this issue. |
@docmerlin this still persists in Just curious, earlier mentioned "old fashioned memory leak" planned for which milestone? Thanks. |
@docmerlin - the issue here is that we have a high cardinality in our join node which will not decrease unless the barrier messages are supported so that groups can be expired. I am sure the bugs you fixed will help in several cases but not here I am afraid. Unless the barrier messages are handled, this won't solve the issue outlined in this ticket. |
@docmerlin any updates pls? |
@vj-gsr sorry man, I am currently working on other kapacitor tasks. I can say that this is about 80% done, however. I'm trying really hard to squeeze it into 1.6.0, but can't promise anything yet. Edit: I've been told we are holding 1.6.0 till we can get it in, to make sure we get it in. |
OK many thanks @docmerlin, any tentative date for |
@docmerlin did you get a chance to work on this? any update pls? |
tickscript was working with 1.5.9 - trying 1.6.0-rc2, failing. It seems there's something wrong in handling barrier messages in the join node? |
I actually see issues also with tickscripts that didn't use the join node. I effectively see the problem with any tickscript that uses the barrier node. I think we have a regression here. |
@m4ce @docmerlin I have a fix for
PR #2585 takes care of it. I don't see the error message with this fix, and I noticed no memory leaks either. |
@docmerlin can you pls confirm this has been sorted in |
@docmerlin the memory leak is still present in |
I'm going to go ahead and close this issue, as it was closed by #2562. @vj-gsr, I believe your leak is coming from a different source than this issue. |
After some testing, I found that the JoinNode cardinality doesn't decrease when a BarrierMessage is emitted for a group that should expire. This causes the JoinNode's cardinality to increase forever, leading to a memory leak.