Memory leak in broker module webui #128

Closed
andyxning opened this issue Apr 28, 2015 · 6 comments

Comments

@andyxning

We use WebUI as the frontend of Shinken. After about a month of running, we found that the WebUI process was holding almost 30 GB of memory.

We did not restart the Broker daemon in the meantime; we only reloaded the arbiter when the configuration changed. Any thoughts on what could cause this?

I have raised the same question in shinken 1597.

@mohierf
Contributor

mohierf commented May 7, 2015

I hit the same problem this morning! I can confirm that the master branch WebUI seems to have heavy memory leaks. I am still monitoring it in production: more than 19 GB of memory within a few minutes for 20k services.

I will run a test with the BS3 branch in the same environment to check.

@andyxning
Author

As far as I can tell, the notificationways are duplicated on each Shinken reload.
I used WebUI to log the number of items in each field of a regenerator instance, and found that the notificationways are duplicated between reloads. So if we have many contacts (3w, i.e. about 30,000, in our case), we can end up with as many notificationways as contacts, and the memory adds up.
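
For anyone who wants to repeat that measurement, here is a minimal sketch of the idea: count the items in each collection attribute before and after a reload and report what grew. `count_fields` and `FakeRegenerator` are hypothetical names used only for illustration; they are not part of Shinken or WebUI, and the "reload" is simulated by hand.

```python
def count_fields(obj, names):
    """Return {attribute name: item count} for attributes that support len()."""
    counts = {}
    for name in names:
        value = getattr(obj, name, None)
        try:
            counts[name] = len(value)
        except TypeError:
            # Attribute is missing or is not a sized collection: skip it.
            continue
    return counts


class FakeRegenerator(object):
    """Hypothetical stand-in for a regenerator instance."""

    def __init__(self):
        self.hosts = ["h1"]
        self.contacts = ["c1", "c2"]
        self.notificationways = ["nw1", "nw2"]


fields = ["hosts", "contacts", "notificationways"]
regen = FakeRegenerator()
before = count_fields(regen, fields)

# Simulate a reload that duplicates notificationways instead of reusing them.
regen.notificationways.extend(["nw1", "nw2"])
after = count_fields(regen, fields)

# Keep only the fields whose size changed between the two snapshots.
growth = dict((k, after[k] - before[k]) for k in after if after[k] != before[k])
print(growth)
```

Any field that keeps showing up in `growth` across reloads while the configuration is unchanged is a candidate for the leak.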

@gst

gst commented May 9, 2015

> As far as I can tell, the notificationways are duplicated on each Shinken reload.

Ah! That would explain why I don't always (or ever) manage to reproduce such issues: I often use a relatively simple config, and if it contains no notificationways then I simply wouldn't hit the "bug".

> I used WebUI to log the number of items in each field of a regenerator instance, and found that the notificationways are duplicated between reloads. So if we have many contacts (3w, actually), we can end up with as many notificationways as contacts, and the memory adds up.

So thanks for the info/tip; I'll retry the investigation with notificationways enabled :)

@andyxning
Author

I can now tell that something goes wrong when contacts and notificationways are processed in regenerator.py.

We only clear hosts and services when we reload the configuration or when a scheduler goes down and the spare one takes over its configuration; the contacts and notificationways are left out.

We also initialize new notificationways incorrectly: their properties are set only after the add_item call, which uses notificationway_name as the key to insert the new notificationway. At that point notificationway_name is still None, so all the notificationways overwrite each other under that key and only the last one survives.
As a result, self.notificationways.find_by_name(nwname) never finds anything, and we always add a new notificationway to self.notificationways.

The solution is simple: add a flag indicating whether the notificationway is new, then initialize it (the first time, after a Shinken service restart) or update the existing one. Only after that do we add it to the collection, and only if the flag says it is new. This makes sure that a new notificationway already carries its notificationway_name when it is added. The current code looks like this:
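
To make the failure mode concrete, here is a small toy model of it. This is a sketch under assumptions, not Shinken's real code: `ToyCollection` and `ToyNotificationWay` are hypothetical, and the `items` list is an assumption standing in for whatever keeps the created objects alive in the real process.

```python
class ToyNotificationWay(object):
    def __init__(self):
        # A freshly created object has no name yet.
        self.notificationway_name = None


class ToyCollection(object):
    """Hypothetical name-indexed container; not Shinken's real class."""

    def __init__(self):
        self.items = []    # every object ever added (assumed to stay alive)
        self.by_name = {}  # name -> object, keyed at insertion time

    def add_item(self, item):
        self.items.append(item)
        self.by_name[item.notificationway_name] = item

    def find_by_name(self, name):
        return self.by_name.get(name)


def reload_buggy(collection, names):
    for name in names:
        nw = collection.find_by_name(name)
        if not nw:
            nw = ToyNotificationWay()
            collection.add_item(nw)     # indexed under None
        nw.notificationway_name = name  # name is set too late


coll = ToyCollection()
for _ in range(3):
    reload_buggy(coll, ["email", "sms"])

print(len(coll.items))
print(list(coll.by_name.keys()))
```

In this model the name index only ever has the key None, so every lookup by a real name misses and each simulated reload adds another full set of objects.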

for cnw in nws:
    nwname = cnw.notificationway_name
    nw = self.notificationways.find_by_name(nwname)
    if not nw:
        safe_print("Creating notif way", nwname)
        nw = NotificationWay([])
        # Problem: add_item runs before notificationway_name is set below
        self.notificationways.add_item(nw)
    # Now update it
    for prop in NotificationWay.properties:
        if hasattr(cnw, prop):
            setattr(nw, prop, getattr(cnw, prop))
    new_notifways.append(nw)
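
A minimal sketch of the flag-based fix described above, again on toy classes rather than the real regenerator: the properties are copied first, and add_item is called only for new objects, once the name is set. `ToyCollection`, `ToyNotificationWay`, `make_incoming`, and `update_notificationways` are hypothetical names, and the `properties` tuple here is just an example.

```python
class ToyNotificationWay(object):
    # Example property list; the real class defines its own.
    properties = ("notificationway_name", "min_business_impact")

    def __init__(self):
        for prop in self.properties:
            setattr(self, prop, None)


class ToyCollection(object):
    """Hypothetical name-indexed container; not Shinken's real class."""

    def __init__(self):
        self.items = []
        self.by_name = {}

    def add_item(self, item):
        self.items.append(item)
        self.by_name[item.notificationway_name] = item

    def find_by_name(self, name):
        return self.by_name.get(name)


def make_incoming(name):
    """Build an object playing the role of a notificationway from a scheduler."""
    cnw = ToyNotificationWay()
    cnw.notificationway_name = name
    return cnw


def update_notificationways(collection, incoming):
    new_notifways = []
    for cnw in incoming:
        nwname = cnw.notificationway_name
        nw = collection.find_by_name(nwname)
        is_new = nw is None  # the flag: did we have to create it?
        if is_new:
            nw = ToyNotificationWay()
        # Copy the properties first, so the name is set before indexing.
        for prop in ToyNotificationWay.properties:
            if hasattr(cnw, prop):
                setattr(nw, prop, getattr(cnw, prop))
        if is_new:
            collection.add_item(nw)
        new_notifways.append(nw)
    return new_notifways


coll = ToyCollection()
for _ in range(3):
    result = update_notificationways(
        coll, [make_incoming("email"), make_incoming("sms")])

print(len(coll.items))
print(sorted(coll.by_name.keys()))
```

With this ordering, repeated simulated reloads reuse the existing objects, so the collection size stays equal to the number of distinct names.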

To keep the contacts and notificationways up to date on each Shinken reload or scheduler switch (when one goes down), we should also refresh the contacts and notificationways info, the same way we do for services and hosts.

I think I can make a PR later.

@mohierf
Contributor

mohierf commented Jun 26, 2015

The BS3 branch does not suffer from memory leaks, so I am closing this issue.

@mohierf mohierf closed this as completed Jun 26, 2015