Status monitoring #116

Open
jurgenhaas opened this issue Jan 20, 2017 · 6 comments
Labels
enhancement New feature or request

Comments

@jurgenhaas

I was wondering what's the best way to monitor the status of matterbridge. The Docker container is one thing to look at, but I'd also be interested in the status of the services that matterbridge is configured for.

Is it feasible to get some sort of endpoint in matterbridge that responds with a JSON summary of the current status, which could then be used by monitoring tools?

@42wim
Owner

42wim commented Feb 19, 2017

I've been thinking about this.

Almost all of the libraries I use reconnect automatically on failures and most of them can't be asked about the current state. So this would only be possible to implement in my own (mattermost) library.

In the last release, IRC and Mattermost reconnects should be more stable, so services should always be either connected or busy trying to reconnect.

So unless every underlying library adds a way to ask about its state, this isn't feasible.

@jurgenhaas
Author

Understood, this is not a simple one, but I doubt that the other services would come up with status APIs unless someone lets them know about the requirement.

What I've done in the meantime is to use the fluentd log driver of docker and now monitor the logs instead. I do get an error like this when Mattermost goes down:

time="2017-02-20T08:38:14Z" level=error msg="error:websocket: close 1006 (abnormal closure): unexpected EOF" module=matterclient

This is great and helps me to identify downtime. What's missing however is a log entry when Mattermost comes back and matterbridge reconnects. Would you mind putting something into the log even when not in debug mode?
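
As a rough illustration of the log-based approach, the sketch below parses `key=value` log lines of the shape quoted above and flags the Mattermost disconnect. The helper names `parseLine` and `isDisconnect` are made up for this example, and the matching rule (level `error`, module `matterclient`, message containing `websocket: close`) is an assumption derived from that single sample line, not a documented log contract.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// fieldRe matches key=value pairs where the value is either a
// double-quoted string (with backslash escapes) or a bare token.
var fieldRe = regexp.MustCompile(`(\w+)=("(?:[^"\\]|\\.)*"|\S+)`)

// parseLine splits one log line into its fields, unquoting quoted values.
func parseLine(line string) map[string]string {
	fields := make(map[string]string)
	for _, m := range fieldRe.FindAllStringSubmatch(line, -1) {
		val := m[2]
		if strings.HasPrefix(val, `"`) {
			if unq, err := strconv.Unquote(val); err == nil {
				val = unq
			}
		}
		fields[m[1]] = val
	}
	return fields
}

// isDisconnect reports whether the fields look like the websocket
// closure error shown in the sample line (assumed pattern).
func isDisconnect(f map[string]string) bool {
	return f["level"] == "error" &&
		f["module"] == "matterclient" &&
		strings.Contains(f["msg"], "websocket: close")
}

func main() {
	sample := `time="2017-02-20T08:38:14Z" level=error msg="error:websocket: close 1006 (abnormal closure): unexpected EOF" module=matterclient`

	// A real watcher would read from the log stream; a string stands in here.
	scanner := bufio.NewScanner(strings.NewReader(sample))
	for scanner.Scan() {
		f := parseLine(scanner.Text())
		if isDisconnect(f) {
			fmt.Printf("mattermost down at %s: %s\n", f["time"], f["msg"])
		}
	}
}
```

Without a matching log entry on reconnect, a watcher like this can only detect the start of an outage, not its end.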

42wim added the enhancement label on Apr 5, 2017
@Infernoman

Hey, just wanted to post an update on this as well. Let me know if I should open a new issue. matterbridge has been crashing on me recently, and I've been looking into ways to monitor the application. I tried supervisor, which I had no luck with, and found that forever.js isn't supported either. So I looked into Docker and had some issues there as well.

I'm using the same configuration file as when I run it from the go/bin folder, but when I run it inside Docker, messages from Mattermost don't get relayed to Slack/IRC/Discord, while messages from Slack/IRC/Discord get relayed to Mattermost just fine.

When matterbridge is run by itself, messages relay between all 4 of the applications correctly.

@42wim
Owner

42wim commented May 6, 2017

@Infernoman Please open a new issue if you experience crashes. I'll try to fix those :)
Be sure you're running the latest version and put the stack trace after a panic in the issue.

For restarting, systemd is a good option; you could use something like the example service below.

[Service]
ExecStart=/opt/matterbridge/matterbridge-linux64 -conf /opt/matterbridge/matterbridge.conf
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=matterbridge
User=matterbridge
Group=matterbridge

[Install]
WantedBy=multi-user.target

@oxr463

oxr463 commented Mar 9, 2020

It would be nice if we could integrate Prometheus or the like for this.

@elberfeld

> @Infernoman Please open a new issue if you experience crashes. I'll try to fix those :)
> Be sure you're running the latest version and put the stack trace after a panic in the issue.

The benefit of a monitoring capability would go beyond detecting simple crashes.
If only one of multiple connections (in my case 3: IRC, Telegram, Rocket.Chat) breaks, there is no way to detect this in an automated way.

If the service crashes or exits, systemd or Docker can restart it.
But in the case described, the service continues to run and only one connection is dead.

Currently the only way to detect this is to monitor the log or notice the absence of messages.

Perhaps something like a Prometheus metrics endpoint could be a good start?
https://github.com/prometheus/client_golang
