Status monitoring #116

jurgenhaas · 2017-01-20T15:20:50Z

I was wondering what's the best way to monitor the status of matterbridge. The Docker container is one thing to look at, but I'd also be interested in the status of the services that matterbridge is configured for.

Is it feasible to add some sort of endpoint to matterbridge that responds with a JSON summary of the current status, which could then be used by monitoring tools?

42wim · 2017-02-19T20:56:31Z

I've been thinking about this.

Almost all of the libraries I use reconnect automatically on failure, and most of them can't be queried for their current state. So this would only be possible to implement in my own (mattermost) library.

As of the last release, IRC and Mattermost reconnects should be more stable, so services should always be either connected or busy trying to reconnect.

So unless every underlying library adds a way to query its state, this isn't feasible.

jurgenhaas · 2017-02-20T08:46:38Z

Understood, this is not a simple one, but I doubt that the other services would come up with status APIs unless someone lets them know about the requirement.

What I've done in the meantime is to use Docker's fluentd log driver and monitor the logs instead. I get an error like this when Mattermost goes down:

time="2017-02-20T08:38:14Z" level=error msg="error:websocket: close 1006 (abnormal closure): unexpected EOF" module=matterclient

This is great and helps me identify downtime. What's missing, however, is a log entry when Mattermost comes back and matterbridge reconnects. Would you mind writing something to the log for that, even when not in debug mode?
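
The log-based check described above can be sketched roughly as follows. The parser is a simplified, hand-written reader for `key=value` lines of the shape shown in the error above, and the function names `parseLogLine` and `isMattermostError` are hypothetical. It only detects the "down" side, since no corresponding reconnect entry is described in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLogLine splits a line of the form
//   key=value key="quoted value" ...
// into a map. It is a simplified reader: a backslash inside quotes skips the
// next character, and escape sequences are not decoded.
func parseLogLine(line string) map[string]string {
	fields := make(map[string]string)
	for len(line) > 0 {
		line = strings.TrimLeft(line, " ")
		eq := strings.IndexByte(line, '=')
		if eq < 0 {
			break
		}
		key := line[:eq]
		rest := line[eq+1:]
		var val string
		if strings.HasPrefix(rest, `"`) {
			// Quoted value: scan to the next unescaped quote.
			end := 1
			for end < len(rest) && rest[end] != '"' {
				if rest[end] == '\\' {
					end++
				}
				end++
			}
			if end > len(rest) {
				end = len(rest)
			}
			val = rest[1:end]
			if end < len(rest) {
				rest = rest[end+1:]
			} else {
				rest = ""
			}
		} else {
			// Bare value: runs until the next space.
			sp := strings.IndexByte(rest, ' ')
			if sp < 0 {
				sp = len(rest)
			}
			val = rest[:sp]
			rest = rest[sp:]
		}
		fields[key] = val
		line = rest
	}
	return fields
}

// isMattermostError reports whether a parsed line is an error-level entry
// from the matterclient module, as in the example from this thread.
func isMattermostError(fields map[string]string) bool {
	return fields["level"] == "error" && fields["module"] == "matterclient"
}

func main() {
	const sample = `time="2017-02-20T08:38:14Z" level=error msg="error:websocket: close 1006 (abnormal closure): unexpected EOF" module=matterclient`
	fields := parseLogLine(sample)
	fmt.Println(isMattermostError(fields), fields["msg"])
}
```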

Infernoman · 2017-05-06T13:25:27Z

Hey, just wanted to post an update on this as well. Let me know if I should open a new issue. matterbridge has been crashing on me recently, and I've been looking into ways to monitor the application. I tried supervisor, which I had no luck with, and found that forever.js isn't supported either. So I looked into Docker, and had some issues there as well.

I'm using the same configuration file that I use when running it from the go/bin folder, but when I run it inside Docker, messages from Mattermost don't get relayed to Slack/IRC/Discord, while messages from Slack/IRC/Discord get relayed to Mattermost just fine.

When matterbridge is run by itself, messages relay between all 4 of the applications correctly.

42wim · 2017-05-06T14:19:35Z

@Infernoman Please open a new issue if you experience crashes. I'll try to fix those :)
Be sure you're running the latest version, and include the stack trace from the panic in the issue.

For restarting, systemd is a good option. You could use something like the example service below.

[Service]
ExecStart=/opt/matterbridge/matterbridge-linux64 -conf /opt/matterbridge/matterbridge.conf
Restart=always
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=matterbridge
User=matterbridge
Group=matterbridge

[Install]
WantedBy=multi-user.target

oxr463 · 2020-03-09T12:54:43Z

It would be nice if we could integrate Prometheus or something similar for this.

elberfeld · 2021-05-12T05:06:06Z

> @Infernoman Please open a new issue if you experience crashes. I'll try to fix those :)
> Be sure you're running the latest version, and include the stack trace from the panic in the issue.

The benefit of a monitoring option would go beyond detecting simple crashes.
If only one of multiple connections (in my case 3: IRC, Telegram, Rocket.Chat) breaks, there is no way to detect this in an automated way.

If the service crashes or exits, systemd or Docker can restart it.
But in the case described, the service continues to run and only one connection is dead.

Currently the only way to detect this is to monitor the log or notice the absence of messages.

Perhaps something like a Prometheus metrics endpoint could be a good start?
https://github.com/prometheus/client_golang

42wim added the enhancement (New feature or request) label · Apr 5, 2017

42wim mentioned this issue May 25, 2017: Matterbridge as a linux systemd service #176 (Closed)

42wim mentioned this issue Jul 13, 2017: Crash while connecting and SASL authentication ? #214 (Closed)

42wim mentioned this issue Apr 12, 2018: [Feature] Option to msg every channel with a "connection going down" msg on eror #403 (Closed)

42wim mentioned this issue May 7, 2018: Better handling when things go wrong #406 (Open)

qaisjp mentioned this issue Jan 1, 2019: IRC puppeting #667 (Open)
