Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add journalparser plugin #2569

Closed
wants to merge 1 commit into from
Closed

add journalparser plugin #2569

wants to merge 1 commit into from

Conversation

phemmer
Copy link
Contributor

@phemmer phemmer commented Mar 24, 2017

Adds a journalparser plugin. The plugin tries to be as similar to configuration of the logparser plugin as possible, and shares the grok parser code.

This plugin does have the unfortunate result of breaking cross-compilation. This is because it uses CGO for the systemd journal libs. You can get cross compilation working again, but you need a cross-compilation toolchain on the build host.

Required for all PRs:

  • CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

Closes #2109

@@ -281,7 +278,7 @@ func (p *Parser) ParseLine(line string) (telegraf.Metric, error) {
}
}

return metric.New(p.Measurement, tags, fields, p.tsModder.tsMod(timestamp))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is the reason for all the grok and logparser changes.
metric.New() will return error if there are no fields present. In the journalparser plugin this may happen as the user may have configured the plugin to parse multiple journal fields. The journalparser plugin calls ParseLine for each journal field and then merges all the results together. And so one of the calls might not extract any fields.

@phemmer
Copy link
Contributor Author

phemmer commented Mar 24, 2017

Looks like build is failing because it needs the systemd development headers. I'm not familiar with CircleCI and how to make sure these are installed.

@danielnelson
Copy link
Contributor

AFAIK In the past we haven't accepted plugins that require C libraries, perhaps we can add this to the extras plugin repo once it is ready.

@phemmer
Copy link
Contributor Author

phemmer commented Mar 24, 2017

If the requirement is just not having external libraries so that users don't have to install them to compile telegraf, what about including the header file in the telegraf repo? Since the libs are dynamically loaded, they don't need to be present when building, or even running (as long as you don't try to use a plugin which requires them).

@danielnelson
Copy link
Contributor

I don't want to copy in code from other projects, and it really ought to use the machines systemd headers. I wonder if there is a way we can make this plugin optional without making our build system totally non standard. In my mind the ideal setup would be that if you have the header then it builds and otherwise it doesn't.

@phemmer
Copy link
Contributor Author

phemmer commented Mar 25, 2017

Well to play devils advocate, copying code is kinda what the go vendoring thing is about.
Systemd also offers an API stability promise that they won't break things: https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/ (this pr uses "sd-journal.h API"). So using an out-of-date header file should cause no harm.

An alternative would be to do this: https://godoc.org/github.com/rainycape/dl#example-Open--Snprintf. We wouldn't need the header files at all this way.

As for a way to make the plugin optional. You can accomplish anything with go generate. You could set it up so that go generate will enable the plugin for building if the headers were found. You could also do build flags, such that it's on or off by default, and then a build flag to flip it the other way.

@nhaugo nhaugo added this to the Future Milestone milestone Mar 31, 2017
@nhaugo
Copy link
Contributor

nhaugo commented Mar 31, 2017

1.3 will utilize modules from go 1.8, we will have to wait until then to get this in do to the dependency on external c libs.

@phemmer
Copy link
Contributor Author

phemmer commented Apr 30, 2017

@nhaugo I thought that was on hold. #2373 (comment)

Anyway, I just pushed an alternate implementation for consideration. It uses the journalctl command and parses the output. I didn't do this initially as I'm not fond of spawning external processes. However the implementation I put in keeps this to a minimum (1 process in all but the rarest of use cases). It also lets the external journalctl process do the heavy filtering, instead of doing it in telegraf.

So in a more descriptive nutshell, what it does:

  • Forks of a single journalctl with a disjunction (OR list) of all the match sets. Meaning if one instance of [[inputs.journalparser]] has matches = ["_COMM=httpd","PRIORITY=6"] and another instance has ["_COMM=nginx","PRIORITY=6"], then journalctl will look for (_COMM=httpd PRIORITY=6) OR (_COMM=nginx PRIORITY=6).
  • Every time the match list is updated, a new journalctl with updated args is spawned, and we kill off the old one.
  • Since we only have a single stream for all journal events, we then have to filter again to send them to the right journalparser instance, but this shouldn't be that bad since journalctl already has done the heavy filtering for us.
    Also because of this, each journal entry is only parsed once. In the previous implementation, if the same journal entry was matched by multiple journalparser instances, we'd parse it multiple times. The previous implementation was also opening up the journal for every journalparser instance, so even though filtering was done in the C code, it was still filtering the same data multiple times.
  • The rare use case where multiple journalctl processes are spawned is if the path property is changed. We launch one journalctl process per path value.

This has barely been tested (though it does have a test suite which is passing). And I think I'd like to clean up the unit tests a bit. But it should be functional.

@danielnelson
Copy link
Contributor

I would be okay with the version requiring the headers so long as they are not needed by default, and we add something to the build to make sure we build support for it. Maybe we just add a --list-inputs style option so we can check that everything was enabled.

Automatically enabling features would be cool but I don't want to roll my own solution for this. Plugin thing is paused, probably will try to do a gRPC solution.

The subprocess version sounds fine as well. Which implementation do you prefer?

@phemmer
Copy link
Contributor Author

phemmer commented May 20, 2017

I'm not sure which I prefer.
External process is nice in that you don't have compile time build features you have to enable (though I would argue that this is really an artificial limitation of the coreos/go-systemd implementation).
But on the other hand not relying on external utilities is nice. And the code for using journalctl is significantly more complex.

@danielnelson danielnelson removed this from the Future Milestone milestone Jun 14, 2017
@danielnelson danielnelson added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Aug 24, 2017
@DSpeichert
Copy link
Contributor

@phemmer Any chance of resurrecting this? I'd love to have this input merge and I can help if needed.

@danielnelson
Copy link
Contributor

I think we should probably close this issue out for now. We haven't gotten any closer to solving the issue of optional C library dependencies since this issue was last updated and no real tooling has emerged that could help. Would still be willing to accept a native Go solution however I'm not sure journald provides an API outside of the C library.

It is also now possible to forward syslog messages into Telegraf in a performant and reliable way, and I think this is a very good method for handling log messages both when using journald.

@phemmer
Copy link
Contributor Author

phemmer commented Aug 28, 2019 via email

@danielnelson
Copy link
Contributor

Thanks for the clarification, that's right we had a CGO implementation but this is an fork/exec version of it.

@DSpeichert I would still recommend trying the syslog plugin to see if it will work for you first, but if you are still interested in finishing up this plugin let me know, there has been some changes in Telegraf that would affect how we should go about it.

@gdamjan
Copy link

gdamjan commented Aug 28, 2019

syslog transport would loose the structured data that's in the journal

@DSpeichert
Copy link
Contributor

@danielnelson I explored the option of forwarding journald log to telegraf via syslog protocol.
I'm sorry for hijacking this issue into a question about that feature, but based on journald documentation:

With the first method, messages are immediately forwarded to a socket (/run/systemd/journal/syslog), where the traditional syslog daemon can read them.

I don't think that the telegraf plugin supports that, it seems to list support for TCP, UDP and TLS only.

It isn't any harder in Go to listen on a Unix socket, but I haven't tried this and I'm not sure if journald is creating that socket or expecting e.g. telegraf to create that. It also requires telegraf to start early in the boot process and immediately read everything or risk losing it, vs a tailing approach that can catch up for a longer time.

@glinton
Copy link
Contributor

glinton commented Aug 28, 2019

it seems to list support for TCP, UDP and TLS only

That must have been a documentation oversight, as unix sockets are supported by the syslog input.

@danielnelson
Copy link
Contributor

I'll update the documentation for that. Right now the chain needs to be journald -> rsyslog -> syslog input. In order to take rsyslog out of the pipeline we would need support for RFC 3164 messages (#4593).

syslog transport would loose the structured data that's in the journal

This is true, you could try using imjournal in rsyslog to bring this to over. If we wanted to take rsyslog out and have structured data, then I believe we are essentially back to this PR.

@dokterbob
Copy link

Hey, why is this closed?

@dokterbob
Copy link

In other words; what is required to get this code in?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin new plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature request] add input plugin to tail systemd journal
8 participants