Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider allowing configuring journald receiver to turn on --all flag #30920

Closed
meridional opened this issue Jan 31, 2024 · 8 comments
Closed
Assignees
Labels
enhancement New feature or request receiver/journald

Comments

@meridional
Copy link
Contributor

meridional commented Jan 31, 2024

Component(s)

journald receiver.

Is your feature request related to a problem? Please describe.

Currently, journald receiver works by piping data from
journalctl -u <unit> -output=json ....

Without --all flag, journalctl will output a json object where .MESSAGE=null if the message has more than 4096 characters (see https://man7.org/linux/man-pages/man1/journalctl.1.html), effectively dropping the log line.

Describe the solution you'd like

Add a configuration to journald receiver to allow users to turn on the --all flag. This config defaults to be turned off.

Describe alternatives you've considered

Add --all flag unconditionally. This might have performance implications as indicated by journalctl documentation.

Additional context

No response

@meridional meridional added enhancement New feature or request needs triage New item requiring triage labels Jan 31, 2024
@meridional meridional changed the title Consider allowing configuration journald receiver to turn on --all flag Consider allowing configuring journald receiver to turn on --all flag Jan 31, 2024
Copy link
Contributor

Pinging code owners for receiver/journald: @sumo-drosiek @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Copy link
Member

Seems reasonable to me as long as it's optional and off by default. WDYT @sumo-drosiek?

@sumo-drosiek
Copy link
Member

Totally agree. This flag seems to be worth to add it.

@meridional Would you like to add it by yourself? I can do it if not :)

@djaglowski djaglowski removed the needs triage New item requiring triage label Jan 31, 2024
@meridional
Copy link
Contributor Author

Totally agree. This flag seems to be worth to add it.

@meridional Would you like to add it by yourself? I can do it if not :)

Yes, I can take a stab at it.

djaglowski pushed a commit that referenced this issue Feb 2, 2024
**Description:** 
This commit adds support for `all` to journald
receiver, which allows full output, including
unprintable characters and lines that are too long.

**Link to tracking Issue:** #30920 

**Testing:** unit test and manual test.

**Documentation:** receiver/journaldreceiver/README.md and
pkg/stanza/docs/operators/journald_input.md
@RobertFloor
Copy link

Dear All,

We want to forward log lines containing unprintable characters to our logging system and therefore would like to use this feature. Unfortunately, it did not work with the all: true option.

We found out that adding the option all: true to the receiver yields this command:

journalctl --utc --output=json --follow --priority info --directory /var/log/journal _SYSTEMD_USER_UNIT=target.service --all

This looks good but unfortunately the --all flag has a different meaning for journalctl in combination with --output=json.

man journactl

json
               formats entries as JSON objects, separated by newline characters (see Journal JSON Format[3] for more information). Field values are generally encoded as JSON strings, with three
               exceptions:

                1. Fields larger than 4096 bytes are encoded as null values. (This may be turned off by passing --all, but be aware that this may allocate overly long JSON objects.)

                2. Journal entries permit non-unique fields within the same log entry. JSON does not allow non-unique fields within objects. Due to this, if a non-unique field is encountered a JSON
                   array is used as field value, listing all field values as elements.

                3. Fields containing non-printable or non-UTF8 bytes are encoded as arrays containing the raw bytes individually formatted as unsigned numbers.

So the option journalctl --all --output=json has effect 1, and the unprintable characters are still represented as an arrays containing the raw bytes individually formatted as unsigned numbers (see number 3)

I compared this like this and it indeed has this effect:

This does not print unprintable characters but creates a byte array
journalctl --all --output=json --follow _SYSTEMD_USER_UNIT=target.service

While this does print them
journalctl --all --follow _SYSTEMD_USER_UNIT=target.service

There is some more context about this problem in these issues on Github: 1, 2, these issues seem to be undecided for me, it is unclear if it was intended such that --all would print a string for json. Particullarly this PR suggests that it should have been changed.

Thank you in advance,

Kind regards

Robert

otelcol --version
otelcontribcol version 0.95.0-dev

journalctl --version
systemd 252 (252-18.el9)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

@sumo-drosiek
Copy link
Member

I will have to look at it closer. @RobertFloor Could you share example logs, and what do you expect them to be?
I don't think we can easily switch from JSON output to another one, so I am thinking more about some postprocessing and converting from byte array to string

@RobertFloor
Copy link

Hi unfortunately I cannot provide company logs, but this example shows the gist of it:

# We add binary data to a journalctl log
-bash-4.3# printf "hey\rho\n" | systemd-cat -t HOLA

# It is shown as binary 
-bash-4.3# journalctl -t HOLA --no-pager -q
systemd-testsuite HOLA[251]: [6B blob data]

# JSON logging creates a byte array 
-bash-4.3# journalctl -t HOLA --no-pager -q -o json
{......,"MESSAGE":[104,101,121,13,104,111],"....."}

# JSON logging creates a byte array, --all has no effect 
-bash-4.3# journalctl -t HOLA --no-pager -q -o json --all
{......,"MESSAGE":[104,101,121,13,104,111],"....."}

# This shows the binary data
journalctl -t HOLA --no-pager -q  --all
HOLA[3416096]: hey

This behavior is necessary, otherwise, a non-printable character would break the json structure

$ printf '{"key": "value\rvalue"}' | jq
parse error: Invalid string: control characters from U+0000 through U+001F must be escaped at line 1, column 21

I agree that not using json is impossible, but the --all option will never have an effect as it is implemented currently.

@sumo-drosiek
Copy link
Member

This behavior is necessary, otherwise, a non-printable character would break the json structure

Yes, but we could convert it to the following form (which is correct as I think):

➜  ~ printf '{"key": "value\\rvalue"}' | jq
{
  "key": "value\rvalue"
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request receiver/journald
Projects
None yet
Development

No branches or pull requests

5 participants