*.gz files are getting corrupted during collection under the 'logs' plugin when obfuscation option is enabled for masking. #3884
Comments
Thank you for reporting this issue, @suhastawade. I managed to reproduce this locally. I haven't looked at the code yet, but it looks like we may be trying to perform the substitutions directly on the gz file. I'll start looking into this in a bit, but if anyone else wants to work on it, just let us know here so we don't overlap.
We should be removing tarballs entirely - https://github.com/sosreport/sos/blob/main/sos/cleaner/archives/__init__.py#L392 - precisely because we can't reliably obfuscate binary data without corrupting it. It is interesting that only the IPv6 parser is reporting errors within that obfuscation attempt, though the file should have been removed before even getting to the parser step.
I think that using the --keep-binary-files option could be causing this issue: https://github.com/sosreport/sos/blob/main/sos/cleaner/__init__.py#L692
Shall we remove this check to avoid issues like this?
This commit tries to honour option --keep-binary-files when using cleaner by making sure that these files are skipped so the cleaner doesn't attempt to apply substitutions directly on them. Related: sosreport#3884 Signed-off-by: Jose Castillo <jcastillo@redhat.com>
I've opened a PR with a possible fix that tries to honour the --keep-binary-files option, and it seems to work as expected. @suhastawade can you try it? Just remember that we won't try to obfuscate anything in the binary files, so if you use that option you need to be really sure you want whoever receives the files to have them.
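To make the idea of skipping binary files concrete, here is a minimal Python sketch. It is not the code from the PR or from sos; `looks_binary` and `clean_file` are hypothetical names, and the detection heuristic (gzip magic bytes, NUL bytes, or non-UTF-8 content in the first few kilobytes) is an assumption chosen for illustration only.

```python
import codecs
import re
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # every gzip member starts with these two bytes


def looks_binary(path, sample_size=4096):
    """Heuristic (assumption, not sos logic): inspect the first bytes of a file."""
    with open(path, "rb") as fh:
        sample = fh.read(sample_size)
    if sample.startswith(GZIP_MAGIC) or b"\x00" in sample:
        return True
    try:
        # final=False tolerates a multi-byte character cut off by the sample.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False


def clean_file(path, pattern, replacement, keep_binary_files=False):
    """Hypothetical cleaner step: returns 'obfuscated', 'kept', or 'removed'."""
    path = Path(path)
    if looks_binary(path):
        if keep_binary_files:
            return "kept"  # left untouched, byte for byte, and NOT obfuscated
        path.unlink()      # default behaviour described in this thread
        return "removed"
    # Only text files ever reach the substitution step.
    path.write_bytes(pattern.sub(replacement, path.read_bytes()))
    return "obfuscated"


# Example pattern: a simple IPv4 matcher, used here only for demonstration.
IPV4 = re.compile(rb"\b\d{1,3}(?:\.\d{1,3}){3}\b")
```

Calling `clean_file(some_path, IPV4, b"X.X.X.X", keep_binary_files=True)` on a .gz file would leave it intact, which also means any sensitive data inside it stays in the archive.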
Ah, I didn't see the option being used in the command at first.
@jcastill - Thanks for taking this on priority. Let me know if you have any further questions / comments.
If you don't want binary files that may have sensitive information in the report archive after cleaning, then don't use
The vast majority of sos use cases do not use clean/mask, and binary data such as tarballs are considered important collections. We remove binary files by default out of the report during cleaning precisely because we cannot reliably obfuscate them, and attempting to is likely to introduce the exact problem you're seeing - corruption. The
When collecting logs with the sos utility, the .gz files from the /var/log/ directory are collected as well. When masking is enabled, obfuscation is applied to those .gz files and corrupts them.
As a result, they cannot be extracted and none of their data can be read.
You can see that the file format of the .gz file is shown as "data" only instead of "gzip compressed data".
This is from my local extracted directory - "sosreport-testappliance-20241215061507-periodic-nvjetqe"
[root@testappliance sosreport-testappliance-20241215061507-periodic-nvjetqe]# file ./var/log/messages-2024121000.gz
./var/log/messages-2024121000.gz: data
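The same check can be scripted. The sketch below is an illustration only (the function name `gzip_status` is made up here, not part of sos): it first looks for the two gzip magic bytes that `file` relies on, then tries to decompress the whole stream to catch damage further in.

```python
import gzip
import zlib
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of a gzip member (RFC 1952)


def gzip_status(path):
    """Return 'ok', 'bad-magic', or 'unreadable' for a file expected to be gzip."""
    path = Path(path)
    with path.open("rb") as fh:
        if fh.read(2) != GZIP_MAGIC:
            return "bad-magic"  # roughly what `file` reports as plain "data"
    try:
        with gzip.open(path, "rb") as gz:
            # Read to the end so the trailing CRC and length are verified too.
            while gz.read(64 * 1024):
                pass
    except (OSError, EOFError, zlib.error):
        return "unreadable"
    return "ok"


def scan(root):
    """Map every *.gz file under root to its status."""
    return {str(p): gzip_status(p) for p in sorted(Path(root).rglob("*.gz"))}
```

Running `scan(".")` from inside the extracted sosreport directory would list each .gz file together with whether it still decompresses.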
The following image shows that during cleanup it fails to parse the lines from the .gz files, resulting in corrupted data.
Command I used is - sosreport -o logs --clean --keep-binary-files -vvv
The following GitHub locations are where the logs plugin collects all files named "messages*" and "secure*". Because of these patterns, the .gz files are also collected, which results in corrupted files during collection.
https://github.com/sosreport/sos/blob/main/sos/report/plugins/logs.py#L49
https://github.com/sosreport/sos/blob/main/sos/report/plugins/logs.py#L50
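To illustrate why those two patterns also pick up rotated, compressed logs, here is a small Python sketch using shell-style matching from the standard library. The directory listing is invented for the example, and `collected` is a hypothetical helper, not a function from the sos plugin.

```python
from fnmatch import fnmatch

# Patterns quoted in the report above; the directory listing is made up.
PATTERNS = ["messages*", "secure*"]


def collected(names, patterns=PATTERNS):
    """Return the names that match at least one glob pattern, in input order."""
    return [n for n in names if any(fnmatch(n, p) for p in patterns)]


listing = [
    "boot.log",
    "messages",
    "messages-2024121000.gz",
    "secure",
    "secure-20241208.gz",
]

for name in collected(listing):
    kind = "compressed" if name.endswith(".gz") else "plain text"
    print(f"{name}: {kind}")
```

The trailing `*` matches any suffix, so a rotated file such as messages-2024121000.gz is selected alongside the plain-text messages file.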
sos report version -
[root@testappliance log]# sosreport -v
Please note the 'sosreport' command has been deprecated in favor of the new 'sos' command, E.G. 'sos report'.
Redirecting to 'sos report -v'
sosreport (version 4.7.2)
If required, you can also reproduce this issue on your end by having multiple .gz files inside the /var/log/ directory.