needrestart corrupts rpm database - somehow #100

Closed
mphilipps opened this issue Jan 25, 2018 · 11 comments


@mphilipps

hi,
After spending all day trying to figure out what kept corrupting the rpm database on 2 of our RHEL 7.4 systems, I finally tracked it down to needrestart, which was being run every 5 minutes by nagios nrpe.
I debugged it further and found that 20-rpm is called with "/usr/libexec/xdg-permission-store;5a384735" as its argument. I have no idea how needrestart picked up the ;5a384735 part, nor how that corrupts the rpm database. I have reduced 20-rpm to this:
https://gist.github.com/mphilipps/ebffc76ece6852bc326b98dcf34d135f

Calling ./20-rpm "/usr/libexec/xdg-permission-store;5a384735" is enough to corrupt the database.

Calling rpmquery --file "/usr/libexec/xdg-permission-store;5a384735" however gives me a normal error message, telling me that there is no such file or directory.

To fix the database between tests I am doing:

rm /var/lib/rpm/__db.00? && rpm -qa && yum clean all && rm -rf /var/cache/yum && yum info cowsay

Most likely unrelated to this:
My rpm -q/rpmquery has no --filesbypkg parameter.

Tested with needrestart 2.11, RHEL 7.4, RPM 4.11.3

@liske
Owner

liske commented Jan 27, 2018

Is this reproducible on other hosts? Could you please provide the verbose output of needrestart -r l -v?

@mphilipps
Author

It is reproducible on 2 out of ~18 hosts.

https://gist.github.com/mphilipps/c41c0b7766c49ce98510976a5595b479

@liske
Owner

liske commented Feb 3, 2018

Thanks, could you please provide the content of /proc/$PID/maps of one of the processes using deleted files with this suspicious filename suffix?
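
For anyone wanting to gather this information across several processes, here is a minimal Python sketch that picks deleted file mappings out of the text of a /proc/$PID/maps file. The function names and the suffix pattern (a semicolon followed by eight hex digits, as in the path reported above) are assumptions made for illustration; this is not part of needrestart.

```python
import re

# Assumed shape of the suspicious suffix: ';' followed by 8 hex digits,
# as in "/usr/libexec/xdg-permission-store;5a384735".
SUFFIX_RE = re.compile(r";[0-9a-fA-F]{8}$")
DELETED_MARK = " (deleted)"


def deleted_mappings(maps_text):
    """Return paths of file-backed mappings marked as deleted (marker stripped)."""
    paths = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        path = fields[5].strip()
        if path.startswith("/") and path.endswith(DELETED_MARK):
            paths.append(path[: -len(DELETED_MARK)])
    return paths


def suspicious_mappings(maps_text):
    """Return the unique deleted paths that carry the ';<hex>' suffix, sorted."""
    return sorted({p for p in deleted_mappings(maps_text) if SUFFIX_RE.search(p)})
```

To use it, read the maps file of the process in question (for example `open(f"/proc/{pid}/maps").read()`, usually as root) and pass the text to `suspicious_mappings`.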

@mphilipps
Author

@taladar

taladar commented Feb 9, 2018

I am a coworker of @mphilipps

I just noticed that the numbers after the file mappings in /proc/$PID/maps, when interpreted as hexadecimal Unix timestamps, seem to match the time when the package manager replaced those libraries (most likely deleting them) with new versions. Maybe this will help figure out their origin.
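
The interpretation described above can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and treating the suffix as seconds since the Unix epoch is the hypothesis being tested, not something rpm documents in this thread.

```python
from datetime import datetime, timezone


def split_suffix(path):
    """Split 'name;suffix' into (name, suffix); suffix is None if there is no ';'."""
    name, sep, suffix = path.rpartition(";")
    if not sep:
        return path, None
    return name, suffix


def suffix_to_utc(suffix):
    """Interpret a hexadecimal suffix as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(int(suffix, 16), tz=timezone.utc)


name, suffix = split_suffix("/usr/libexec/xdg-permission-store;5a384735")
print(name, suffix_to_utc(suffix).isoformat())
# prints: /usr/libexec/xdg-permission-store 2017-12-18T22:54:45+00:00
```

For the path reported in this issue, the decoded time falls a few weeks before the issue was opened, which is at least consistent with a package update having replaced the file.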

@taladar

taladar commented Feb 9, 2018

Apparently rpmquery is a symlink to rpm which does have a --filesbypkg query option.

liske added a commit that referenced this issue Feb 11, 2018
@liske
Owner

liske commented Feb 11, 2018

Thanks for the update. I was wondering about the strange filenames, but it sounds like this is just how rpm works, so nothing to worry about. If the rpm database gets corrupted, it might be a problem within rpm itself. The corruption should not be triggered by a handful of rpm -q ... calls from needrestart, should it?

@taladar

taladar commented Feb 11, 2018

Well, I did have a quick look at the rpm source code and couldn't find a spot where it would rename files to names with timestamps or semicolons. So I am still not sure whether that is somehow part of /proc/pid/maps or of rpm's way of replacing files. I do know that not all deleted files have those suffixes in /proc/pid/maps and, as @mphilipps pointed out, the problem doesn't seem to happen when calling rpmquery outside the hook.

Is there anything special about the way the hook calls rpmquery that might differ from calling it on the command line? Perl syntax is not our strength.

@liske
Owner

liske commented Feb 11, 2018

There should be nothing special (although stdout of rpm is a pipe, not a TTY).

The hook does:

  • calls rpm -q --file /path/to/the/binary to find the package providing the affected binary
  • after reading all output it closes the file handle
  • calls rpm -q --filesbypkg PKGNAME where PKGNAME is the name of the package from the first step
  • after reading all output it closes the file handle
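
For readers who do not read Perl, the steps above can be sketched roughly in Python. This is an illustration of the described sequence, not the actual 20-rpm hook: the function names are hypothetical, the non-zero exit status for unowned or missing files is an assumption about rpm's behaviour, and the parsing assumes each --filesbypkg line consists of a package name followed by whitespace and a path.

```python
import subprocess


def run_rpm(args):
    """Run rpm with stdout connected to a pipe; return output lines, or [] on failure."""
    proc = subprocess.run(
        ["rpm", *args],
        stdout=subprocess.PIPE,      # a pipe, not a TTY, as in the hook
        stderr=subprocess.DEVNULL,
        text=True,
        check=False,
    )
    # Assumption: rpm exits non-zero when the file is missing or not owned.
    if proc.returncode != 0:
        return []
    return proc.stdout.splitlines()


def files_of_owning_package(binary, runner=run_rpm):
    """Return (package, files) for the package providing `binary`."""
    # Step 1: find the package that provides the binary.
    owners = runner(["-q", "--file", binary])
    if not owners:
        return None, []
    package = owners[0].strip()
    # Step 2: list the files of that package.
    listing = runner(["-q", "--filesbypkg", package])
    files = []
    for line in listing:
        parts = line.split(None, 1)  # assumed format: "<name> <path>"
        if len(parts) == 2:
            files.append(parts[1].strip())
    return package, files
```

The `runner` parameter exists only so the sequence can be exercised without touching a real rpm database, which seems prudent given the corruption discussed here.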

I have no idea how this could corrupt rpm's database.

@liske
Owner

liske commented Feb 11, 2018

> Well, I did have a quick look at the rpm source code and couldn't find a spot where it would rename files to names with timestamps or semicolons.

I think I've found it in the rpm sources.

@mphilipps
Author

Thank you for your help, but I think at this point it is fair to say that the issue is outside the scope of needrestart.
