check_http[368389]: segfault at 0 ip 00007f3f707f7e65 sp 00007fffeebc9af0 error 4 in ld-musl-x86_64.so.1[53e65,7f3f707b8000+54000] likely on CPU 0 (core 0, socket 0)[Bug]: #76

Open
CRCinAU opened this issue Aug 25, 2024 · 2 comments
Labels
bug Something isn't working

CRCinAU commented Aug 25, 2024

What happened?

When running images built on Alpine, I consistently see the following errors in the Docker host's dmesg output:

[137253.413827] check_http[366642]: segfault at 0 ip 00007f8c6ec82e65 sp 00007ffc167b5660 error 4 in ld-musl-x86_64.so.1[53e65,7f8c6ec43000+54000] likely on CPU 2 (core 2, socket 0)
[137253.413838] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137265.214338] check_http[366704]: segfault at 0 ip 00007fa2c0c8fe65 sp 00007fff30082cd0 error 4 in ld-musl-x86_64.so.1[53e65,7fa2c0c50000+54000] likely on CPU 3 (core 3, socket 0)
[137265.214350] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137265.333189] check_http[366710]: segfault at 0 ip 00007f034d1fce65 sp 00007ffe4bae2480 error 4 in ld-musl-x86_64.so.1[53e65,7f034d1bd000+54000] likely on CPU 2 (core 2, socket 0)
[137265.333200] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137284.198562] check_http[366765]: segfault at 0 ip 00007f0ac89d6e65 sp 00007fff6d5ea300 error 4 in ld-musl-x86_64.so.1[53e65,7f0ac8997000+54000] likely on CPU 1 (core 1, socket 0)
[137284.198573] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137284.346434] check_http[366770]: segfault at 0 ip 00007f91fbad9e65 sp 00007ffd2dbfb2b0 error 4 in ld-musl-x86_64.so.1[53e65,7f91fba9a000+54000] likely on CPU 1 (core 1, socket 0)
[137284.346443] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137484.872482] check_http[367289]: segfault at 0 ip 00007f2e14cafe65 sp 00007ffd1f6e9710 error 4 in ld-musl-x86_64.so.1[53e65,7f2e14c70000+54000] likely on CPU 3 (core 3, socket 0)
[137484.872493] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137485.029566] check_http[367296]: segfault at 0 ip 00007f0859014e65 sp 00007ffc0e844c80 error 4 in ld-musl-x86_64.so.1[53e65,7f0858fd5000+54000] likely on CPU 3 (core 3, socket 0)
[137485.029577] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137544.372844] check_http[367432]: segfault at 0 ip 00007fb7d8e9de65 sp 00007ffdd50b2820 error 4 in ld-musl-x86_64.so.1[53e65,7fb7d8e5e000+54000] likely on CPU 3 (core 3, socket 0)
[137544.372854] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137558.851710] check_http[367484]: segfault at 0 ip 00007efe34d4de65 sp 00007fff7ee02280 error 4 in ld-musl-x86_64.so.1[53e65,7efe34d0e000+54000] likely on CPU 0 (core 0, socket 0)
[137558.851719] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137559.034037] check_http[367490]: segfault at 0 ip 00007f1a74385e65 sp 00007ffce3f69e90 error 4 in ld-musl-x86_64.so.1[53e65,7f1a74346000+54000] likely on CPU 3 (core 3, socket 0)
[137559.034049] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137583.903845] check_http[367540]: segfault at 0 ip 00007f8d778b6e65 sp 00007ffe1b43db90 error 4 in ld-musl-x86_64.so.1[53e65,7f8d77877000+54000] likely on CPU 0 (core 0, socket 0)
[137583.903854] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137584.059743] check_http[367546]: segfault at 0 ip 00007fd680893e65 sp 00007fffbc87e8f0 error 4 in ld-musl-x86_64.so.1[53e65,7fd680854000+54000] likely on CPU 3 (core 3, socket 0)
[137584.059753] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137779.728616] check_http[368389]: segfault at 0 ip 00007f3f707f7e65 sp 00007fffeebc9af0 error 4 in ld-musl-x86_64.so.1[53e65,7f3f707b8000+54000] likely on CPU 0 (core 0, socket 0)
[137779.728624] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137779.875546] check_http[368395]: segfault at 0 ip 00007f409a8a6e65 sp 00007ffe1895b060 error 4 in ld-musl-x86_64.so.1[53e65,7f409a867000+54000] likely on CPU 2 (core 2, socket 0)
[137779.875556] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137842.464971] check_http[368531]: segfault at 0 ip 00007fc3864d7e65 sp 00007ffd78ae08c0 error 4 in ld-musl-x86_64.so.1[53e65,7fc386498000+54000] likely on CPU 3 (core 3, socket 0)
[137842.464982] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137856.946092] check_http[368582]: segfault at 0 ip 00007f261e97de65 sp 00007ffe3ee55040 error 4 in ld-musl-x86_64.so.1[53e65,7f261e93e000+54000] likely on CPU 0 (core 0, socket 0)
[137856.946103] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137857.068576] check_http[368588]: segfault at 0 ip 00007fd9730e7e65 sp 00007ffffbaed770 error 4 in ld-musl-x86_64.so.1[53e65,7fd9730a8000+54000] likely on CPU 0 (core 0, socket 0)
[137857.068594] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137881.173431] check_http[368633]: segfault at 0 ip 00007f773f005e65 sp 00007ffc7e8f20a0 error 4 in ld-musl-x86_64.so.1[53e65,7f773efc6000+54000] likely on CPU 1 (core 1, socket 0)
[137881.173442] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[137881.334368] check_http[368638]: segfault at 0 ip 00007f2338e30e65 sp 00007ffd6f742170 error 4 in ld-musl-x86_64.so.1[53e65,7f2338df1000+54000] likely on CPU 0 (core 0, socket 0)
[137881.334379] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[138079.558441] check_http[369434]: segfault at 0 ip 00007fbe2f29ae65 sp 00007fff00904160 error 4 in ld-musl-x86_64.so.1[53e65,7fbe2f25b000+54000] likely on CPU 0 (core 0, socket 0)
[138079.558454] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
[138079.705605] check_http[369441]: segfault at 0 ip 00007fc7c5498e65 sp 00007ffe75aaee00 error 4 in ld-musl-x86_64.so.1[53e65,7fc7c5459000+54000] likely on CPU 1 (core 1, socket 0)
[138079.705615] Code: 89 eb 48 85 c0 74 05 f6 00 20 74 32 85 ed 74 54 49 89 de b8 ff ff ff 7f 44 29 f8 44 39 d8 0f 8c a2 08 00 00 41 b8 ff ff ff 7f <41> 0f b6 06 45 01 df 84 c0 0f 84 b2 0b 00 00 4c 89 f5 eb 85 48 63
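
For anyone decoding these lines: on x86-64 the `error` value is the page-fault error code, so `error 4` together with `segfault at 0` reads as a user-mode read of a not-present page at address 0, which is what a NULL pointer read looks like. The Python sketch below parses one such line and computes how far the instruction pointer sits from the mapping base. The function name `decode_segfault` and the returned field names are hypothetical, and the first bracketed number is passed through without interpretation.

```python
import re

# Matches the "segfault at ..." line format seen in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<off>[0-9a-f]+),(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing one segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m["err"], 16)
    return {
        "comm": m["comm"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "object": m["obj"],
        # Distance of the faulting instruction from the start of the mapping.
        "ip_minus_base": int(m["ip"], 16) - int(m["base"], 16),
        # First bracketed value, kept as printed (not interpreted here).
        "bracket_offset": m["off"],
        # x86 page-fault error code bits: 0 = page present, 1 = write, 2 = user mode.
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "user_mode": bool(err & 4),
    }
```

Every line in the output above decodes the same way apart from the PID and the randomized base address, which suggests a single faulting instruction rather than several unrelated crashes.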

I rebuilt everything on alpine:edge but still get the same errors.

Apparently this has something to do with mismatched versions of ld-musl, but I can't figure out where the mismatch would come from.

The plugin is still functional, so this is likely mostly cosmetic.

Image information

N/A

Image architecture

amd64

Relevant log output

N/A
@CRCinAU CRCinAU added the bug Something isn't working label Aug 25, 2024
manios (Owner) commented Aug 30, 2024

Hi @CRCinAU!

Could you please provide steps to reproduce this issue?

I have been running the latest and build-21 tags for the last 30 minutes on my amd64 PC and I cannot see any logs of that kind on my host. The command:

sudo dmesg  | egrep segfault

returns nothing. So maybe something is going on in your docker host?

Best regards,
Chris

CRCinAU (Author) commented Aug 30, 2024

Interesting. I moved everything back to your stock images and added a couple of extra modules via the method in #80, but I still see the segfaults.

I've been trying to replicate this and narrow it down to just running check_http manually, and I have been able to trigger it again, but not reliably.

The systemd logs at the time are:

Aug 31 07:11:04 (sd-parse-elf)[4833]: Could not parse number of program headers from core file: invalid `Elf' handle
Aug 31 07:11:04 (sd-parse-elf)[4833]: Could not parse number of program headers from core file: invalid `Elf' handle
Aug 31 07:11:04 (sd-parse-elf)[4833]: Could not parse number of program headers from core file: invalid `Elf' handle
Aug 31 07:11:04 (sd-parse-elf)[4833]: Could not parse number of program headers from core file: invalid `Elf' handle
Aug 31 07:11:04 (sd-parse-elf)[4833]: Could not parse number of program headers from core file: invalid `Elf' handle
Aug 31 07:11:04 systemd-coredump[4832]: [🡕] Process 4827 (check_http) of user 100 dumped core.
                                                       
                                                       Module /opt/nagios/libexec/check_http without build-id.
                                                       Module /opt/nagios/libexec/check_http
                                                       Module /lib/libcrypto.so.1.1 without build-id.
                                                       Module /lib/libcrypto.so.1.1
                                                       Module /usr/lib/libintl.so.8.1.7 without build-id.
                                                       Module /usr/lib/libintl.so.8.1.7
                                                       Module /lib/libssl.so.1.1 without build-id.
                                                       Module /lib/libssl.so.1.1
                                                       Module /lib/ld-musl-x86_64.so.1 without build-id.
                                                       Module /lib/ld-musl-x86_64.so.1
                                                       Stack trace of thread 897:
                                                       #0  0x00007f9aa5f6d7fe n/a (/lib/ld-musl-x86_64.so.1 + 0x4a7fe)
                                                       ELF object binary architecture: AMD x86-64
Aug 31 07:11:04 systemd[1]: systemd-coredump@13-4831-0.service: Deactivated successfully.

I did manage to trigger it more often by running the following from within the Docker container:

while true; do ./check_http -4 -H my.web.server; done

I'm wondering if it has to do with parallel instances of check_http?
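
One way to test the parallel-instances idea would be to run the same command a fixed number of times, first serially and then with several workers, and compare how many runs die from SIGSEGV. The Python sketch below does that; the helper names `run_once` and `count_segfaults` are hypothetical, and it would need to run inside the container where check_http lives.

```python
import signal
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd):
    """Run cmd once and report whether it was killed by SIGSEGV."""
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    # On POSIX, a return code of -N means the child was terminated by signal N.
    return proc.returncode == -signal.SIGSEGV


def count_segfaults(cmd, runs, workers=1):
    """Run cmd `runs` times with up to `workers` in flight; count SIGSEGV deaths."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(run_once, [cmd] * runs))
```

For example, `count_segfaults(["./check_http", "-4", "-H", "my.web.server"], runs=200, workers=1)` could be compared against the same call with `workers=8`; the run and worker counts here are arbitrary. A clearly higher count in the concurrent case would support the theory, while similar counts would point elsewhere.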

EDIT: From the output of coredumpctl on the host, it looks like this happens quite often:

TIME                          PID UID GID SIG     COREFILE EXE                              SIZE
Sat 2024-08-31 07:18:22 AEST 1851 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.6K
Sat 2024-08-31 07:18:22 AEST 1856 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.6K
Sat 2024-08-31 07:18:24 AEST 1868 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.3K
Sat 2024-08-31 07:18:25 AEST 1872 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.4K
Sat 2024-08-31 07:20:37 AEST 2599 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.3K
Sat 2024-08-31 07:21:04 AEST 2734 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.4K
Sat 2024-08-31 07:21:04 AEST 2739 100 101 SIGSEGV present  /opt/nagios/libexec/check_http 100.3K
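
A quick way to check this table against the parallel-instances idea is to count crashes that share a timestamp. The Python sketch below assumes the column layout shown above (weekday, date, time, timezone, PID, UID, GID, SIG, ...); the function name `crashes_per_second` is hypothetical. Two crashes in the same second do not prove the processes overlapped, so this is only a hint.

```python
from collections import Counter


def crashes_per_second(table):
    """Count SIGSEGV rows in coredumpctl-style output, keyed by 'date time'."""
    counts = Counter()
    for row in table.splitlines():
        fields = row.split()
        # Skip the header and anything that is not a SIGSEGV row.
        if len(fields) >= 8 and fields[7] == "SIGSEGV":
            counts[f"{fields[1]} {fields[2]}"] += 1
    return counts
```

Applied to the seven rows above, two timestamps (07:18:22 and 07:21:04) each account for two crashes.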
