Fix false crash logs with regression tests #12783

bneradt · 2026-01-06T17:24:44Z

When traffic_server exits normally (e.g., after regression tests complete), traffic_crashlog was incorrectly logging a crash because it detected its parent process had terminated. This happened because traffic_crashlog uses PR_SET_PDEATHSIG to wake up when traffic_server exits, but it couldn't distinguish between a crash (where crash_logger_invoke sends signal info via the pipe) and a normal exit (where the pipe is simply closed). This fix adds a poll() check on stdin to verify that crash data was actually sent before logging a crash, preventing false positive crash logs.

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). Instead, we now verify crash data by actually reading from stdin: if read() returns 0 bytes in wait_mode, traffic_server exited normally and crash_logger_invoke was never called, so we exit without logging. This definitively distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed).

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). Instead, we now verify crash data by actually reading from stdin: if read() returns 0 bytes in wait_mode, traffic_server exited normally and crash_logger_invoke was never called, so we exit without logging. This definitively distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed). (cherry picked from commit 9edd4df)

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). Instead, we now verify crash data by actually reading from stdin: if read() returns 0 bytes in wait_mode, traffic_server exited normally and crash_logger_invoke was never called, so we exit without logging. This actually (as opposed to the previous patch) distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed).

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). Instead, we now verify crash data by actually reading from stdin: if read() returns 0 bytes in wait_mode, traffic_server exited normally and crash_logger_invoke was never called, so we exit without logging. This actually (as opposed to the previous patch) distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed). (cherry picked from commit 49d8702)

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). This tweak adds reading on the socket to make sure that there is data being sent to verify that there was indeed a crash. This actually (as opposed to the previous patch) distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed).

The poll() check from apache#12783 relied on POLLIN to detect crash data, but this is unreliable - on some systems poll() sets POLLIN for EOF (socket closed without data). Instead, we now verify crash data by actually reading from stdin: if read() returns 0 bytes in wait_mode, traffic_server exited normally and crash_logger_invoke was never called, so we exit without logging. This actually (as opposed to the previous patch) distinguishes a real crash (data written then pipe closed) from a normal exit (pipe just closed). (cherry picked from commit 49d8702)

This reverts commit dd3455c.

This reverts commit dd3455c. Addressing this turns out to be more complicated than it at first appeared. I realize after more testing that while this patch removed the "false" crash log reports, it broke crash log generally. And attempts at fixing that haven't been very fruitful. Maybe we should look into this in the future, but in the meantime I want to restore things to the previous state so that crash logs will at least be created when there is an actual crash.

bneradt added this to the 10.2.0 milestone Jan 6, 2026

bneradt self-assigned this Jan 6, 2026

bneradt added Tools Crash labels Jan 6, 2026

bneradt requested a review from serrislew January 6, 2026 17:26

serrislew approved these changes Jan 6, 2026

View reviewed changes

bneradt merged commit dd3455c into apache:master Jan 6, 2026
15 checks passed

bneradt deleted the address_regression_test_crashlog branch January 6, 2026 20:02

bneradt mentioned this pull request Jan 7, 2026

Fix crash logs: verify crash data was received #12785

Closed

bneradt added a commit that referenced this pull request Jan 8, 2026

Revert "Fix false crash logs with regression tests (#12783)"

53255d6

This reverts commit dd3455c.

bneradt mentioned this pull request Jan 8, 2026

Revert "Fix false crash logs with regression tests" #12793

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix false crash logs with regression tests #12783

Fix false crash logs with regression tests #12783

Uh oh!

bneradt commented Jan 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix false crash logs with regression tests #12783

Fix false crash logs with regression tests #12783

Uh oh!

Conversation

bneradt commented Jan 6, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants