Skip to content

Conversation

@jrushford
Copy link
Contributor

@jrushford jrushford commented Apr 28, 2022

Fixes #8807. I've asked Jeremy Payne, jp557198, to test and comment here as he is the issue owner.

This was originally fixed in ATS 9.2.x with PR #8365 but #8365
was big and not backported to 9.1.x as it was a late PR and
zwoop did not want to bring it into 9.1.x at that time.

Anyway, this fixes the issue where the failure count on a parent
is not incremented properly.

This was originally fixed in ATS 9.2.x with PR apache#8365 but apache#8365
was big and not backported to 9.1.x as it was a late PR and
zwoop did not want to bring it into 9.1.x at that time.

Anyway, this fixes the issue where the failure count on a parent
is not incremented properly.
@jrushford jrushford requested a review from bryancall as a code owner April 28, 2022 18:51
@jrushford jrushford self-assigned this Apr 28, 2022
@jrushford jrushford changed the title Fixes issue #8807 nexthop failure threshold. Fixes issue #8807 nexthop failure threshold in ATS 9.1.x Apr 28, 2022
@jrushford
Copy link
Contributor Author

[approve ci fedora]

@jrushford
Copy link
Contributor Author

[approve ci rocky]

@jrushford
Copy link
Contributor Author

[approve ci rocky]

@jrushford
Copy link
Contributor Author

[approve ci fedora]

@jrushford
Copy link
Contributor Author

[approve ci rocky]

@traeak
Copy link
Contributor

traeak commented Apr 29, 2022

[approve ci fedora]

@traeak
Copy link
Contributor

traeak commented Apr 29, 2022

[approve ci rocky]

@jrushford
Copy link
Contributor Author

[approve ci fedora]

@bneradt
Copy link
Contributor

bneradt commented Apr 29, 2022

Both the Fedora and Rockylinux builds seem to be failing due to something introduced by this patch, not transient CI issues.

Fedora:
https://ci.trafficserver.apache.org/job/Github_Builds/job/fedora/881/console

In file included from /usr/include/signal.h:328,
                 from ../../../tests/include/catch.hpp:8034,
                 from unit_tests/unit_test_main.cc:25:
../../../tests/include/catch.hpp:10822:58: error: call to non-'constexpr' function 'long int sysconf(int)'
10822 |     static constexpr std::size_t sigStackSize = 32768 >= MINSIGSTKSZ ? 32768 : MINSIGSTKSZ;
      |                                                          ^~~~~~~~~~~
In file included from /usr/include/bits/sigstksz.h:24,
                 from /usr/include/signal.h:328,
                 from ../../../tests/include/catch.hpp:8034,
                 from unit_tests/unit_test_main.cc:25:
/usr/include/unistd.h:640:17: note: 'long int sysconf(int)' declared here
  640 | extern long int sysconf (int __name) __THROW;
      |                 ^~~~~~~
In file included from unit_tests/unit_test_main.cc:25:
../../../tests/include/catch.hpp:10881:45: error: size of array 'altStackMem' is not an integral constant-expression
10881 |     char FatalConditionHandler::altStackMem[sigStackSize] = {};
      |                                             ^~~~~~~~~~~~
make[2]: *** [Makefile:974: unit_tests/test_tscpputil-unit_test_main.o] Error 1
make[2]: Leaving directory '/home/jenkins/workspace/Github_Builds/fedora/src/src/tscpp/util'
make[1]: *** [Makefile:1313: check-am] Error 2
make[1]: Leaving directory '/home/jenkins/workspace/Github_Builds/fedora/src/src/tscpp/util'
make: *** [Makefile:865: check-recursive] Error 1

RockyLinux:
https://ci.trafficserver.apache.org/job/Github_Builds/job/rocky/279/console

FAIL: test_librecords
=====================

/home/jenkins/workspace/Github_Builds/rocky/src/lib/records/.libs/test_librecords: error while loading shared libraries: libssl.so.81.1.1: cannot open shared object file: No such file or directory
FAIL test_librecords (exit status: 127)

OK, nevermind this. I didn't notice that this patch is for 9.1.x. Probably 9.1.x needs an updated Catch for the newer compilers.

@bneradt
Copy link
Contributor

bneradt commented Apr 29, 2022

Through no fault of this PR, the 9.1.x branch needs a few PRs merged in before these unit tests will build and run with the fedora:35 and rockylinux compilers:

Once those test-only changes are merged in, then the unit tests should finish successfully for this PR (they did for me locally).

@bryancall bryancall requested a review from ywkaras May 2, 2022 23:10
@jp557198
Copy link

jp557198 commented May 3, 2022

Fixes #8807. I've asked Jeremy Payne, jp557198, to test and comment here as he is the issue owner.

This was originally fixed in ATS 9.2.x with PR #8365 but #8365 was big and not backported to 9.1.x as it was a late PR and zwoop did not want to bring it into 9.1.x at that time.

Anyway, this fixes the issue where the failure count on a parent is not incremented properly.

I tested your code changes against our internal branch. Failure threshold now increments as expected.
See below debug output confirming the same.

#8807 (comment)

@ywkaras ywkaras linked an issue May 12, 2022 that may be closed by this pull request
@ywkaras ywkaras closed this May 12, 2022
@ywkaras ywkaras reopened this May 12, 2022
@ywkaras ywkaras marked this pull request as draft May 23, 2022 23:13
@randall
Copy link
Contributor

randall commented Jun 15, 2022

[approve ci autest]

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.

@github-actions github-actions bot added the Stale label Sep 14, 2022
@github-actions github-actions bot closed this Sep 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ATS 9.1.2 - parent_select/nexthop - failure threshold

6 participants