Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only call ares_library_init()/cleanup() routines in process_wide() if ares extension used in DNS resolving. #21884

Merged
merged 17 commits into from
Jul 6, 2022

Conversation

yanjunxiang-google
Copy link
Contributor

@yanjunxiang-google yanjunxiang-google commented Jun 24, 2022

Only call ares_library_init()/cleanup() routines in process_wide() if ares extension is used in DNS resolving.

Signed-off-by: Yanjun Xiang yanjunxiang@google.com

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

… ares extension is linked in.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

/assign @jmarantz @yanavlasov @mattklein123

@jmarantz
Copy link
Contributor

per IM I am not seeing how this can work.

@yanjunxiang-google
Copy link
Contributor Author

per IM I am not seeing how this can work.

With c-ares library is an extension now, this change helps fixing a bug that if someone build Envoy with c-ares is not linked in here, :

"envoy.network.dns_resolver.cares": "//source/extensions/network/dns_resolver/cares:config",
, the ares_library_init()/ares_library_cleanup() should not be called.

This is not an issue before since ares library is in core. It is a minor issue now since ares extension is by default included.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

yanjunxiang-google commented Jun 26, 2022

per IM I am not seeing how this can work.

Thanks for the comments. Please take a look the latest fix.

The latest fix will only call ares_library_init() when the createDnsResolver() is called for c-ares. And it will only call ares_library_cleanup() during ~process_wide() when c-ares createDnsResolver() is ever called.

Overall, this PR can help that if c-ares extension is not linked in, or even is linked in but never used due to configuration, ares_library_init()/ares_library_cleanup() will not be called.

Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks I think this is vectoring toward the solution we discussed.

void cleanup() override {
// Cleanup c-ares library if initialized.
if (ares_library_initialized_) {
ares_library_cleanup();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this cleanup to the ares factory destructor? That would feel a little cleaner.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't do it in the destructor since factories are statically initialized and will race with whatever static clean-up happens in c-ares.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Can you comment that the destructor is not called in deterministic order as the object is statically constructed.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@jmarantz
Copy link
Contributor

/wait

@yanjunxiang-google yanjunxiang-google changed the title Only call ares_library_init()/cleanup() routines in process_wide() if ares extension linked in Only call ares_library_init()/cleanup() routines in process_wide() if ares extension used in DNS resolving. Jun 27, 2022
Copy link
Contributor

@yanavlasov yanavlasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait-any

void cleanup() override {
// Cleanup c-ares library if initialized.
if (ares_library_initialized_) {
ares_library_cleanup();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't do it in the destructor since factories are statically initialized and will race with whatever static clean-up happens in c-ares.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
void cleanup() override {
// Cleanup c-ares library if initialized.
if (ares_library_initialized_) {
ares_library_cleanup();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Can you comment that the destructor is not called in deterministic order as the object is statically constructed.

@yanavlasov
Copy link
Contributor

/wait

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@jmarantz
Copy link
Contributor

/wait

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remember to look at the format error also.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@jmarantz
Copy link
Contributor

jmarantz commented Jul 1, 2022

/wait

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Copy link
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great modulo naming.

Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
Signed-off-by: Yanjun Xiang <yanjunxiang@google.com>
@yanjunxiang-google
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #21884 (comment) was created by @yanjunxiang-google.

see: more, trace.

@yanjunxiang-google
Copy link
Contributor Author

Thanks jmarantz@ for the comments and approval!

@mattklein123 mattklein123 merged commit 9b424e6 into envoyproxy:main Jul 6, 2022
@yanjunxiang-google yanjunxiang-google deleted the ares-cleanup branch July 6, 2022 16:39
mum4k added a commit to envoyproxy/nighthawk that referenced this pull request Jul 18, 2022
…`. (#877)

Also re-enable `text_gcc`.

Fixes #872.

**Details**:
- `ServiceImpl` creates `Envoy::Logger::Context` which requires `Envoy::Thread::MutexBasicLockable`.
- The `Envoy::Thread::MutexBasicLockable` was constructed after the `Envoy::Logger::Context`, so `Envoy::Logger::Context` became invalid during destruction of `ServiceImpl`.
- this PR ensures that `Envoy::Thread::MutexBasicLockable` lives longer than `Envoy::Logger::Context`.
- Fixing the same problem in the construction of `RequestSourceServiceImpl`.

**Reason for `test_gcc` failure:**
- envoyproxy/envoy#21884 triggered this bug, because it added a logging call initiated indirectly from the destructor of `Envoy::ProcessWide`.
- This logging call happened during the destruction of `ServiceImpl` when the `Envoy::Thread::MutexBasicLockable` instance was already destroyed.
- As a result the PURE virtual methods on its parent object `Envoy::Thread::BasicLockable` were called.
- `test_gcc` reported it because it places a special function meant to detect this behavior in the `vtbl` of unimplemented pure virtual methods.

Signed-off-by: Jakub Sobon <mumak@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants