Initial DNS proxy implementation #536

Merged
merged 1 commit into from
Jun 23, 2023


nmittler
Contributor

@nmittler nmittler commented May 29, 2023

This copies much of the logic from Istio, but it also supports a multi-tenant proxy.

New crate

The new istio-dns crate extends the Trust-DNS API to specifically support the DNS proxy use case. The existing Trust-DNS Catalog does not allow an Authority to conditionally indicate its answers as authoritative, which is something that a proxy needs to do. The code here was written with the intent of eventually upstreaming to Trust-DNS.

Host aliases

Within ztunnel, a new dns package implements the istio-dns Resolver, and serves DNS directly from the WorkloadStore.

The logic for handling host aliases has been somewhat inverted from Istio. Istio pre-generated aliases for each client and added entries for all possible hosts to the lookup table.

However, in shared proxy mode we have to handle the problem that some host aliases are client-specific (e.g. just service-name without namespace). To account for this, we dynamically run the alias logic in reverse, trying to figure out the FQDN from the requested hostname. This means that no additional entries for aliases were necessary in the WorkloadStore.
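
As a rough illustration of running the alias logic in reverse (a sketch only, not the code in this PR), the proxy can expand the requested name into candidate FQDNs using the namespace of the client that sent the query, then check each candidate against the known hosts. The function names, the `cluster.local` domain, and the `HashSet` standing in for the `WorkloadStore` are all assumptions made for the example.

```rust
use std::collections::HashSet;

/// Hypothetical helper: expand a requested hostname into candidate FQDNs,
/// using the namespace of the client that issued the query.
fn candidate_fqdns(requested: &str, client_namespace: &str, cluster_domain: &str) -> Vec<String> {
    // Normalize: drop a trailing root dot and compare case-insensitively.
    let name = requested.trim_end_matches('.').to_ascii_lowercase();
    // The name may already be fully qualified, so it is always a candidate.
    let mut candidates = vec![name.clone()];
    match name.split('.').count() {
        // "svc" -> "svc.<client namespace>.svc.<cluster domain>"
        1 => candidates.push(format!("{}.{}.svc.{}", name, client_namespace, cluster_domain)),
        // "svc.ns" -> "svc.ns.svc.<cluster domain>"
        2 => candidates.push(format!("{}.svc.{}", name, cluster_domain)),
        // "svc.ns.svc" -> "svc.ns.svc.<cluster domain>"
        3 if name.ends_with(".svc") => candidates.push(format!("{}.{}", name, cluster_domain)),
        _ => {}
    }
    candidates
}

/// Hypothetical lookup: return the first candidate that is a known host.
/// `known_hosts` stands in for the store of FQDNs; no alias entries are needed.
fn resolve_alias(
    requested: &str,
    client_namespace: &str,
    cluster_domain: &str,
    known_hosts: &HashSet<String>,
) -> Option<String> {
    candidate_fqdns(requested, client_namespace, cluster_domain)
        .into_iter()
        .find(|candidate| known_hosts.contains(candidate))
}
```

With this shape, the short name `productpage` resolves only for clients in the namespace that owns the service, which is the client-specific behavior described above.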

Forwarding

Ztunnel DNS uses one of two types of DNS forwarder, depending on whether it is running in shared or dedicated mode. In shared mode, it needs to use the configuration for the client pod in order to forward to the appropriate upstream resolver.
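
One way to picture the two forwarder types (again a sketch under assumptions, not the PR's actual types): a dedicated forwarder carries a single upstream configuration, while a shared forwarder selects the configuration by client. `ResolverConfig`, `Forwarder`, and keying the shared map by client IP are hypothetical choices for the example.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, SocketAddr};

/// Hypothetical upstream resolver settings, e.g. as read from a resolv.conf.
#[derive(Clone, Debug, PartialEq)]
struct ResolverConfig {
    nameservers: Vec<SocketAddr>,
    search_domains: Vec<String>,
}

/// Hypothetical forwarder selection for the two proxy modes.
enum Forwarder {
    /// Dedicated mode: a single client, so one fixed upstream configuration.
    Dedicated(ResolverConfig),
    /// Shared mode: the upstream configuration depends on the client pod
    /// (keyed by client IP here, which is an assumption of this sketch).
    Shared(HashMap<IpAddr, ResolverConfig>),
}

impl Forwarder {
    /// Pick the upstream configuration to use for a query from `client`.
    /// Returns `None` in shared mode when the client is unknown.
    fn config_for(&self, client: IpAddr) -> Option<&ResolverConfig> {
        match self {
            Forwarder::Dedicated(cfg) => Some(cfg),
            Forwarder::Shared(by_client) => by_client.get(&client),
        }
    }
}
```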

Fixes #487

@nmittler nmittler added the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label May 29, 2023
@nmittler nmittler requested a review from a team as a code owner May 29, 2023 20:18
@istio-testing istio-testing added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label May 29, 2023
crates/dns/src/handler.rs (outdated, resolved)
crates/dns/src/proxy/address_table.rs (outdated, resolved)
crates/dns/src/multi_tenant.rs (outdated, resolved)
crates/dns/src/multi_tenant.rs (outdated, resolved)
crates/dns/src/handler.rs (outdated, resolved)
crates/dns/src/kube.rs (outdated, resolved)
crates/dns/src/kube.rs (outdated, resolved)
@nmittler nmittler force-pushed the dns branch 12 times, most recently from a16c08c to de51a9e Compare June 7, 2023 20:00
@nmittler nmittler changed the title [WIP] Initial DNS proxy logic. [WIP] Initial DNS proxy implementation Jun 7, 2023
@nmittler
Contributor Author

nmittler commented Jun 7, 2023

FYI @arvindsree @shakti67

@nmittler nmittler force-pushed the dns branch 3 times, most recently from f186881 to 4e53ef4 Compare June 7, 2023 20:32
Contributor

@bleggett bleggett left a comment

Just minor comments after a cursory look, what's here seems good.


/// A Trust-DNS [ResponseHandler] that proxies all DNS requests.
///
/// A proxy is fundamentally different than an `Authority` in TrustDNS, since the answers may
Contributor

@bleggett bleggett Jun 7, 2023

Do we know if upstream would be amenable to this in theory? If not, can we ask?

it's hard to grok the cost of carrying this (more-verbose) workaround if we don't have a read on whether it can ever be upstreamed or not.

Contributor Author

@nmittler nmittler Jun 8, 2023

I'd like to land this PR before I approach the Trust-DNS folks, since I'd like some evidence that it actually works :). It definitely seems to be a use case they hadn't considered, so I think it's worth at least having the discussion. Even if they don't accept it, it's not a lot of code and it's all stuff that needs to be done anyway if we're going to use Trust-DNS.

Comment on lines 83 to 84
// We are the authority here, since we control DNS for known hostnames
response_header.set_authoritative(answer.is_authoritative());
Contributor

@bleggett bleggett Jun 7, 2023

(possibly stupid Q from a non-DNS expert) Does this mimic the current Go DNS logic? Is this normal? Are we going to confuse anything if some responses are marked as authoritative and some not, from what the client (believes is) the same DNS server?

Since (AFAIK) the whole DNS server is either authoritative or not, usually?

Contributor Author

@nmittler nmittler Jun 8, 2023

The behavior should be the same as the Istio code: https://github.com/istio/istio/blob/5aa2900dca83190962aea2eac8712ca88ca63da3/pkg/dns/client/dns.go#L289

Typically an entire server is authoritative or not ... but that's where a DNS proxy is a little different. We're only authoritative over the records we know about and defer to the upstream resolver for everything else.
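
To make the per-answer authority concrete, here is a minimal standalone sketch (it does not use the Trust-DNS API, and `Answer`, `Proxy`, and `lookup` are hypothetical names): answers served from the local table are marked authoritative, and anything obtained from the upstream resolver is not. The response header would then copy that flag, as in the `set_authoritative(answer.is_authoritative())` line under review.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical answer type that carries its own authority flag.
struct Answer {
    addresses: Vec<IpAddr>,
    authoritative: bool,
}

impl Answer {
    fn is_authoritative(&self) -> bool {
        self.authoritative
    }
}

/// Hypothetical proxy: authoritative only for the hosts it knows about.
struct Proxy {
    local: HashMap<String, Vec<IpAddr>>,
}

impl Proxy {
    /// Answer from the local table when possible; otherwise defer to the
    /// `forward` callback, which stands in for the upstream resolver.
    fn lookup(&self, host: &str, forward: impl Fn(&str) -> Vec<IpAddr>) -> Answer {
        match self.local.get(host) {
            // Known record: we control it, so the answer is authoritative.
            Some(addrs) => Answer { addresses: addrs.clone(), authoritative: true },
            // Unknown record: relay the upstream result, not authoritative.
            None => Answer { addresses: forward(host), authoritative: false },
        }
    }
}
```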

src/proxy/dns.rs (outdated, resolved)
src/service.rs (outdated, resolved)
src/workload.rs (outdated, resolved)
nmittler added a commit to nmittler/ztunnel that referenced this pull request Jun 8, 2023
This is a cleanup resulting from istio#536. This moves all service-related datastructures to a sub repository called `ServiceStore`. It has a simple API, which is used by the `WorkloadStore` and clarifies the contract between the two.

The separation also helps to generally simplify the `WorkloadStore`, which was becoming increasingly monolithic.
@nmittler nmittler force-pushed the dns branch 2 times, most recently from 1ddf53c to 1762057 Compare June 20, 2023 18:28
@nmittler
Contributor Author

@howardjohn @kdorosh @bleggett I believe this is ready to go. The relevant Istio DNS tests were copied here and are passing. There are a couple of tests missing (due to current system limitations), which are marked as TODO with associated bugs.

Tests are currently unit/integration-level (DNS client/server using loopback) .. they do not use iptables/eBPF redirection. Happy to add it if folks can suggest how/where best to do so... otherwise, we can address in a follow-on PR.

@nmittler nmittler changed the title [WIP] Initial DNS proxy implementation Initial DNS proxy implementation Jun 20, 2023
@istio-testing istio-testing removed the do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. label Jun 20, 2023
@bleggett
Contributor

bleggett commented Jun 20, 2023

> @howardjohn @kdorosh @bleggett I believe this is ready to go. The relevant Istio DNS tests were copied here and are passing. There are a couple of tests missing (due to current system limitations), which are marked as TODO with associated bugs.
>
> Tests are currently unit/integration-level (DNS client/server using loopback) .. they do not use iptables/eBPF redirection. Happy to add it if folks can suggest how/where best to do so... otherwise, we can address in a follow-on PR.

I think a follow-on PR is fine. Ideally we want to just re-run the existing istio/istio ambient integ test suite with DNS capture enabled via ISTIO_META_DNS_CAPTURE and make sure everything still works in that suite, at a minimum. That change wouldn't live in this repo anyway (it should be pretty small, and can mostly reuse the approach used for running the same test suite with eBPF configs).

I'd like that to be a followup as a Definition-Of-Done just to prove (now as well as ongoing) that ambient DNS redirection works at all and to keep us from breaking things.

This PR LGTM tho.

@nmittler
Contributor Author

FYI I should be able to test truncation after this lands: https://github.com/bluejekyll/trust-dns/pull/1975

@howardjohn
Member

make test doesn't actually test the crate as far as I can tell, and running the test locally fails

@howardjohn
Member

Ideally the fix is not in Makefile either as then folks will miss it when they run cargo test

Member

@howardjohn howardjohn left a comment

I mostly reviewed the impact on existing codepaths, as this already has some LGTMs. I don't have a lot of time to thoroughly review the whole change, and I don't want to block it. Looks safe. The main comment is the testing issue of the additional crate.

@@ -67,7 +69,7 @@ impl Proxy {
         drain: Watch,
     ) -> Result<Proxy, Error> {
         let mut pi = ProxyInputs {
-            cfg,
+            cfg: cfg.clone(),
Member

Suggested change
-            cfg: cfg.clone(),
+            cfg,

src/proxy.rs (outdated, resolved)
const DNS_CAPTURE_METADATA: &str = "ISTIO_META_DNS_CAPTURE";
let dns_enabled = pi
    .cfg
    .proxy_metadata
Member

cfg should probably have a dns_proxy_enabled (or perhaps Option<SocketAddr>) and we just use it directly?

Also I think the istio_meta_ is stripped (

pc.proxy_metadata = pc
)

Contributor Author

I was hoping to leverage the existing configuration for DNS capture for now, rather than introducing something new. I don't have a strong opinion here.

@bleggett thoughts?

Regarding using proxy metadata (since it seems I'm probably not using it correctly here), do we have other examples in ztunnel that use it? If we have constants defined somewhere, I can add the DNS one there.

Member

I don't mean anything user facing, just we currently process all settings into cfg struct and now we are parsing them on-demand

Contributor Author

Ok sure .. just trying to figure out the desired end state here. Are you suggesting that I read the metadata and then set the flag on the config based on that?
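
One possible end state, sketched under assumptions rather than taken from ztunnel: the metadata is read once while the config is built, and the rest of the code only consults a `dns_proxy_enabled` flag. The `Config` struct, `from_env`, and the idea that the `ISTIO_META_` prefix is stripped so the key becomes `DNS_CAPTURE` are hypothetical here (the prefix stripping is only suspected in the comment above).

```rust
use std::collections::HashMap;

/// Assumed prefix that is stripped when environment variables become proxy metadata.
const ISTIO_META_PREFIX: &str = "ISTIO_META_";
/// Assumed metadata key for DNS capture once the prefix has been stripped.
const DNS_CAPTURE_METADATA: &str = "DNS_CAPTURE";

/// Hypothetical config: all settings are processed once, up front.
struct Config {
    proxy_metadata: HashMap<String, String>,
    dns_proxy_enabled: bool,
}

impl Config {
    /// Build the config from environment-style key/value pairs.
    fn from_env<I>(vars: I) -> Config
    where
        I: IntoIterator<Item = (String, String)>,
    {
        // Keep only ISTIO_META_* variables, with the prefix removed.
        let proxy_metadata: HashMap<String, String> = vars
            .into_iter()
            .filter_map(|(key, value)| {
                key.strip_prefix(ISTIO_META_PREFIX)
                    .map(|stripped| (stripped.to_string(), value))
            })
            .collect();
        // Parse the flag here so callers never re-read the metadata on demand.
        let dns_proxy_enabled = proxy_metadata
            .get(DNS_CAPTURE_METADATA)
            .map(|value| value.eq_ignore_ascii_case("true"))
            .unwrap_or(false);
        Config { proxy_metadata, dns_proxy_enabled }
    }
}
```

The proxy setup would then branch on `cfg.dns_proxy_enabled` directly instead of looking up the metadata key where the DNS proxy is started.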

/// Runs this DNS proxy to completion.
pub async fn run(self) {
    // TODO(nmittler): Do we need to use drain?
    if let Err(e) = self.server.block_until_done().await {
Member

What marks it "done"? Do we need to close the listeners ourselves?

Member

Seems we get an error on sigterm:

WARN ztunnel::proxy::dns: DNS server shutdown error: Internal error in spawn: task 4 was cancelled

Contributor Author

@nmittler nmittler Jun 22, 2023

Yeah, probably worth upstreaming a fix here. I'm guessing the behavior we want would be similar to our draining logic?

Member

Maybe, yeah. It probably matters less for DNS though, since there are fewer persistent connections (none for UDP).

Contributor Author

Raised an issue with Trust-DNS: https://github.com/bluejekyll/trust-dns/issues/1976. Let me know if the ask sounds right to you.

@nmittler
Contributor Author

@howardjohn

> make test doesn't actually test the crate as far as I can tell, and running the test locally fails

Ah right. I think the "correct" way to support multiple crates would be to use a workspace. That has a number of advantages anyway, such as managing one version for dependencies across crates.

Anyway, since the code in the crate was really something that should be upstreamed anyway .. I'll just get rid of the extra crate for now.

@istio-testing istio-testing merged commit 63aafbf into istio:master Jun 23, 2023
GregHanson pushed a commit to GregHanson/istio that referenced this pull request Jun 23, 2023
Adds support for a subset of the service entry api to ambient mesh.
The api supported is as follows:
- addresses (the VIPS)
- auto VIP address allocation (addresses is optional)
- endpoints (static only)
- location (prefer mesh external for now for DIRECT passthrough in ztunnel)
- workloadSelector (selects pods and workload entries)

notable exclusions:
- hosts (coming per istio/ztunnel#536)
- exportTo
- resolution (everything is static)
- subjectAltNames (pending ztunnel support)

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>
istio-testing pushed a commit to istio/istio that referenced this pull request Jul 7, 2023
* ambient: service entry initial impl

Adds support for a subset of the service entry api to ambient mesh.
The api supported is as follows:
- addresses (the VIPS)
- auto VIP address allocation (addresses is optional)
- endpoints (static only)
- location (prefer mesh external for now for DIRECT passthrough in ztunnel)
- workloadSelector (selects pods and workload entries)

notable exclusions:
- hosts (coming per istio/ztunnel#536)
- exportTo
- resolution (everything is static)
- subjectAltNames (pending ztunnel support)

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* update more tests to use common functions

* add release note

* Prefer updates.Insert()

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Move lock lower so there's less lock contention

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Ambient SE handler should work regardless of EnableK8SServiceSelectWorkloadEntries setting

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Update test after local testing for auto assign vips

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Stronger assertion for load balancing

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Fix unit tests for ndots in test app

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Remove dead code

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* DRY SE ports construction

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* DRY some internal network address conversion code

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Stop testing with load balancing; tests run super slowly with that

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Ensure service entries cannot select across namespaces

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Remove customizations for skipped test

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Optimize pod xds events if not selected by service entry

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Optimize workload entry xds events if not selected by service entry

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Move logic for HandleServiceEntry into AmbientIndex interface

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Don't pass around entire ambient index

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Prefer ValidatedSetSelector for performance

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Rename function to be clearer

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Don't pass around ambient index child

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

* Remove auto vip allocation

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>

---------

Signed-off-by: Kevin Dorosh <kevin.dorosh@solo.io>
Co-authored-by: Kevin Dorosh <kevin.dorosh@solo.io>
Development

Successfully merging this pull request may close these issues.

DNS in Ambient Mesh
5 participants