-
To clarify, would the DPG be responsible for terminating TLS or would that happen inside the connectors? And if the latter, how does the connector get its certificate? I'm worried that we might run into per-domain rate limits fairly quickly if we use ACME, unless we're planning on running our own server for that...
-
Just wanted to update here after @jgraettinger and I discussed another detail. There's plenty of prior art for how to handle this, and I found Teleport's design particularly informative. Riffing on that a bit, I think we'd want something like
-
Another aspect I'd like to talk through is TLS certificates. Our current approach to TLS certificates is to use the The obvious solution here is to just use a wildcard certificate, such as for
-
IT'S ALIVE! ...finally

High-level guide / example

Start with a connector that listens on a port. To start with a very simple use case, I hacked up a
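As a rough, hypothetical stand-in for it (not the actual connector), the serving side of "a connector that listens on a port" can be as simple as a plain HTTP listener:

```go
// Hypothetical stand-in for a connector's serving side: a plain HTTP
// listener on a fixed port. The real connector also speaks the capture
// protocol over stdin/stdout; that part is omitted here.
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello from inside the connector container")
	})
	// Port 8080 is arbitrary; whatever port is used here must match the
	// port exposed in the catalog spec.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```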
Then publish a catalog that uses that connector and exposes the port, as in this example:
I'm then able to connect to the exposed port from outside the data plane.

Note that any port configured in the spec will be exposed to the public internet, without any form of authorization check. In the future, we should be able to build support for authorization checks in DPG using TLS client certificates, which might be desirable for certain scenarios. But I've considered that (far) out of scope for now. We'll want to be very clear in our documentation, though, because it would be easy for some naughty person to do all sorts of harm if they were able to connect to a debug port of a running connector, for example.

How it works

At build time: the task's shards are labeled with the information DPG needs for routing, namely the exposed ports and a hostname label derived from a hash of the task name.

At runtime (DPG): DPG uses the SNI hostname from the TLS client hello to figure out which shard a connection belongs to. This lookup is done using the shard labels that were populated at build time; specifically, DPG does a shard listing for any shards having labels that match the requested hostname. Once the TLS handshake is complete (more on that in the errata), it then starts a bi-directional streaming RPC with the reactor to which the shard is currently assigned. This RPC has an initial handshake that allows the reactor to validate that the connector is (still) assigned and running and allows accessing the given port. If this Proxy RPC handshake doesn't complete successfully, the connection is closed. If it's successful, then DPG just starts copying data from the client connection to the RPC and vice versa.

At runtime (reactor): Connector-init needs to know the port mappings up front, in order for it to translate from the named port in the proxy request to an actual port number that the connector is listening on. I added a way to pass the port mappings to connector-init when the container is started.

The plan from here
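To make that flow concrete, here's a rough, hypothetical sketch of DPG's per-connection routing. The ShardResolver and ReactorTunnel interfaces, the label handling, and the error handling are illustrative stand-ins, not the real APIs:

```go
// Hypothetical sketch of DPG's per-connection routing. The real
// implementation uses gazette's shard listing APIs and a bi-directional
// streaming gRPC Proxy RPC; these interfaces just stand in for them.
package dpg

import (
	"context"
	"crypto/tls"
	"fmt"
	"io"
	"strings"
)

// ShardResolver finds the shard whose labels match the requested hostname,
// returning the address of the reactor it's currently assigned to.
type ShardResolver interface {
	Resolve(ctx context.Context, hostLabel string) (reactorAddr, shardID string, err error)
}

// ReactorTunnel opens the Proxy RPC to a reactor and completes its handshake,
// returning a ReadWriteCloser that carries the proxied bytes.
type ReactorTunnel interface {
	Open(ctx context.Context, reactorAddr, shardID, port string) (io.ReadWriteCloser, error)
}

func proxyConn(ctx context.Context, conn *tls.Conn, port string, shards ShardResolver, tunnels ReactorTunnel) error {
	// Finish the TLS handshake first, so ConnectionState reports the SNI
	// hostname (see the errata for why this can't overlap the Proxy RPC).
	if err := conn.HandshakeContext(ctx); err != nil {
		return fmt.Errorf("TLS handshake: %w", err)
	}
	sni := conn.ConnectionState().ServerName

	// The first DNS label encodes the task (a hash today, a pet-name later),
	// and is matched against the labels applied at build time.
	hostLabel, _, _ := strings.Cut(sni, ".")

	reactorAddr, shardID, err := shards.Resolve(ctx, hostLabel)
	if err != nil {
		return fmt.Errorf("resolving shard for %q: %w", sni, err)
	}

	// The Proxy RPC handshake lets the reactor confirm the shard is still
	// assigned and running, and that the named port may be accessed.
	tunnel, err := tunnels.Open(ctx, reactorAddr, shardID, port)
	if err != nil {
		return fmt.Errorf("proxy handshake: %w", err)
	}
	defer tunnel.Close()

	// From here on, DPG is nothing more than a byte pipe in both directions.
	go func() { _, _ = io.Copy(tunnel, conn) }()
	_, err = io.Copy(conn, tunnel)
	return err
}
```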
UX questions

How should a user learn about the hostname that was generated for a given task? My proposal is that we add an HTTP endpoint to DPG that returns all of the hostnames for a given named task after verifying that the JWT authorizes the user to write to the task. This lets it work with our existing authz, and avoids the need for control-plane to even know about the actual hostnames. That's desirable because of shard splits, which the control plane is unaware of.

I'm thinking it may be good to support an additional, more succinct representation of

I could see an argument for putting

Errata

DPG can no longer handle HTTP/1.1 requests (edit: disregard)
Edit: It turns out that local builds were already not really working, and I needed to do some refactoring in order to get gRPC working locally, anyway. So I refactored the code to use a custom

Why is caching shard resolutions necessary?

Taken from the code comment:
HTTP and ALPN

Many use cases can likely be served without users needing to specify a protocol at all.
For the vast majority of cases, there shouldn't ever be a need to specify more than one ALPN protocol per port (a rough sketch of how per-port protocols could drive ALPN negotiation appears at the end of this comment). The one exception I can think of is

Why wait until the TLS handshake is complete before starting the proxy RPC?

It'd be faster if we could start the Proxy RPC handshake while the TLS handshake is completing. This turns out not to be possible, given how Go's TLS implementation works.

Handling split shards

I haven't yet done anything to handle split shards. If you try to connect to a host that has multiple shards, then DPG just picks one at random. I think this is actually fine for a first pass, but we'll eventually want to support connecting to specific shards of a task. This should be pretty easy once we settle on a way to represent the shard's key and rclock begin values as part of the domain name. Something like
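As promised above, here's a rough, hypothetical sketch of deriving DPG's ALPN offering from per-port protocol config. The PortConfig type and its fields are made up for illustration; they are not the actual catalog spec:

```go
// Hypothetical sketch of building a tls.Config whose ALPN protocol list is
// derived from the ports a task exposes.
package dpg

import "crypto/tls"

// PortConfig is a made-up representation of one exposed port.
type PortConfig struct {
	Name     string // e.g. "metrics"
	Number   uint16 // e.g. 8080
	Protocol string // ALPN protocol name, e.g. "http/1.1", "h2", "mqtt"
}

// tlsConfigFor offers exactly the protocols the task's ports declare, so
// ALPN negotiation can only select something DPG knows how to route. The
// returned map goes from negotiated protocol back to the target port.
func tlsConfigFor(cert tls.Certificate, ports []PortConfig) (*tls.Config, map[string]uint16) {
	byProto := make(map[string]uint16)
	var protos []string
	for _, p := range ports {
		if _, dup := byProto[p.Protocol]; dup {
			continue // assume at most one port per protocol, per the note above
		}
		byProto[p.Protocol] = p.Number
		protos = append(protos, p.Protocol)
	}
	cfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		NextProtos:   protos, // offered during the TLS handshake
	}
	return cfg, byProto
}
```

After the handshake, `ConnectionState().NegotiatedProtocol` on the `tls.Conn` selects the entry in the returned map, which is what DPG would pass along when it asks the reactor to connect to the container.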
-
I realized I'd forgotten to address an aspect of the security considerations here. We could add a salt (and perhaps use a different hash algo) to make the hostnames more "secret". But there would still be significant vulnerabilities remaining, which would make me think twice before exposing a debug port to the open internet without any authentication. DNS queries and the TLS client hello will both typically include the full plaintext hostname.

So I think at this point, our stance should be: don't expose any sensitive debug ports unless the connector itself is enforcing some sort of authentication, and generally treating its network traffic as "untrusted". This seems like a bit of a bummer, because as a connector developer I wouldn't want to pull in a ton of auth-enforcement code just for a debug endpoint. The dangers of exposing the port are easy to forget when the hostname seems to resemble an unknowable secret. You might think it's fine to just expose the port for a brief period of time, and then un-expose it when you're done debugging. And that might honestly be fine in some cases. But I don't think just using a "secret" domain label would ever be considered "secure". IANAE, so I'd be especially hesitant to rely on that approach, even on a short-term basis.

So I think we want to aim for having connectors all enforce authentication, even (especially) for things like debug ports (like for a debugger; pprof is maybe not as bad). It'd be nice if we could figure out some common code for dealing with that authentication, and re-use it across all our connectors. I'll leave that for a separate discussion, though.

TLS Client Authentication offers another possible path for exposing things like debug ports securely, while allowing authentication to be handled entirely by DPG. I've considered this very far out of scope for the short term, but I mention it here as a possible future solution if wrapping all connector endpoints with authentication becomes too onerous. The downside would be that control-plane would need to get involved with managing certificates, so the scope does not seem small.
-
I had a good conversation with @jgraettinger last week, and we've worked out some ways to simplify this feature and make it easier to use. The biggest change is that it now seems motivated to provide an HTTP proxy server. Doing so enables DPG to enforce authZ based on JWTs issued by the control plane. This in turn can enable easy and seamless integration of services running in connector containers with the Flow UI. It also gives us a way to have "private" ports, which are only accessible to authenticated users, without needing to implement and configure authentication in every single connector, at least for HTTP.

If we're able to provide authenticated access to ports, then there's also less of a reason to force users to configure the exposed ports up front. Put another way, if we're going to authenticate them first, why not just let them connect to any port (except for connector-init's port, of course)? Of course we can't ship a proxy for every protocol, so we'd say that anything other than HTTP would just be treated as plain old TCP. But I think this raises the bar for the UX a bit, and the framing of

So the new stance is that you don't need to have
From a user's perspective, there's no longer any need to explicitly configure ports that you want to be able to connect to, as long as those ports are for HTTP. For debug ports and such, you just get an auth token and connect. If you want to use another protocol besides HTTP, then you need to mark the port as
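As a sketch of what that could look like for HTTP (the TokenVerifier interface, the public flag, and the backend wiring are all hypothetical; the real check would validate the control plane's JWTs and authorization rules):

```go
// Hypothetical sketch of the HTTP-proxy behavior described above: public
// ports pass straight through, while private ports require a control-plane
// JWT that authorizes access to the task.
package dpg

import (
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// TokenVerifier validates a control-plane JWT and reports whether it grants
// access to the named task.
type TokenVerifier interface {
	Authorizes(rawToken, taskName string) bool
}

type portProxy struct {
	task     string
	public   bool // ports marked public skip the auth check
	verifier TokenVerifier
	backend  *url.URL // where the tunneled connection for this port terminates
}

func (p *portProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if !p.public {
		token, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		if !ok || !p.verifier.Authorizes(token, p.task) {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
	}
	// For HTTP traffic, DPG can act as a plain reverse proxy over the
	// connection tunnel to the connector container.
	httputil.NewSingleHostReverseProxy(p.backend).ServeHTTP(w, r)
}
```

Non-HTTP ports would bypass this handler entirely and be treated as plain TCP, as described above.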
-
Some thoughts on hashes for shard names and an eventual migration to pet-names.

Background: The labels are currently determined by this code in the assemble crate. The current process is to hash the task name and use the hexadecimal value. In the future, we'd like to transition to using "pet-names" instead. Pet names are "randomly" generated names that are intended to be human readable.

The question that @jgraettinger brought up was about the transition from the hashes to the pet names. I obviously don't have a detailed answer, since there are still many unknowns about how we'll implement pet names. But what I can say is that we will definitely need to know the pet name for each task at the point where we generate the shard labels. One way to approach that would be to have a migration that assigns the current hash as the pet name for existing tasks that expose public ports. The thing about pet names is that they need to be persisted by the control plane, and somehow passed in to the publication process to be turned into shard labels. They're also still just opaque ids, so it's fine if a few pre-existing tasks have pet names that are hexadecimal. Alternatively, we could just update the publish process to preserve any existing

I think for right now, it's enough to just know that we'll have some plausible options when it comes time to do the pet-name transition.
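For illustration only (the real code lives in the Rust assemble crate, and the hash algorithm and encoding details there may differ), the current scheme amounts to something like:

```go
// Illustration of today's hostname label: a truncated hex digest of the
// task name. Under the pet-name plan, this function would instead return
// whatever opaque name the control plane has persisted for the task, which
// is why a hex hash can keep serving as the "pet name" for existing tasks.
package assemble

import (
	"crypto/sha256"
	"encoding/hex"
)

// hostLabelForTask derives the DNS label used in the task's hostname.
func hostLabelForTask(taskName string) string {
	sum := sha256.Sum256([]byte(taskName))
	// Truncated for a readable subdomain; the exact length and hash choice
	// are details of the real implementation.
	return hex.EncodeToString(sum[:])[:16]
}
```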
-
@jgraettinger and I have had a side conversation going about push-based ingestion, and I'd like to get it more out in the open and invite feedback and contributions from the broader group.
The high-level idea is to reframe push-based ingestions as regular captures, by allowing TCP connections from the public internet to running connector containers. The connector container would stand up a web server and use the regular capture protocol to ingest documents in response to web requests. The DPG would act as an L4 load balancer that routes TCP traffic to the container. Working at L4 enables it to work with a wide variety of protocols (Kafka and MQTT would both be pretty appealing). The end goal is that we should be able to handle a new type of ingest protocol just by implementing a new connector for it. There are lots of other interesting possibilities that open up, though, once we allow arbitrary network traffic into containers!
Of course the big questions are around how to route traffic between DPG and the connector container. Here's what we're thinking.
The key enabler is SNI, which is a ubiquitous TLS extension that embeds the hostname of the server in the TLS client hello message. The DPG would use that hostname to map a connection to the proper shard. The hostname would need to embed both the shard id and the data-plane hostname. Something like
<pet-name>-<key-begin>-<rclock-begin>.<data-plane>.estuary.dev
seems decent, but we could also just use a hash of the shard id as the subdomain, or maybe something else.

Then DPG needs to somehow create a connection to the connector container. There are lots of different ways we could do this. We talked through a number of possible approaches, including using internal DNS, k8s Services, or even embedding another L4 balancer in flow-reactor. It doesn't feel like a simple problem, partly because there's still a lot of uncertainty around relying on k8s. At this point, we're thinking it makes sense to have flow-reactor provide a simple gRPC service that supports connection tunneling. I can expand on this in another comment if anyone wants more of the backstory.
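To make the hostname idea concrete, here's a hypothetical parse of that shape on the DPG side (assuming the `<pet-name>-<key-begin>-<rclock-begin>.<data-plane>.estuary.dev` format above; a hash-of-shard-id subdomain would obviously parse differently):

```go
// Hypothetical parsing of one candidate hostname format. This is just a
// sketch of the shape of the thing, not a settled scheme.
package dpg

import (
	"fmt"
	"strings"
)

type shardAddress struct {
	PetName     string
	KeyBegin    string // hex-encoded key range begin
	RClockBegin string // hex-encoded rclock range begin
	DataPlane   string
}

func parseSNI(serverName string) (shardAddress, error) {
	labels := strings.Split(serverName, ".")
	if len(labels) < 3 {
		return shardAddress{}, fmt.Errorf("unexpected hostname %q", serverName)
	}
	parts := strings.Split(labels[0], "-")
	if len(parts) < 3 {
		return shardAddress{}, fmt.Errorf("unexpected subdomain %q", labels[0])
	}
	// A pet-name may itself contain dashes, so take the last two segments
	// as the key and rclock begin values. Assumes the data-plane name is a
	// single DNS label.
	return shardAddress{
		PetName:     strings.Join(parts[:len(parts)-2], "-"),
		KeyBegin:    parts[len(parts)-2],
		RClockBegin: parts[len(parts)-1],
		DataPlane:   labels[1],
	}, nil
}
```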
ALPN would also be key to allowing connector containers to listen on multiple ports. An example would be having an HTTP server for a /debug/pprof endpoint on port 80, and also an MQTT server on port 1883. The DPG would need to know which protocols the connector supports, so it could complete ALPN negotiation. It would then specify the chosen protocol when connecting to the broker to create the connection tunnel. The broker uses this protocol name to determine which port to connect to.

With this approach, DPG would not be concerned with authentication of requests. The connector container would do that itself if it chose to. For example, we could have an HTTP ingestion connector that allows you to set a Bearer token as part of the endpoint config, and it would require that requests present that token.
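A minimal sketch of that connector-side check (the endpoint-config field and handler wiring are hypothetical, not an existing connector's config):

```go
// Hypothetical sketch of an HTTP ingestion connector requiring the bearer
// token configured in its endpoint config.
package connector

import (
	"crypto/subtle"
	"net/http"
	"strings"
)

// EndpointConfig is a made-up endpoint configuration with a required token.
type EndpointConfig struct {
	RequireToken string `json:"requireToken"`
}

// requireBearer rejects any request that doesn't present the configured token.
func requireBearer(cfg EndpointConfig, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got, ok := strings.CutPrefix(r.Header.Get("Authorization"), "Bearer ")
		// Constant-time compare so the token can't be guessed byte-by-byte.
		if !ok || subtle.ConstantTimeCompare([]byte(got), []byte(cfg.RequireToken)) != 1 {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```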
One downside to this approach would be that we couldn't give clients a reasonable error message if the shard's connector container is down. They'd just get something like "TLS handshake error: server offered no protocols". IMO that's not much to give up in the tradeoff, though.
That's pretty much the sketch. In terms of a concrete plan, I think it makes sense to try building a POC so we can kick the tires and evaluate next steps. That POC is probably not an immediate priority, but something we should get to within the next month or so, since it might cause us to re-evaluate our approach on a number of other features.