[FEATURE] KOP Proxy Design document #717
Comments
Overall LGTM but a question. How does the proxy look up the current coordinator?
@sijie Could you also take a look?
My general reaction to the proposal is: We shouldn't try to create another proxy solution for the Kafka protocol, for a couple of reasons:
KoP is designed to provide a Kafka-compatible protocol for Kafka clients. We shouldn't reinvent a proxy solution ourselves. We should adopt a "proxy" solution that has been proven and used in the Kafka community.
It follows the same rules as the "findCoordinator" implementation in the KOP PH. It looks up the owner of the topic that is mapped to the group (this is why I sent my API refactor PR, in order to be able to use the same logic).
The fact that no one has done it before doesn't mean that we cannot introduce new software. If the user is fine with the trade-offs, then I believe it is good to provide users with a good built-in Proxy for Kafka over Pulsar. With a proxy like this you can use Kafka clients with Pulsar out of the box: no need to add other third-party components or services, only Pulsar, the KOP .nar files, and a handful of configuration entries in broker.conf and proxy.conf. Currently my testing shows that using KOP behind this kind of proxy works really well, and from the user's perspective (sys admin, sys architect...) it is amazing, as it fits the Pulsar picture well. Probably not all users will want to use this approach, and that's fine with me.
What the KOP proxy is doing can already be achieved with existing proxy software. Why do you want to implement this again? In particular, you are implementing it in Java, which has its own deficiencies compared to other proxy software like Envoy.
The problem mostly exists in the Kubernetes world. Envoy and Istio are the de facto standard for proxies and request routing. In the Kubernetes world, people use Helm charts and operators to deploy Pulsar (and KoP). We have included this in our Helm chart. People can easily install a Pulsar cluster with KOP enabled in Kubernetes using that Helm chart. Why do you think a KoP proxy can simplify this process? If you are deploying KoP in an on-prem cluster, you don't need a KoP proxy at all. So I don't know what actual value the KoP proxy brings compared to the existing solution. Instead, I see it introducing additional complexity that we need to maintain.
Can you show why Envoy is not able to address what you are doing? Why would the KOP proxy do better than Envoy?
IMO, if the proxy module was not instructive to the |
As said, in some other use cases the user may be fine with using something else, but those 4 points cannot be addressed with something that is not "part of Pulsar" (except probably point 4, to some extent).
Yes, my idea is to add a new "proxy" Maven module and add new tests that leverage the existing tests ('extends') but launch a proxy and route all the Kafka traffic through the proxy instead of connecting directly to the KOP broker (not all tests are applicable).
If the purpose of the proxy is to help clients outside the Kubernetes cluster connect to brokers inside it, I think we could use advertised listeners: let the client connect to a broker directly and get the correct lookup result, which the client can then reconnect to. I'm working on #669; could that be a solution for you?
Proxy design guiding principles
Authentication and connections to the Brokers
The Kafka Client connects only to the Proxy (possibly to several Proxies, depending on k8s/DNS...).
Authentication is performed only against the Proxy, which keeps track of the User (Principal) and forwards this identity information to the Pulsar (KOP) Broker when it opens the connection to that Broker. We are going to open one TCP connection per Kafka client connection per Broker (see #711).
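The "one connection per Kafka client connection per Broker" rule can be pictured with a small sketch. This is only an illustration in Python, not the proxy's code: `ClientSession` and `BrokerConnection` are hypothetical names, and the real TCP handshake and identity forwarding are replaced by plain objects.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BrokerConnection:
    """Stand-in for one TCP connection from the proxy to a broker."""
    broker: str
    principal: str  # identity the proxy forwards when it opens the connection


@dataclass
class ClientSession:
    """State kept by the proxy for one authenticated Kafka client connection."""
    principal: str
    connections: Dict[str, BrokerConnection] = field(default_factory=dict)

    def connection_to(self, broker: str) -> BrokerConnection:
        # Open at most one connection per broker for this client connection.
        if broker not in self.connections:
            self.connections[broker] = BrokerConnection(broker, self.principal)
        return self.connections[broker]


session = ClientSession(principal="alice")
first = session.connection_to("broker-1:9092")
again = session.connection_to("broker-1:9092")
other = session.connection_to("broker-2:9092")
```

Under these assumptions, `first` and `again` are the same object, while `other` is a second connection that carries the same principal.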
Encryption - TLS Support
The KOP Proxy allows configuring SSL and SASL_SSL listeners the same as for the KOP Broker.
The configuration for TLS is the same as for the Pulsar Proxy, this way you can use the same certificates.
We can implement support for using the same configuration entries as for the KOP Protocol Handler.
Topic Owner Lookup
When the Kafka client refers to a Topic, the Request is forwarded to the Broker that actually owns the Topic, opening a new connection if needed. This can be done using the Pulsar Lookup API.
For metadata requests the response will be restricted to the list of topics that can be reached by the authenticated user: this should already be handled in the KOP Protocol Handler.
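A minimal sketch of the owner lookup, in Python for brevity. The `lookup` callable stands in for the Pulsar Lookup API and is hypothetical, as is the `OwnerResolver` name; caching the result and invalidating it when ownership moves is an assumption of this sketch, not something the design above prescribes.

```python
from typing import Callable, Dict


class OwnerResolver:
    """Resolves the broker that owns a topic partition, remembering past answers."""

    def __init__(self, lookup: Callable[[str], str]):
        self._lookup = lookup              # hypothetical stand-in for the lookup call
        self._cache: Dict[str, str] = {}   # topic partition -> broker address

    def owner_of(self, topic_partition: str) -> str:
        if topic_partition not in self._cache:
            self._cache[topic_partition] = self._lookup(topic_partition)
        return self._cache[topic_partition]

    def invalidate(self, topic_partition: str) -> None:
        # Forget a cached owner, e.g. after a broker reports it no longer owns the topic.
        self._cache.pop(topic_partition, None)


calls = []

def fake_lookup(topic_partition: str) -> str:
    calls.append(topic_partition)
    return "broker-1:9092"

resolver = OwnerResolver(fake_lookup)
resolver.owner_of("orders-0")
resolver.owner_of("orders-0")      # served from the cache
resolver.invalidate("orders-0")
resolver.owner_of("orders-0")      # looked up again
```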
Message forwarding
The Kafka client is supposed to look up the Leader broker for a partition and to issue Produce and Fetch requests directly to that Leader broker. The proxy can easily forward the request to the KOP broker that is the owner of the Topic/Partition.
See the protocol here:
https://kafka.apache.org/protocol.html
Please note that in the “Produce Request” the Broker ID/Address is not cited: the Kafka client is supposed to connect to the Broker that is the leader and then send the Produce PDU.
In this case in the Metadata Response we are always responding with the Proxy address as Leader Broker, so the client is always forced to connect to the Proxy (any instance of the Proxy that answers to the address reported in the Metadata Response for the partition).
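The leader rewriting can be sketched as follows. The dictionaries are a heavily simplified stand-in for the real Metadata Response (which has more fields), and `rewrite_leaders` and the node layout are hypothetical; advertising the proxy as the only node is an assumption of this sketch.

```python
def rewrite_leaders(metadata: dict, proxy_node: dict) -> dict:
    """Return a copy of a simplified metadata response in which the proxy
    is advertised as the leader of every partition."""
    return {
        "brokers": [proxy_node],  # assumption: the client only ever sees the proxy
        "topics": [
            {
                "name": topic["name"],
                "partitions": [
                    {**partition, "leader": proxy_node["id"]}
                    for partition in topic["partitions"]
                ],
            }
            for topic in metadata["topics"]
        ],
    }


from_broker = {
    "brokers": [{"id": 1, "host": "broker-1", "port": 9092},
                {"id": 2, "host": "broker-2", "port": 9092}],
    "topics": [{"name": "orders",
                "partitions": [{"index": 0, "leader": 1},
                               {"index": 1, "leader": 2}]}],
}
proxy = {"id": 0, "host": "kop-proxy.example", "port": 9092}
to_client = rewrite_leaders(from_broker, proxy)
```

The original response is left untouched, so the proxy can still use the real leaders internally when routing requests.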
In case of a "Produce Request" to multiple topics the Proxy must process each Topic (possibly grouping them by Leader broker) and then compose a final result.
For the case of one single topic it is enough to pass the Request to the leader broker.
See the next paragraph for a more detailed explanation.
Splitting the Requests on the Proxy
In the case of the two main APIs: Produce and Fetch (to consume messages) we have this flow (it is basically the same for Produce and Fetch, the example talks about Produce):
The client wants to send a few “records” to several partitions
The ProduceRequest API allows you to batch the requests:
ProduceRequest:
The constraint from the Client Point of View is that the Broker must be the leader of every partition in the ProduceRequest
In the case of the KOP Proxy, the Client thinks that the KOP Proxy is the leader of every topic and every partition in the world, so it batches the writes as much as possible.
The Proxy has two cases:
-- All the partitions are owned by the same Broker: we can forward the PDU to that Broker and proxy the request, transparently
-- The partitions are owned by different Brokers: we have to split the PDU, and then:
-- Send the smaller PDUs to each Broker, in parallel
-- Wait for all the brokers
-- Recompose the Response
-- Send the Response
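The split, parallel send, and recompose steps can be sketched as below. This is an illustration under simplifying assumptions: a request is modelled as a plain mapping from partition name to records, `owner_of` and `send` are hypothetical callables (owner lookup and a blocking call to one broker), and error handling and timeouts are left out.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_leader(request: dict, owner_of) -> dict:
    """Group the partitions of one request by the broker that owns them."""
    per_broker = defaultdict(dict)
    for partition, records in request.items():
        per_broker[owner_of(partition)][partition] = records
    return dict(per_broker)


def proxy_produce(request: dict, owner_of, send) -> dict:
    per_broker = split_by_leader(request, owner_of)
    if not per_broker:
        return {}
    if len(per_broker) == 1:
        # Single owner: forward the request as-is.
        (broker, sub_request), = per_broker.items()
        return send(broker, sub_request)
    # Several owners: send the smaller requests in parallel and wait for all of them.
    with ThreadPoolExecutor(max_workers=len(per_broker)) as pool:
        futures = [pool.submit(send, broker, sub_request)
                   for broker, sub_request in per_broker.items()]
        merged = {}
        for future in futures:
            merged.update(future.result())  # recompose one response
    return merged


owners = {"orders-0": "broker-1", "orders-1": "broker-2", "users-0": "broker-1"}

def fake_send(broker: str, sub_request: dict) -> dict:
    # Pretend every partition was written successfully by `broker`.
    return {partition: f"ok@{broker}" for partition in sub_request}

response = proxy_produce(
    {"orders-0": ["a"], "orders-1": ["b"], "users-0": ["c"]},
    owners.__getitem__,
    fake_send,
)
```

In this example two sub-requests are sent (one per broker), and the client receives a single response that covers all three partitions.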
Requests for Coordinators
We have many request types that should be forwarded to a "Coordinator" (Group/Transaction).
In this case the proxy looks up the current coordinator for the given "group" and forwards the request to the KOP Broker accordingly.
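As a rough sketch of the coordinator lookup: the group is mapped to a partition of an internal topic, and the owner of that partition is treated as the coordinator. The hash-modulo mapping below follows the general approach of Kafka's group coordinator, but whether KOP uses exactly this function is an assumption; the topic name, partition count, and `owner_of` callable are hypothetical.

```python
def java_string_hash(s: str) -> int:
    """Java's String.hashCode() as a signed 32-bit value
    (matches Java for characters in the Basic Multilingual Plane)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def coordinator_partition(group_id: str, num_partitions: int) -> int:
    # Clear the sign bit, then take the remainder (assumed mapping).
    return (java_string_hash(group_id) & 0x7FFFFFFF) % num_partitions


def find_coordinator(group_id: str, num_partitions: int, owner_of) -> str:
    # Hypothetical internal topic name; the real name depends on the KOP configuration.
    partition = coordinator_partition(group_id, num_partitions)
    return owner_of(f"__consumer_offsets-{partition}")


def fake_owner_of(topic_partition: str) -> str:
    # Stand-in for the topic owner lookup: spread partitions over two brokers.
    index = int(topic_partition.rsplit("-", 1)[1])
    return f"broker-{index % 2 + 1}:9092"


coordinator = find_coordinator("my-group", 8, fake_owner_of)
```

Every request for the same group resolves to the same partition, so the proxy forwards all of them to the same broker until ownership of that partition changes.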