Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GSSAPI SASL mechanism based on gokrb5/v8 #598

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

viasat-akozhevnikov
Copy link

This PR implements GSSAPI/Kerberos support.

I previously commented in PR #563 from my personal account (@mentalisttraceur) about how we also implemented GSSAPI/Kerberos support at Viasat and just hadn't yet had time to create the PR yet - this is that PR.


Key differences from PR #563:

  1. Supports Kafka clusters with multiple brokers.

  2. The Kerberos client is dependency-injected and managed externally.

  3. Some of the GSSAPI internals are explained more thoroughly in the code comments.

  4. Supports authenticating with either username+password or keytab file.


I won't take anything personally, so:

  • If you disagree with something I did or have criticisms/critique/concerns, or want me to change something, feel free to bring it up. For example, if we come up with a better way than dependency-injecting kerberos client objects from gokrb5, I'll be happy to implement it.

  • I don't care which GSSAPI/Kerberos PR "wins", I just want to make sure that the end result supports complete functionality that we at Viasat and other typical users might need. For example, if this PR inspires improvements to feat(sasl): add GSSAPI as authentication mechanism #563 to support multiple brokers and username+password as an alternative to keytab, I will be fine if feat(sasl): add GSSAPI as authentication mechanism #563 gets merged instead of this one.


I will post another comment in here later to elaborate on my reasons for some choices I made in this PR, some alternatives I considered, and so on.

@mentalisttraceur
Copy link

Posting this comment from my personal account to certify that I am indeed @viasat-akozhevnikov .

@viasat-akozhevnikov
Copy link
Author

viasat-akozhevnikov commented Feb 4, 2021

Regarding how I added support for multiple brokers:

I added a sasl.NeedsHost interface in sasl/sasl.go that SASL mechanisms can optionally implement, which the dialer.go and transport.go use if available to create a copy of the mechanism parameterized with the host.

I did it this way to maintain backwards compatibility as much as possible:

  1. Since public functions/structs in this library take sasl.Mechanism instances, and changing the signature/definition of those functions/structs would be a breaking change, I had to have the GSSAPI mechanism implement that interface.

  2. I considered changing the sasl.Mechanism interface's Start method to take a host string, but since that interface is public, I decided to treat it as if it was part of the official API: I avoided changing it because that there might be third-party code out there which implements that interface and relies on that interface being exactly as it is.

  3. But the existing sasl.Mechanism interface is inadequate for SASL mechanisms which need per-host or per-connection information, so I had to do something to inject additional information.

  4. Mutating the mechanism instance itself to set the host was a no-go, because mutating an existing mechanism would create race conditions (a user might reasonably assume they can reuse the same SASL mechanism between multiple connections started in separate goroutines at the same time, because it looks like plain static configuration).

  5. So I chose the approach which lends itself best to a clean creation of a host-parameterized copy of the GSSAPI SASL mechanism object: an optional interface with a single method that takes the host string as an argument, and returns another sasl.Mechanism.

Unfortunately, the combination of these constraints meant that in the GSSAPI mechanism, I had to make the Start method return an error if the host has not yet been set by a call to WithHost. Normally I don't like introducing possible errors-of-misuse like that (ideally an API is shaped to minimize those, especially the less obvious ones like "you had to call that other method before you can call this method"). But the good news is that the need to deal with this is localized to the implementation of the SASL authentication communication - in other words, so long as kafka-go uses it correctly, the user plugging the GSSAPI mechanism into kafka-go never has to know about it and will never trigger that error.

Comment on lines +47 to +48
func (m mechanism) WithHost(host string) sasl.Mechanism {
m.host = host
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am curious if these two lines are obvious enough?

The fact that this returns a copy of the GSSAPI mechanism is implicit in using a non-pointer method receiver, and the reason for it is also left implicit.

I think if I were reading this code it would be obvious, so I didn't want to patronizingly draw attention to it, but I am happy to make this more explicit if desired. For example, I can see some benefits to writing it like this instead:

Suggested change
func (m mechanism) WithHost(host string) sasl.Mechanism {
m.host = host
func (m *mechanism) WithHost(host string) sasl.Mechanism {
// Creating a copy avoids race conditions if the
// mechanism is reused for multiple connections:
copy := *m
copy.host = host

@achille-roussel achille-roussel self-assigned this Feb 5, 2021
@achille-roussel
Copy link
Contributor

Hello @viasat-akozhevnikov, thanks for the contribution 👍

I would be happy to merge the extensions you added to the sasl package, tho I'm a bit concerned about taking on the maintenance of sasl/gssapi as we don't have any use case for it within Segment, nor expertise maintaining this SASL mechanism.

Would you be open to hosting this package yourself and have us include a link to your repository in the documentation?

@mentalisttraceur
Copy link

mentalisttraceur commented Jan 19, 2022

By the way, the context-based approach in #725 ended up meeting our needs just as well as the WithHost interface in this PR.

Rebasing the sasl/gssapi piece to get the host from the context was trivial.

Sorry I never got back to you on this one. The proposal to split it apart and merge only the change to the sasl package was fine for us, we just never got around to changing this PR or opening a new one to have just that change by itself.

Ultimately I think it was maybe for the best. At the time when I wrote this I thought the .WithHost(...) interface was better than shoving it through the context, but now I think a good argument could be made that the context approach is better than what I did.

As for maintaining the sasl/gssapi part separately. That's in Viasat's hands now. There was some willingness to do it on my team, but we never got around to it, and I don't think I'm going to be involved with it anymore, so I can't say either way.

@robcowart
Copy link

This PR imports jcmturner/gokrb5. There are licensing issues with jcmturner/gokrb5 which is why we were looking for a Sarama replacement and evaluating segmentio/kafka-go. This PR would introduce the same problems.

While jcmturner/gokrb5 selected in its GitHub settings that it is licensed under Apache-2.0, this license has specific requirements for it to apply. You can see these requirements in GitHub if you open the LICENSE file page.

image

The important one to note here is “and copyright notice”. This requirement comes from the end of the license “How to apply the Apache License to your work”. It states…

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "{}"
replaced with your own identifying information. (Don't include
the brackets!)  The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

The boilerplate is one that we have all likely often seen, without considering its significance. Most importantly it is here where the copyright owner(s) of the code must be specified and where the application of the Apache 2.0 License is actually declared (those GitHub settings mean nothing).

Copyright {yyyy} {name of copyright owner}

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

No where in jcmturner/gokrb5 has the boilerplate or any other copyright claim or notice been included. Without this the Apache 2.0 License has never been applied, leaving the project in a license limbo. With no applicable license the code is not open source. While you can openly see it, the right to use or distribute it was thus never granted. This is particularly problematic for this project as it is imported by A LOT of other projects. Unfortunately it seems that jcmturner has been completely inactive since 2020, so a chance of a copyright notice being added appears unlikely.

I have opened an issue on jcmturner/gokrb5 here... jcmturner/gokrb5#461 . If the copyright notice is added to jcmturner/gokrb5 it won't affect this PR which imports it. Until then we would prefer not to have exposure if we use segmentio/kafka-go.

@mentalisttraceur
Copy link

Good to know and nice catch about the licensing details @robcowart.

By the way I'm no longer with Viasat so I can't update this PR or speak to their intentions with it.

@sdojjy
Copy link

sdojjy commented Dec 27, 2022

@viasat-akozhevnikov do you still working on this PR?

@mentalisttraceur
Copy link

@sdojjy nope. Per my last two comments on this issue:

the context-based approach in #725 ended up meeting our needs just as well [as this PR]

I'm no longer with Viasat so I can't update this PR

That leaves just the sasl/gssapi/gssapi.go piece, but as @achille-roussel said:

I'm a bit concerned about taking on the maintenance of sasl/gssapi as we don't have any use case for it within Segment, nor expertise maintaining this SASL mechanism.

@mentalisttraceur
Copy link

mentalisttraceur commented Dec 27, 2022

By the way, the legal technicality uncertainty with jcmturner/gokrb5 missing a copyright notice (brought up in an earlier comment on this issue) has been resolved upstream.

@JorTurFer
Copy link

JorTurFer commented Aug 2, 2023

Is there any update? In KEDA we would be interested on this feature. We are integrating kafka-go to replace sarama client and this feature'd be nice

@vasily-kirichenko
Copy link

Do I understand right that it could be merged as soon as the conflicts are solved (as long as the licensing issue seems to be resolved)? I could resolve the merge conflicts and make a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants