Netbird can't query users when using newer versions than Zitadel 2.61.0 #2616

Closed
Kidswiss opened this issue Sep 18, 2024 · 12 comments

@Kidswiss

Describe the problem

After updating Zitadel to 2.61.2 or anything newer, Netbird can no longer query the Zitadel user endpoint.

To Reproduce

Steps to reproduce the behavior:

  1. Install Netbird v0.29.3
  2. Install Zitadel 2.61.1 or newer
  3. Get 403 during login

Expected behavior

Zitadel integration should still work if it gets updated.

Are you using NetBird Cloud?

Selfhosted

NetBird version

0.29.3

Additional context


Netbird management logs

2024-09-18T14:31:43Z WARN [context: SYSTEM] management/server/account.go:1017: failed warming up cache due to error: unable to post https://idp.secret.ch/management/v1/users/_search, statusCode 403

Zitadel log entries:

time="2024-09-18T14:36:52Z" level=warning msg="token verifier repo: decrypt access token" caller="/home/runner/work/zitadel/zitadel/internal/authz/repository/eventsourcing/eventstore/token_verifier.go:283" error="ID=APP-ASdgg Message=invalid token"

I've tried re-creating the service account secret, but the error persisted. I'm not sure whether this is an issue on Zitadel's side or on Netbird's, but since Netbird is the only app I had issues with, I opened the bug here.
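For anyone who wants to reproduce the failing call outside of Netbird, here is a minimal Python sketch that builds the same kind of request the management log shows returning 403. The path `/management/v1/users/_search` comes from the log above; the function name, the example host, and the empty JSON search body are assumptions made for illustration, not what Netbird necessarily sends.

```python
import json
import urllib.request


def build_user_search_request(idp_base_url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the user search endpoint seen in the log."""
    url = idp_base_url.rstrip("/") + "/management/v1/users/_search"
    # Assumption: an empty JSON object as the search body, for illustration only.
    body = json.dumps({}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder host and token; nothing is sent over the network here.
    req = build_user_search_request("https://idp.example.com", "PLACEHOLDER_TOKEN")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` with a real service-account token should show whether the 403 also occurs without Netbird in the loop.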

@allroundtechie

allroundtechie commented Sep 18, 2024

Can confirm the same issue with Zitadel 2.62.1 and Netbird 0.29.3
Additional logs from netbird-management container:
ERRO [requestID: 098374cd-f244-4be6-91f4-9b3e02fb292f, context: HTTP] management/server/http/util/util.go:81: got a handler error: token invalid
ERRO [context: HTTP, requestID: 098374cd-f244-4be6-91f4-9b3e02fb292f] management/server/http/middleware/auth_middleware.go:89: Error when validating JWT claims: unable to post https://bla.blabla.com/management/v1/users/_search, statusCode 403

The logs in the Zitadel container are identical to those above.

It worked for months across several version combinations of Netbird and Zitadel. I am usually quite fast with updates and had no issues until the last update of Netbird and Zitadel. So I guess something changed in either Netbird or Zitadel in the last 1-2 releases, and that is the root cause of this issue.

@bcmmbaga
Contributor

I see that Zitadel released v2.62.1 two days ago, but they have now marked v2.59.3 as the latest version. Could you try using v2.59.3 (latest) for now or rollback to the previous version that was working for you?

In the meantime, we will run tests to confirm the breaking changes and update the NetBird Zitadel implementation accordingly.

@allroundtechie

allroundtechie commented Sep 19, 2024

> I see that Zitadel released v2.62.1 two days ago, but they have now marked v2.59.3 as the latest version. Could you try using v2.59.3 (latest) for now or rollback to the previous version that was working for you?
>
> In the meantime, we will run tests to confirm the breaking changes and update the NetBird Zitadel implementation accordingly.

Tagging version 2.59.3 as "latest" is surely a mistake on Zitadel's side.
See https://github.com/zitadel/zitadel/releases
They have updated several release lines in the last few days (from 2.54.x to 2.62.x), all mentioning the same three bug fixes.

@adasauce
Contributor

adasauce commented Sep 19, 2024

I just wanted to follow up with both a "me too" and some info from the zitadel side. The events history does say a token was created and authenticated properly for me, so it appears to be some kind of permission issue just with the netbird user accessing that endpoint.

This was all working previously for many months.

I have some experience writing integrations with zitadel, I'll poke around to see what netbird is calling vs. what the api is expecting.

edit:

I added some extra logging and error response parsing into the management server and zitadel is responding with:

failed warming up cache due to error: zitadel error code: 7 message: could not read projectid by clientid (AUTH-GHpw2)

will continue poking around
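The kind of error response parsing described above could look roughly like this in Python. It is only a sketch: it assumes the error body is JSON with `code` and `message` fields (inferred from the logged output, not checked against Zitadel's API definition), and the function name is made up. If `code` is a gRPC status code, 7 is PERMISSION_DENIED, which would line up with the HTTP 403.

```python
import json


def describe_idp_error(status_code: int, raw_body: bytes) -> str:
    """Turn an error response into a log-friendly string.

    Falls back to the bare status code when the body is not the
    assumed JSON shape ({"code": ..., "message": ...}).
    """
    try:
        payload = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        payload = None
    if not isinstance(payload, dict) or "message" not in payload:
        return f"statusCode {status_code}"
    return f"zitadel error code: {payload.get('code')} message: {payload['message']}"
```

Logging the parsed message instead of only the status code is what makes the underlying reason visible rather than a bare 403.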

edit2:

so it looks like the client id we use to authenticate ("netbird", per the docs) plus the client secret are getting encoded into the JWT returned from zitadel, and we're using that client id "netbird" to make requests.

zitadel, on the other hand, is doing some work to verify the access token and is looking up the client_id from the access token we pass in. they look up that client_id in the registered apps list to see which app and project it should belong to. but "netbird" isn't the client id of the app, it's 234872394...@netbird.

however if we use that client id to perform the management query, they're logging this error:

oidc_error.parent="ID=QUERY-Dfbg2 Message=Errors.User.NotFound Parent=(sql: no rows in result set)" oidc_error.description="client not found" oidc_error.type=invalid_client status_code=400

there's definitely some confusion happening on what credentials should be used
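To check which client id actually ends up in the access token, the JWT payload can be decoded locally. A minimal Python sketch follows; it deliberately skips signature verification, so it is for debugging only. The claim name used in the usage note (`client_id`) is an assumption, since the thread does not say which claim carries the value.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    payload = parts[1]
    # base64url segments in JWTs drop their padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Printing `decode_jwt_claims(token).get("client_id")` for the token returned by zitadel would show whether it carries the plain service user name or an app client id of the `...@netbird` form.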

@adasauce
Contributor

another follow-up:

I added a PAT for the netbird user and changed the management service to overload ClientSecret and the Authenticate method: instead of authenticating to get a JWT, it just builds a pretend JWT whose AccessToken is the PAT. Everything seems to be working fine this way, since the code simply concatenates Bearer + accessToken to assemble the header before a request is made.

I think it would be a relatively simple change to just use a PAT and refactor the config a bit if we want to swerve this issue. I'll keep tweaking configurations and hacking on both sides to see if I can find the real cause though.

In the meantime at least my management service is back online :)
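The idea behind this workaround can be sketched in Python as below. This is not the actual Netbird code (which is Go); `TokenInfo`, `PatCredentials`, and `authorization_header` are hypothetical names used only to show the shape of the change: the authenticate step returns the PAT directly instead of requesting a token.

```python
from dataclasses import dataclass


@dataclass
class TokenInfo:
    """Hypothetical stand-in for the token object a credentials manager returns."""
    access_token: str
    token_type: str = "Bearer"


class PatCredentials:
    """Hypothetical credentials object that returns a fixed PAT instead of fetching a JWT."""

    def __init__(self, personal_access_token: str) -> None:
        self._pat = personal_access_token

    def authenticate(self) -> TokenInfo:
        # No token request is made: the PAT is used as the access token as-is.
        return TokenInfo(access_token=self._pat)


def authorization_header(token: TokenInfo) -> str:
    """Assemble the header value the same way for a JWT or a PAT."""
    return token.token_type + " " + token.access_token
```

Because the header is assembled the same way either way, the request code does not need to know whether the token came from the token endpoint or from a PAT.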

@adasauce
Contributor

more extra data:

I also added support in netbird for using the Bearer "Access Token Type" instead of JWT from zitadel, and I get the same "could not read projectid by clientid" error as before. So it has nothing to do with receiving and passing the JWT access token.

I also tried adding the urn:zitadel:iam:org:project:id:{projectid}:aud scope to the scopes when making the access token request as noted here: https://zitadel.com/docs/guides/integrate/service-users/client-credentials#2-authenticating-a-service-user-and-request-a-token but that also didn't make a difference.
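For reference, adding that scope to a client credentials token request could look like the Python sketch below. It only builds the form body. The `openid` scope and the function name are assumptions for illustration, and how the client id and secret are sent (for example via HTTP Basic auth) is left out; the linked guide is the authority on the exact request.

```python
from urllib.parse import parse_qs, urlencode


def client_credentials_form(project_id: str) -> str:
    """Build the form-encoded body for a client_credentials token request."""
    scopes = [
        "openid",  # assumption: included for illustration
        f"urn:zitadel:iam:org:project:id:{project_id}:aud",
    ]
    return urlencode({"grant_type": "client_credentials", "scope": " ".join(scopes)})


if __name__ == "__main__":
    # Placeholder project id; parse the body back to show what it contains.
    print(parse_qs(client_credentials_form("PROJECT_ID")))
```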

@adasauce
Contributor

adasauce commented Sep 20, 2024

I'm getting another chance to look at this today, and at this point I'm pretty sure there's some undesired behaviour going on on the zitadel side. I've followed all of the specs from A to Z to build this token for a service user from their docs and their examples, but none of them will authenticate.

I think it may have been introduced in a big refactor on their side at 8e0c8393. If I make a small change to the auth flow in zitadel so that it does not assume every client_id requires a project lookup, checking clientid against projectid only when it's of the @ format and otherwise continuing with the rest of the auth flow, everything works again. I'm going to open an issue on the zitadel side and see if I can learn some more there.

edit: though there's not much talk on their github issues list about this, I found some folks complaining in discord about service accounts not working with the same error.
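The condition described above (only do the project check when the client id is of the @ format) can be sketched in Python like this. It is purely illustrative and is not zitadel's code; the function name and the exact definition of "the @ format" (a non-empty part on each side of a single `@` split) are assumptions.

```python
def should_check_project(client_id: str) -> bool:
    """Return True only for client ids shaped like '<id>@<name>'.

    Plain ids without an '@' (such as a service user name) skip the
    project lookup and continue with the rest of the auth flow.
    """
    left, sep, right = client_id.partition("@")
    return bool(sep) and bool(left) and bool(right)
```

With this check, a plain service user id such as `netbird` would bypass the lookup that currently fails with "could not read projectid by clientid".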

@adasauce
Contributor

I'm seeing the issue pop up in some other places as well, e.g. zitadel/terraform-provider-zitadel#199. Linking for posterity.

@alexcupertme

Can confirm the issue exists on Zitadel v2.62.1.
Downgraded to v2.61.1 and everything works perfectly, even after I rebooted Netbird and Zitadel to try to trigger the issue.

@allroundtechie

> Can confirm the issue exists on Zitadel v2.62.1. Downgraded to v2.61.1 and everything works perfectly, even after I rebooted Netbird and Zitadel to try to trigger the issue.

Thanks for the hint. Can confirm 2.61.1 gets the access to the dashboard working again.

@Kidswiss
Author

Kidswiss commented Oct 2, 2024

I've just tested against Zitadel's latest release, it's working now for me.

https://github.com/zitadel/zitadel/releases/tag/v2.62.4

@allroundtechie

Can confirm the issue is gone with Zitadel 2.63.1, too.

@mgarces mgarces closed this as completed Nov 4, 2024

6 participants