Getting 'auth error: EOF' when referencing a GitRepository pulling via SSH #210

Closed
wolfmah opened this issue Aug 5, 2021 · 9 comments

Comments

@wolfmah

wolfmah commented Aug 5, 2021

When creating an ImageUpdateAutomation that references a GitRepository in HTTPS mode, the ImageUpdateAutomation is able to reconcile. When it references a GitRepository in SSH mode, it fails with this error: auth error: EOF.

GitRepository in HTTPS mode:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: my-app-https
  namespace: flux-system
spec:
  gitImplementation: go-git
  ignore: |
    # exclude all
    /*
    # include kustomize dir
    !/kustomize
    # exclude file extensions from kustomize dir
    /kustomize/**/*.md
    /kustomize/**/*.txt
  interval: 15m0s
  ref:
    branch: develop
  secretRef:
    name: bitbucket-https-credentials
  timeout: 20s
  url: https://bitbucket.org/my-group/my-app

GitRepository in SSH mode:

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: my-app-ssh
  namespace: flux-system
spec:
  gitImplementation: go-git
  ignore: |
    # exclude all
    /*
    # include kustomize dir
    !/kustomize
    # exclude file extensions from kustomize dir
    /kustomize/**/*.md
    /kustomize/**/*.txt
  interval: 15m0s
  ref:
    branch: develop
  secretRef:
    name: bitbucket-ssh-credentials
  timeout: 20s
  url: ssh://git@bitbucket.org/my-group/my-app

ImageUpdateAutomation (only spec.sourceRef.name changes when testing the two GitRepository objects):

apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: my-app
  namespace: flux-system
spec:
  git:
    checkout:
      ref:
        branch: develop
    commit:
      author:
        email: flux@my-domain.com
        name: flux
      messageTemplate: |
        [ci skip] -
        {{range .Updated.Images}}{{println .}}{{end}}
    push:
      branch: develop
  interval: 15m0s
  sourceRef:
    apiVersion: source.toolkit.fluxcd.io/v1beta1
    kind: GitRepository
    name: my-app-https
  update:
    path: ./kustomize/dev
    strategy: Setters

The status result when spec.sourceRef.name == my-app-ssh:

status:
  conditions:
  - lastTransitionTime: "2021-08-05T11:13:23Z"
    message: 'auth error: EOF'
    reason: ReconciliationFailed
    status: "False"
    type: Ready
  lastAutomationRunTime: "2021-08-05T11:09:05Z"
  lastHandledReconcileAt: "2021-08-05T07:15:48.299246-04:00"
  observedGeneration: 3

The status result when spec.sourceRef.name == my-app-https:

status:
  conditions:
  - lastTransitionTime: "2021-08-05T11:16:58Z"
    message: no updates made
    reason: ReconciliationSucceeded
    status: "True"
    type: Ready
  lastAutomationRunTime: "2021-08-05T11:17:28Z"
  lastHandledReconcileAt: "2021-08-05T07:16:57.526803-04:00"
  observedGeneration: 4
@kingdonb
Member

kingdonb commented Aug 6, 2021

Are you certain the SSH key used is a read-write one? Bitbucket "Access Keys", the analog of "Deploy Keys" on GitHub and friends, are read-only, with no read-write option on Bitbucket Cloud.

ImageUpdateAutomation writes to the repo, so it needs a read-write key. I am not certain how this error presents from Bitbucket on push when a key with read-only access is used, so I would be suspicious of any auth error from Bitbucket as potentially due to this issue.

(You could add your SSH key to a machine account with write access to work around this issue, if that is the case.)

@wolfmah
Author

wolfmah commented Aug 6, 2021

To make it more explicit, yes, this project is on Bitbucket, as subtly hinted at in the manifests.

Bitbucket "Access Keys" [...] are read-only
We don't use Access Keys from the repository settings page. Our setup is as follows.

We have a full-fledged Flux bot user. On that user, we have set up SSH keys to access our multiple k8s clusters, the same way that my own user has an SSH key for working with git/Bitbucket.

Then, access to the repositories is managed per repository: each repository decides which users/groups have access, and at what level (read/write/admin).

On the relevant repos that we want Flux to automate, we set up the bot user with write access. I even tried with admin, but it didn't help. If the GitRepository is working over SSH, ImageUpdateAutomation won't be able to push commits.

To test it all out, we made a setup where we used an App Password (with read/write permission for repositories). By setting the GitRepository with those credentials, and the appropriate url, ImageUpdateAutomation is able to push commits.

The thing is, those are not brand new SSH keys or a brand new setup for accessing our repos. Those were set up and used by our Flux v1 integration (which had auto image update enabled). :/

@kingdonb
Member

kingdonb commented Aug 9, 2021

Hmm! So you are able to work around this with an app password, but that uses HTTP auth, which is probably undesirable.

(Moreover, if there is a bug in the SSH implementation, you will not be the only users affected and we will want to have a bead on the issue.)

I have a Flux cluster set up on Bitbucket cloud, but I have not tested ImageUpdateAutomation there with SSH. I will get around to it within the next couple of days, then perhaps we can debug the issue on our own without taking up your time.

Thank you for the report! Can you please confirm which version of Flux you are using? If it is anything earlier than 0.16.2, can you also confirm whether the problem persists after upgrading to the latest Flux version?

@squaremo
Member

I did this

  • create a new project in my account on bitbucket.org
  • use ssh-keygen to create a new SSH key
  • paste the public key into my account's SSH keys (treating myself as a bot)
  • flux bootstrap [...] --private-key-file=id_rsa in a fresh cluster, to install flux and use the key I just created
  • make an ImageRepository and ImagePolicy for the source-controller image, and an ImageUpdateAutomation object pointing at the git repo
  • put an update marker in flux-system/gotk-components.yaml referring to the policy, commit and push

The last few steps are just a minimal way to get the automation to make commits -- I go and edit the image version in the file to an old one and commit that, it updates to the latest.

Then I tried different kinds of key by recreating the secret with

flux create secret git --url ... --ssh-key-algorithm ... flux-system

(and pasting the fingerprint into bitbucket).

I found that it worked both with an RSA key and with ECDSA (default number of bits / curve). With an ED25519 key, image-automation-controller fails to clone the repo with the error message Failed to authenticate SSH session: Unable to extract public key from private key. I think this is a limitation of libgit2, at present anyway.

@wolfmah Can you see any significant differences between what I've done and your setup?

@wolfmah
Author

wolfmah commented Sep 30, 2021

@kingdonb
I just tried the same steps as I described above, but with the latest 0.17.2. Same result: auth error: EOF.

Can you see any significant differences between what I've done and your setup?
@squaremo

Here's what was already in place:

  • A private corporate Bitbucket Workspace that I (and our Flux bot) have access to.
  • A test application that our team uses as a sandbox. This private repository belongs to the aforementioned workspace.
  • The Flux bot already has SSH keys created for all our environments (in RSA format). As of now, they are used by Flux v1.

Now, for the new stuff:

  • Flux is installed via Terraform. We export the installation manifests via: flux install --export ..., copy the output inside a module then use a Kustomize provider to patch some stuff (like requests/limits of controllers, annotations/labels, etc.).
  • This Terraform module also creates the Kubernetes secrets: bitbucket_https_credentials and bitbucket_ssh_credentials.

The rest is pretty much the same: create ImageRepository + ImagePolicy, along with the GitRepository + Kustomization. Then the ImageUpdateAutomation, pointing to the GitRepository and either the HTTPS or SSH secret. Then, pushing some commit and letting our CI generate a new Docker artifact that will eventually get picked up by the ImagePolicy.

It feels like I'm missing something, because squaremo, if you are able to work with Bitbucket and SSH, I should be able to as well. :/

@squaremo
Member

squaremo commented Oct 5, 2021

Making the SSH secret with Terraform is a difference -- are you able to try using flux create secret git ... to create the SSH secret? That would rule it out, at least.

@hiddeco
Member

hiddeco commented Oct 9, 2021

The latest release of the image-automation-controller (v0.15.0) contains libgit2 linked against OpenSSL and LibSSH2, which based on my research and extensive testing, should solve most issues around private key formats.

NB: if the issue continues to exist, I think it is happening due to a handshake issue and you may want to confirm the key in known_hosts isn't of an ECDSA* type. We will support these types soon as well, but need to get the libgit2 >=1.2.0 in a working shape first as support was added for this in 1.2.0.
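To follow up on the known_hosts suggestion above, here is a minimal shell sketch that lists the host key types present in known_hosts data. The function names are hypothetical, and the kubectl line in the comments assumes the secret stores the data under a known_hosts field, which is what Flux v2 secrets normally use.

```shell
# Print the key type (second field) of every entry in known_hosts data read
# from stdin. Blank lines and comments are skipped. Entries that start with a
# marker such as @cert-authority are ignored by this sketch.
known_hosts_key_types() {
  awk 'NF >= 3 && $1 !~ /^#/ && $1 !~ /^@/ { print $2 }'
}

# Succeed (exit status 0) if any entry read from stdin uses an ECDSA host key,
# i.e. a key type beginning with "ecdsa-".
has_ecdsa_host_key() {
  known_hosts_key_types | grep -q '^ecdsa-'
}

# Hypothetical usage against the secret named in this thread
# (needs kubectl and base64 on the machine running it):
#   kubectl -n flux-system get secret bitbucket-ssh-credentials \
#     -o jsonpath='{.data.known_hosts}' | base64 --decode | known_hosts_key_types
```

If the output contains only ssh-rsa, the ECDSA limitation mentioned above is unlikely to be the cause.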

@wolfmah
Author

wolfmah commented Oct 12, 2021

I just tested in 0.18.2 and the same problem happens.

But...

Making the SSH secret with Terraform is a difference

It is indeed a difference that I didn't account for before. I don't know how the old keys were generated, but I can see that they are RSA keys. Though, for testing, I personally created a new key-pair and used flux create secret git ... to upload it to the cluster.

ssh-keygen -b 3072 -t rsa -f ~/.ssh/bitbucket_flux-test_id_rsa -q -N ""

flux --context=my-cluster -n flux-system create secret git bitbucket-ssh-credentials-test --url=ssh://git@bitbucket.org/my-group/my-app  --private-key-file=/my-full-home-path/.ssh/bitbucket_flux-test_id_rsa

Using that new secret inside my already existing GitRepository is making the ImageUpdateAutomation function properly. Thanks for pointing it out @squaremo , now I know the old keys are bonkers! Either them specifically, or when they are uploaded to the cluster via Terraform (I'm betting on the latter one).
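One way to check whether the old keys themselves differ from the new one is to compare their container formats, since private key formats came up earlier in the thread. The sketch below classifies a private key by its first line. The function name and the labels it prints are made up for illustration; the header strings are the standard PEM/OpenSSH markers.

```shell
# Classify a private key by the first line of the data read from stdin.
# Only the header is inspected; the key material is never parsed.
private_key_format() {
  header=""
  IFS= read -r header || true   # tolerate input without a trailing newline
  case "$header" in
    "-----BEGIN OPENSSH PRIVATE KEY-----")   echo "openssh" ;;
    "-----BEGIN RSA PRIVATE KEY-----")       echo "pem-rsa" ;;
    "-----BEGIN EC PRIVATE KEY-----")        echo "pem-ec" ;;
    "-----BEGIN PRIVATE KEY-----")           echo "pkcs8" ;;
    "-----BEGIN ENCRYPTED PRIVATE KEY-----") echo "pkcs8-encrypted" ;;
    *)                                       echo "unknown" ;;
  esac
}

# Hypothetical usage with the key file created above:
#   private_key_format < ~/.ssh/bitbucket_flux-test_id_rsa
```

An "unknown" result for a key that looks fine in an editor may point at leading whitespace or Windows line endings, either of which could be introduced when the key is passed through another tool.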

@wolfmah wolfmah closed this as completed Oct 12, 2021
@squaremo
Member

Thanks for reporting back! 👍

Either them specifically, or when they are uploaded to the cluster via Terraform (I'm betting on the latter one).

It may just be that the fields in .data get different names when you create them with Terraform -- if they were originally made for Flux v1, then this is probably so. (If you are able to check, it would be good to verify this).
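To verify the field names as suggested, a small sketch like the following could help. It assumes that Flux v2 needs at least the identity and known_hosts fields for SSH (flux create secret git also writes identity.pub); that list is an assumption of this sketch and should be checked against the source-controller docs for the version in use. The function name is hypothetical.

```shell
# Given the key names found in a Secret's .data (one per argument), print the
# names this sketch assumes Flux v2 needs for SSH but which are absent.
missing_ssh_secret_keys() {
  for required in identity known_hosts; do
    found=no
    for key in "$@"; do
      if [ "$key" = "$required" ]; then
        found=yes
      fi
    done
    if [ "$found" = no ]; then
      echo "$required"
    fi
  done
}

# Hypothetical usage against the Terraform-created secret (needs kubectl and jq):
#   missing_ssh_secret_keys $(kubectl -n flux-system get secret \
#     bitbucket-ssh-credentials -o json | jq -r '.data | keys[]')
```

No output means both assumed fields are present. Any printed name is a field the Terraform-created secret lacks compared with one made by flux create secret git.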
