Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kola: build kolet statically #1896

Merged
merged 1 commit into from
Dec 1, 2020
Merged

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Nov 23, 2020

Right now, we're building kolet in the cosa buildroot and then copying
it into the target system. This worked for a while by pure chance,
because it's really easy to hit a library mismatch between the cosa and
target environments. And now we do hit this case when using f33 to build
RHCOS due to symbol versioning in glibc.

The proper way to fix this would be for cosa to build kolet in a
buildroot that matches the target OS. This is resolvable, though it
would significantly affect the developer experience because devs would
now have to also download and maintain the target buildroot.

Target-specific buildroots are still something valuable to have (because
for dev and CI you do want a matching buildroot when hacking on host
components) but purely for cosa given that kolet is the only compiled
code it needs to run directly on the host, it's a very costly approach.

This patch instead addresses this simply by building kolet statically.
This has its own set of trade-offs (see discussions in
#1896), though should
work fine for the time being.

(In a previous version, this patch copied the libraries into the node
and used rpath to have kolet pick them up. Another approach I considered
was running kolet in a container, though (1) we would need to implement
some image caching to make it less expensive and (2) it would mean
having our test harness rely on the container runtime itself which seems
circular.)

And of course, we can eventually move to using buildroots in the future
once we flesh out the UX better there and the benefits outweigh the
costs.

This allows moving cosa to f33 without breaking RHCOS.

Partially addresses: #1863

@cgwalters
Copy link
Member

Given that golang bundles everything anyway, all we really need to copy is what the golang runtime needs, which is glibc and libpthreads.

Hmm...I think the reason this isn't generally used is it's not much better than fully statically linking. The reason here is that glibc expects to be able to dynamically load e.g. nss plugins, which you aren't copying here.

So I think I'd probably vote for static linking over this.

@jlebon
Copy link
Member Author

jlebon commented Nov 23, 2020

So I think I'd probably vote for static linking over this.

The thing is making it truly static is not trivial because the golang compiler makes it annoyingly difficult to set up (see golang/go#26492). I tried a couple of variations from that thread, but couldn't get it working. I can keep trying, and I'm sure we could get it to work, though ISTM like it would actually be more fragile than what we're doing here.

@cgwalters
Copy link
Member

Oh hmm I thought it was easy. (I think where things are settling with this is that people doing static end up using musl, at least that's what Rust is doing)

Anyways...we can certainly try rolling with this then,
/approve

@ashcrow
Copy link
Member

ashcrow commented Nov 24, 2020

    --- FAIL: fcos.upgrade.basic/upgrade-from-previous (122.08s)
            basic.go:294: failed waiting for machine reboot: timed out after 2m0s waiting for machine to reboot

@jlebon
Copy link
Member Author

jlebon commented Nov 24, 2020

continuous-integration/jenkins/pr-merge — This commit looks good

🎉

@jlebon
Copy link
Member Author

jlebon commented Nov 30, 2020

Let's get this in? This will unblock bumping cosa to f33.

@jlebon jlebon mentioned this pull request Nov 30, 2020
@darkmuggle
Copy link
Contributor

I'm late to the party, but I'm a bigger fan of statically linking. Here's the static variant:

ben@porpentina:~/src/cosa/mantle/cmd/kolet $ go build . 
ben@porpentina:~/src/cosa/mantle/cmd/kolet $ ls
kolet  kolet.go

ben@porpentina:~/src/cosa/mantle/cmd/kolet $ file kolet
kolet: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, Go BuildID=2Eg2c3KY8EV1xLSEz3G1/hJLSC7jjY5lMJimzBSA0/vKXm6I26mEf9Aopi-aJD/FyM1mQ3dXcAs72_y-X7Q, not stripped
ben@porpentina:~/src/cosa/mantle/cmd/kolet $ go build -tags  osusergo,netgo -ldflags="-extldflags=-static" -o kolet.static .
 
ben@porpentina:~/src/cosa/mantle/cmd/kolet $ file kolet.static 
kolet.static: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=Aw9fWopdy_xfL7ob-wTe/QfcWze1t9GWFxyMH-C7v/vKXm6I26mEf9Aopi-aJD/lgKQjS3I7YFx8Nh14oOW, not stripped

@jlebon
Copy link
Member Author

jlebon commented Nov 30, 2020

I'm late to the party, but I'm a bigger fan of statically linking. Here's the static variant:

My understanding is that doing this will not statically link glibc, but will make the executable use pure Go implementations of the bits that it normally needs from it (such as networking and passwd lookups). Which... is probably fine for what kolet actually does, but I can imagine subtle bugs in the future which hinge on that discrepancy.

What I was trying before was to have it link with libc.a from glibc-static, but couldn't get that to work (didn't spend too long on it either though :) ).

I'm still concerned about how fragile this is based on discussions in golang/go#26492. The insidious issue I have with it is that even if you pass -static, it might still choose to give you a dynamically-linked binary instead of erroring out.

But anyway, I recognize static linking is cleaner from an implementation POV to what this PR is doing. So I'm fine giving it a shot; will update.

Right now, we're building kolet in the cosa buildroot and then copying
it into the target system. This worked for a while by pure chance,
because it's really easy to hit a library mismatch between the cosa and
target environments. And now we do hit this case when using f33 to build
RHCOS due to symbol versioning in glibc.

The proper way to fix this would be for cosa to build kolet in a
buildroot that matches the target OS. This is resolvable, though it
would significantly affect the developer experience because devs would
now have to also download and maintain the target buildroot.

Target-specific buildroots are still something valuable to have (because
for dev and CI you do want a matching buildroot when hacking on host
components) but purely for cosa given that kolet is the only compiled
code it needs to run directly on the host, it's a very costly approach.

This patch instead addresses this simply by building kolet statically.
This has its own set of trade-offs (see discussions in
coreos#1896), though should
work fine for the time being.

(In a previous version, this patch copied the libraries into the node
and used rpath to have kolet pick them up. Another approach I considered
was running kolet in a container, though (1) we would need to implement
some image caching to make it less expensive and (2) it would mean
having our test harness rely on the container runtime itself which seems
circular.)

And of course, we can eventually move to using buildroots in the future
once we flesh out the UX better there and the benefits outweigh the
costs.

This allows moving cosa to f33 without breaking RHCOS.

Partially addresses: coreos#1863
@jlebon jlebon changed the title kola: run kolet with cosa libs kola: build kolet statically Nov 30, 2020
@jlebon
Copy link
Member Author

jlebon commented Nov 30, 2020

OK, updated this!

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, cgwalters, jlebon, travier

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [ashcrow,cgwalters,jlebon,travier]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 1349684 into coreos:master Dec 1, 2020
@jlebon jlebon deleted the pr/kolet branch April 22, 2023 23:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants