Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[image-builder] Backport mkIII from gitpod-com #4547

Merged
merged 15 commits into from
Jul 22, 2021
Merged

Conversation

csweichel
Copy link
Contributor

@csweichel csweichel commented Jun 18, 2021

This PR backports the image-builder mkIII from the gitpod-com repo. image-builder mkIII is the third rewrite of image-builder and works fundamentally different to is predecessors:

  • all image builds run as headless workspaces, which means that image-builds scale much nicer compared to mkII
  • no more Docker-in-Docker, but uses buildkit directly which makes for a simpler implementation and lighter dependencies
  • uses containerd's fetcher to resolve images

While its implementation is really different to previous iterations it maintains the same interface and is a drop-in replacement for image-builder mkII.

Note: because this is essentially a backport of previous work it comes in a single commit.

ToDo

  • Make sure this stuff even builds
  • Test image builds and make sure they even still work
  • Sanity check the implementation - so far I've just blindly copied over what I had written more than a year ago
  • Integrate feedback from [image-builder] Backport mkIII from gitpod-com #4251

Future Work

  • ws-manager selection is currently fixed in the configuration. This does not sit well with pluggable workspace cluster. Instead we should select the target workspace cluster by asking ws-manager-bridge (as should server). For now we'll keep running the image builds in the fat cluster.

Known defects

  • image builder mkIII uses the buildID as workspace instance ID, which means it might re-use the same instance ID if the previous build failed. ws-manager and ws-daemon don't cope too well with re-used instance ID's, e.g. the initializerMap in monitor.go

This PR supersedes #4251 and installs image-builder mk3 alongside the previous image builder to facilitate a smoother transition.

fixes #4812

@csweichel csweichel changed the title Cw/imgbuilder mk3 rebase [image-builder] Backport mkIII from gitpod-com Jun 18, 2021
@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch 4 times, most recently from 5ff6573 to 2a208f4 Compare June 20, 2021 11:14
@codecov
Copy link

codecov bot commented Jun 20, 2021

Codecov Report

Merging #4547 (e520320) into main (035fa4e) will increase coverage by 36.48%.
The diff coverage is 21.73%.

Impacted file tree graph

@@            Coverage Diff            @@
##           main    #4547       +/-   ##
=========================================
+ Coverage      0   36.48%   +36.48%     
=========================================
  Files         0       13       +13     
  Lines         0     3725     +3725     
=========================================
+ Hits          0     1359     +1359     
- Misses        0     2250     +2250     
- Partials      0      116      +116     
Flag Coverage Δ
components-ws-manager-app 36.48% <21.73%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
components/ws-manager/pkg/manager/annotations.go 66.66% <ø> (ø)
components/ws-manager/pkg/manager/config.go 32.50% <ø> (ø)
components/ws-manager/pkg/manager/manager.go 24.83% <0.00%> (ø)
components/ws-manager/pkg/manager/monitor.go 0.00% <0.00%> (ø)
components/ws-manager/pkg/manager/create.go 78.79% <57.14%> (ø)
components/ws-manager/pkg/manager/status.go 71.57% <100.00%> (ø)
...omponents/ws-manager/pkg/manager/pod_controller.go 0.00% <0.00%> (ø)
components/ws-manager/pkg/manager/probe.go 0.00% <0.00%> (ø)
components/ws-manager/pkg/manager/imagespec.go 0.00% <0.00%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 035fa4e...e520320. Read the comment docs.

@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch 7 times, most recently from 1c5efb8 to 45d0b25 Compare June 21, 2021 15:41
@csweichel
Copy link
Contributor Author

csweichel commented Jun 23, 2021

/werft run

👍 started the job as gitpod-build-cw-imgbuilder-mk3-rebase.25

@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch 2 times, most recently from 53d3bb3 to f817376 Compare June 24, 2021 12:51
@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch from f817376 to ced9939 Compare July 16, 2021 08:44
@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch from ced9939 to 0f9f9a1 Compare July 16, 2021 09:43
@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch 9 times, most recently from d7f2a63 to dc64c22 Compare July 18, 2021 12:28
@geropl
Copy link
Member

geropl commented Jul 21, 2021

I reviewed all code "around" the actually implementation and that looks good (see comments). Only issue I see is the one mentioned above, else lgtm!

@csweichel
Copy link
Contributor Author

csweichel commented Jul 21, 2021

/werft run with-retag

👍 started the job as gitpod-build-cw-imgbuilder-mk3-rebase.102

@csweichel
Copy link
Contributor Author

@geropl thank you for looking into this. I've fixed the procfd issue by downgrading buildkit to 0.8.3 again. We don't support mounting proc through fsopen yet, only through mount.

@csweichel csweichel force-pushed the cw/imgbuilder-mk3-rebase branch from 2ed33f5 to e520320 Compare July 21, 2021 20:31
@aledbf
Copy link
Member

aledbf commented Jul 21, 2021

using internal docker registry (EKS) I see this error:

#11 ERROR: failed to do request: Head https://registry.default.svc.cluster.local/v2/workspace-images/blobs/sha256:f5cfc5f3861ca68ff5460aa1004a9e2c901c752efcf98e4d8fd89578b5689e0a: dial tcp: lookup registry.default.svc.cluster.local on 8.8.8.8:53: no such host
------
 > exporting to image:
------
FATA[0086] build failed                                  error="cannot build Gitpod layer: failed to solve: failed to do request: Head https://registry.default.svc.cluster.local/v2/workspace-images/blobs/sha256:f5cfc5f3861ca68ff5460aa1004a9e2c901c752efcf98e4d8fd89578b5689e0a: dial tcp: lookup registry.default.svc.cluster.local on 8.8.8.8:53: no such host" serviceContext="{bob }"
exit
exit

@csweichel
Copy link
Contributor Author

using internal docker registry (EKS) I see this error:

#11 ERROR: failed to do request: Head https://registry.default.svc.cluster.local/v2/workspace-images/blobs/sha256:f5cfc5f3861ca68ff5460aa1004a9e2c901c752efcf98e4d8fd89578b5689e0a: dial tcp: lookup registry.default.svc.cluster.local on 8.8.8.8:53: no such host
------
 > exporting to image:
------
FATA[0086] build failed                                  error="cannot build Gitpod layer: failed to solve: failed to do request: Head https://registry.default.svc.cluster.local/v2/workspace-images/blobs/sha256:f5cfc5f3861ca68ff5460aa1004a9e2c901c752efcf98e4d8fd89578b5689e0a: dial tcp: lookup registry.default.svc.cluster.local on 8.8.8.8:53: no such host" serviceContext="{bob }"
exit
exit

We used to run the registry through the proxy (prior to the caddy migration). I reckon that's the way forward here, too.

@aledbf
Copy link
Member

aledbf commented Jul 22, 2021

/lgtm
/approve

@roboquat
Copy link
Contributor

LGTM label has been added.

Git tree hash: 4f9bc2fd8cbb030b8863fa4a732b58c65ad8a22d

@AlexTugarev
Copy link
Member

Changes in compontents/server LGTM!

/approve

@roboquat
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aledbf, AlexTugarev, csweichel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@roboquat roboquat merged commit 3bb79b0 into main Jul 22, 2021
@roboquat roboquat deleted the cw/imgbuilder-mk3-rebase branch July 22, 2021 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge image-builder mkIII
6 participants