my template seems slower with the Terraform provider, difficult to troubleshoot #39

Closed
bpmct opened this issue Aug 10, 2024 · 12 comments
Labels: project/envbuilder, question (Further information is requested)

@bpmct
Member

bpmct commented Aug 10, 2024

Hey folks, let me know if I'm doing something wrong here. I just started using the provider and have to say, the UX is awesome.

However, I added the envbuilder provider + cached image support to this template, and workspaces are taking ~1 additional minute to start. The envbuilder_cached_image build step is relatively snappy (26s), but I notice that the workspace agent doesn't connect for around 80 seconds. During that time, there are no logs in the dashboard.

I'm not able to see the workspace logs in grafana.dev.coder.com or inspect the registry to see if it is in fact being pushed versus built. Any ideas?

With cached image

https://www.loom.com/share/2e9e3b1d90104ef8b1e6545422220eae

Without cached image

https://www.loom.com/share/f8419923b75b42dcbde8cadbb1772ef1

Interestingly, there are no envbuilder logs in the Dashboard for either run.

coder-labeler bot added the project/envbuilder and question (Further information is requested) labels on Aug 10, 2024
@bpmct
Member Author

bpmct commented Aug 10, 2024

[Screenshot: 2024-08-10 at 2:52:43 PM]

I added metadata, and it looks like it's using the envbuilder image every time, not the cached image. However, I do not see any build logs.

@bpmct
Member Author

bpmct commented Aug 10, 2024

Switched to ghcr.io/coder/envbuilder-preview:1.0.0-rc.5-dev-3f054f6, but still not seeing logs :(

@bpmct
Member Author

bpmct commented Aug 10, 2024

@johnstcn fyi for Monday

johnstcn self-assigned this on Aug 12, 2024
@johnstcn
Member

johnstcn commented Aug 12, 2024

I was able to reproduce this. It looks like there's a race condition between the workspace build completing and Envbuilder sending logs to Coder:

# Docker container created at 08:54:42
        "Created": "2024-08-12T08:54:42.276899979Z",

# Attempt to init LogSender failed at 08:54:53 (approx. 10 seconds after start)
2024-08-12T08:54:53.371122643Z unable to send logs to Coder: init coder rpc client: GET https://dev.coder.com/api/v2/workspaceagents/me/rpc?version=2.0: unexpected status code 401: Workspace agent not authorized.: Try logging in using 'coder login'.
2024-08-12T08:54:53.371195329Z  Error: The agent cannot authenticate until the workspace provision job has been completed. If the job is no longer running, this agent is invalid.

# ProvisionerJob completed at 08:54:55
{"ts":"2024-08-12T08:54:55.844589307Z","level":"DEBUG","msg":"sent CompletedJob","caller":"/home/runner/work/coder/coder/provisionerd/runner/runner.go:229","func":"github.com/coder/coder/v2/provisionerd/runner.(*Runner).Run","logger_names":["runner"],"fields":{"job_id":"a560fbec-682f-4743-9e0c-d31dd4952613","template_name":"data-takehome","template_version":"eloquent_satoshi1","workspace_build_id":"a56516a0-b783-469e-8576-45e52b8543db","workspace_id":"6610e021-ddb6-4eed-9398-df044e6bd709","workspace_name":"blue-shark-17","workspace_owner":"cian","workspace_transition":"start"}}

It should retry, but this appears not to be working correctly. Will push a fix.
EDIT: retries do work correctly, but the retry ceiling was not high enough. Bumped to 30s in coder/envbuilder#313.
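
To illustrate why the ceiling matters, here is a minimal sketch of a capped-backoff retry loop using the timings from the logs above. It is not the code from coder/envbuilder#313: `retry_until`, its parameters, the 0.5 s starting delay, the 5 s cap between attempts, the 10 s "old" budget, and the reading of "ceiling" as a total retry budget are all assumptions made for the example. Time is simulated so the sketch runs instantly. The provisioner job completed roughly 13.6 s after the container was created (08:54:42.28 to 08:54:55.84), so that is when the agent could first authenticate.

```python
from typing import Callable, List, Tuple


def retry_until(
    attempt: Callable[[float], bool],
    ceiling_s: float,
    first_delay_s: float = 0.5,   # assumed starting delay
    max_delay_s: float = 5.0,     # assumed cap on the wait between attempts
) -> Tuple[bool, List[float]]:
    """Call attempt(elapsed) until it succeeds or the next wait would exceed ceiling_s.

    Returns (succeeded, list of simulated times at which attempts were made).
    No real sleeping happens; `elapsed` is advanced by hand.
    """
    elapsed = 0.0
    delay = first_delay_s
    tried_at: List[float] = []
    while True:
        tried_at.append(elapsed)
        if attempt(elapsed):
            return True, tried_at
        if elapsed + delay > ceiling_s:
            # Budget exhausted: give up, as the log sender did in the trace above.
            return False, tried_at
        elapsed += delay
        delay = min(delay * 2, max_delay_s)  # exponential backoff with a cap


# Hypothetical stand-in for "agent can authenticate": true once the
# provisioner job has completed, about 13.6 s after container creation.
def agent_authorized(elapsed: float) -> bool:
    return elapsed >= 13.6


print(retry_until(agent_authorized, ceiling_s=10))
print(retry_until(agent_authorized, ceiling_s=30))
```

With a 10 s budget the loop gives up after its attempt at 7.5 s, before the job completes. With a 30 s budget it succeeds on the attempt at 17.5 s.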

@johnstcn
Member

@bpmct I created a new template version using the updated Envbuilder image and this has worked around the issue for now.

https://dev.coder.com/templates/coder/data-takehome/versions/lucid_pascal7

I haven't promoted this to be the default version yet.

@bpmct
Member Author

bpmct commented Aug 12, 2024

Thanks Cian, trying it now.

@bpmct
Member Author

bpmct commented Aug 13, 2024

Gave it a shot and logs work! I'm noticing it still rebuilds the image each time, even though it reports "pushing the image" at the end:

#2: 🏗️ Built image! [1m20.859283746s]
#3: 🏗️ Pushing image...
#3: Pushing image to us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood
#3: Pushed us-central1-docker.pkg.dev/coder-dogfood-v2/envbuilder-cache/coder-dogfood@sha256:62f16c541a7e71b110bf914e4c0a205e5142e52d36e181f5d8a199aeadf032ae
#3: 🏗️ Pushed image! [2.088515236s]

Any idea why it's not finding the registry image?
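
One way to confirm that the push actually landed is to ask the registry for the manifest directly. The sketch below assumes the registry implements the OCI Distribution API, where `HEAD /v2/<name>/manifests/<reference>` answers 200 when the manifest exists and 404 when it does not. `manifest_url` and `manifest_exists` are hypothetical helper names, and the bearer token is an assumption: how you obtain one depends on the registry.

```python
import urllib.error
import urllib.request

# Manifest media types to accept (OCI and Docker v2 schema 2).
MEDIA_TYPES = [
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
]


def manifest_url(image_ref: str) -> str:
    """Build the manifest URL for a reference like host/repo[:tag][@digest]."""
    host, _, repo = image_ref.partition("/")
    if not repo:
        raise ValueError("expected <registry-host>/<repository>[:tag][@digest]")
    reference = "latest"
    digest = ""
    if "@" in repo:
        repo, _, digest = repo.partition("@")
    # A tag, if present, follows a colon in the last path segment.
    if ":" in repo.rsplit("/", 1)[-1]:
        repo, _, reference = repo.rpartition(":")
    if digest:
        reference = digest  # a digest takes precedence over a tag
    return f"https://{host}/v2/{repo}/manifests/{reference}"


def manifest_exists(url: str, token: str) -> bool:
    """HEAD the manifest URL; True on 200, False on 404, raise on anything else."""
    req = urllib.request.Request(
        url,
        method="HEAD",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": ", ".join(MEDIA_TYPES),
        },
    )
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Passing the pushed reference from the log above to `manifest_url` gives the URL to probe. A 200 from `manifest_exists` would mean the image is in the registry and the problem lies in the lookup, not the push.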

@johnstcn
Member

johnstcn commented Aug 13, 2024

I added some diagnostics in the provider (v0.0.3) and I can now see that it's due to a bug relating to how we compile the Dockerfile from a devcontainer.json:

Error: compile devcontainer.json: open dockerfile: mkdir /.envbuilder: read-only file system

This will need to be fixed in envbuilder, but it doesn't affect envbuilder itself, because that runs as root inside a container.
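
For illustration only, here is a sketch of the failure mode and one generic way around it: try the preferred working directory and fall back to a temporary one when it cannot be created or written. This is not the fix that went into envbuilder or the provider. `writable_workdir` is a hypothetical helper, and the `envbuilder-` prefix is an arbitrary choice.

```python
import os
import tempfile


def writable_workdir(preferred: str) -> str:
    """Return `preferred` if it can be created and written to, else a fresh temp dir."""
    try:
        os.makedirs(preferred, exist_ok=True)
        # Probe for write access by creating (and discarding) a scratch file.
        with tempfile.TemporaryFile(dir=preferred):
            pass
        return preferred
    except OSError:
        # Covers EROFS ("read-only file system"), EACCES, ENOTDIR, and similar.
        return tempfile.mkdtemp(prefix="envbuilder-")


if __name__ == "__main__":
    # As root in a container with a writable root filesystem this typically
    # prints /.envbuilder; elsewhere it falls back to a path under the temp dir.
    print(writable_workdir("/.envbuilder"))
```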

@johnstcn
Member

One other thing I should point out here is that devcontainer features are currently not cached.
I believe coder/envbuilder#210 is one of the issues relating to this.

@bpmct
Member Author

bpmct commented Aug 16, 2024

So if I have devcontainer features, it'll do a fresh build every time?

@johnstcn
Member

> So if I have devcontainer features, it'll do a fresh build every time?

Currently yes, although not 100% fresh as it will still have some cached layers from the repo cache to draw upon.

@johnstcn
Member

Closing this out for now.
