Code running in Cloud Build gets stuck trying to fetch metadata in v5.1.0 #548
Comments
@theneva thank you for the detailed write-up. I've added this to the backlog and we will dig into it soon.
Status update: still investigating this issue. Today, many customers benefit from this behavior for workloads where it may take time for the metadata server to spin up; however, I can see how this can be an issue for certain Cloud Build configurations. I'll post another update on this issue within the coming days.
Hey @theneva, could you send over reproduction steps and/or a bit more information about your setup? For reference, here's what I tried:

```Dockerfile
FROM node:16.19.0-alpine3.17

WORKDIR /usr/src/app

RUN npm init -y
RUN npm i @google-cloud/storage
RUN npm ls gcp-metadata

RUN node -p "const {Storage} = require('@google-cloud/storage'); new Storage().getBuckets().then(console.log, console.error)"

CMD [ "node", "-p", "true" ]
```

Submitted via:

```sh
gcloud builds submit --tag gcr.io/`gcloud config get-value project`/test
```
Hi @danielbankhead!

Edit: I added a step-by-step guide to run the reproduction at the bottom.

Your example works for me too. It seems like when you just submit a Docker image to build, Cloud Build injects some credentials (presumably the build's service account). I think this prevents gcp-metadata from reaching the code path that triggers my issue. The issue still occurs for me if I run a Docker container inside a Cloud Build step without giving my container any kind of credentials.

I put together a tiny reproduction at https://github.com/theneva/gcp-metadata-repro. It builds a Docker image with code that closely resembles your example, and then attempts to run that image without any credentials. I expect the image to log that it couldn't list the buckets because it has no credentials (from the onrejected function), and then exit cleanly (with code 0), letting the build succeed. The repro doesn't push any images, so you can test it with

At the initial commit (https://github.com/theneva/gcp-metadata-repro/tree/48856f35e7f7bd9e7ec74642c22b3adfb8755761) the build consistently times out at step 1 (where I run the container). At the latest commit (https://github.com/theneva/gcp-metadata-repro/tree/762d3b23bdcc5a409990fca467b8d31cd0b0f251) I've pinned gcp-metadata to version 5.0.1 in the Dockerfile. Here, the build consistently completes after ~49 seconds, with the expected result.

So, to repro:
Please let me know if you need anything else from me to look into this!
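For illustration, a minimal sketch of the behaviour the repro expects; the exact code in the linked repository may differ, so treat the error handling here as an assumption based on the earlier Dockerfile example:

```js
// Attempt to list buckets with whatever credentials are available.
// With no credentials in the container, the rejection handler should run
// promptly (instead of hanging on metadata-server detection) and the
// process should still exit with code 0, letting the build succeed.
const {Storage} = require('@google-cloud/storage');

new Storage()
  .getBuckets()
  .then(([buckets]) => console.log(buckets.map((bucket) => bucket.name)))
  .catch((err) => console.error('Could not list buckets:', err.message))
  .finally(() => process.exit(0));
```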
Thanks! This repo helps a lot – adding

Which throws the

In the latest releases we removed that timeout, as it could take seconds to minutes for certain GCP workloads to connect to the metadata server. Additionally, it's detecting that it's running on Google Cloud Linux rather than matching on its MAC address – this can be verified by checking

I'll work on a few strategies to remedy this for you, hopefully without any config changes.
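Roughly speaking, the Linux check looks at the machine's DMI information. The specific paths and the 'Google' substring check below are a simplified debugging sketch rather than the library's verbatim implementation, but running something like this inside the Cloud Build container shows what the host reports:

```js
// Read a couple of DMI identifiers and report whether they mention Google.
// Assumed paths and matching logic, for debugging only.
const {readFileSync} = require('fs');

function dmiMentionsGoogle(path) {
  try {
    return readFileSync(path, 'utf8').includes('Google');
  } catch {
    // File missing or unreadable: treat as "not a GCE-like host".
    return false;
  }
}

for (const path of ['/sys/class/dmi/id/bios_vendor', '/sys/class/dmi/id/product_name']) {
  console.log(path, '->', dmiMentionsGoogle(path));
}
```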
Additional note: Setting

Leaving this here in case you want to pass creds in the build; however, we shouldn't hang when no creds are available.
Aha! That makes sense. I guess there isn't too much I can do about it detecting the operating system. Setting

As for the remedy, I think either of these suggestions would solve my problem (without requiring any configuration or unexpected-to-me behaviour):

I'm sure I don't see the full picture, but those are two things I would expect the library to do instead of the current behaviour 😄 Thanks a lot for checking out this issue – I'm looking forward to the solution!
Hey @theneva, thanks for your patience; we have alignment internally on this and I'll prepare a PR shortly.

Any news here?

@SimenB we have a few items to tackle before resolving this. I'm estimating a PR should be ready and merged within 1-1.5 months.
Prepared a PR: #562

We've landed on creating a new environment variable,

While it isn't as seamless as automatically determining the GCE environment (which is very complicated to do across environments), it does present an opportunity to make it 'fail fast' via the
5.3.0 with
Hi! 👋
First things first:
Problem
I run builds on Cloud Build, which does roughly the following:
a. Start the newly built Docker image
b. Start a different image running Cypress with a bunch of UI tests
After upgrading from 5.0.1 to 5.1.0, step 3a started failing: The application launches, but gets immediately stuck when various Google client libraries try fetching data on startup.
This only happens on Cloud Build, and I believe the culprit to be the MAC address check introduced in #528 (though I have not done a thorough investigation). Running the exact same Docker image locally works fine.
I was pretty surprised when this changed, given that my code runs inside its own Docker container – but I suppose my container shares the host's network interface(s), so there's some blurring of the container boundaries there.
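To check which interfaces (and MAC addresses) are actually visible inside the container on the Cloud Build host, something like the following can help. Treating a `42:01` MAC prefix as the GCE heuristic is my assumption here, not a statement about the library's exact check:

```js
// List every network interface's MAC address and flag the ones that look
// like a GCE virtual NIC (assumed heuristic: the address starts with 42:01).
const os = require('os');

for (const [name, addresses] of Object.entries(os.networkInterfaces())) {
  for (const {mac} of addresses ?? []) {
    const tag = mac.startsWith('42:01') ? '\t(GCE-like)' : '';
    console.log(`${name}\t${mac}${tag}`);
  }
}
```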
Environment details

- europe-west1 region (the default regional pool, not a private pool) with `options.machineType: E2_HIGHCPU_32`
- Node.js v16.19.0 (The Docker image is based on `node:16.19.0-alpine3.17`)
- npm 8.19.3 (I use Yarn 3.3.0 with the node-modules nodeLinker – either way, I believe this to be irrelevant)
- `gcp-metadata` version: 5.1.0 (5.0.1 works)

Steps to reproduce
(I believe this can be reproduced by running any code that triggers metadata fetching inside Cloud Build without any configuration that extends or short-circuits the GCE residency detection, but I haven't tested it.)
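For what it's worth, a minimal trigger along those lines could call gcp-metadata directly instead of going through a client library; whether this hangs the same way as the Storage call does is an assumption I haven't verified:

```js
// Ask gcp-metadata whether the metadata server is reachable and log the result.
const gcpMetadata = require('gcp-metadata');

gcpMetadata
  .isAvailable()
  .then((available) => console.log('Metadata server available:', available))
  .catch((err) => console.error('Metadata check failed:', err.message));
```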
Let me know if you need additional details!