-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STREAM-437 - Add serverless-plugin-aws-alerts as npm dependency #30
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can approve code-wise, though please note that
(a) the build is failing and
(b) we should make sure this works together with the https://github.com/sleepio/flows-cli/pull/33 changes prior to merging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though the build in dev is failing.
npm notice
--
582 | npm notice New major version of npm available! 7.0.14 -> 8.6.0
583 | npm notice Changelog: <https://github.com/npm/cli/releases/tag/v8.6.0>
584 | npm notice Run `npm install -g npm@8.6.0` to update!
585 | npm notice
586 | npm ERR! code ERR_SOCKET_TIMEOUT
587 | npm ERR! errno ERR_SOCKET_TIMEOUT
588 | npm ERR! request to https://registry.npmjs.org/file-type/-/file-type-4.4.0.tgz failed, reason: Socket timeout
589 |
590 | npm ERR! A complete log of this run can be found in:
591 | npm ERR! /root/.npm/_logs/2022-04-06T21_20_03_170Z-debug.log
592 | The command '/bin/sh -c npm install -g serverless@${SERVERLESS_VERSION} --ignore-scripts spawn-sync' returned a non-zero code: 1
I do notice that all the builds in dev have been failing with the same error. Interestingly, building the images locally with the Is this a kind of timeout error that only happens in the dev environment? It's still a concern given that these appear repeatable and deterministic. I'll try to see if I can get this fixed, but let me know if you have seen this issue before and, if so, how it was resolved then. |
I haven't seen this specific issue before, and when I do a |
@kerr-bighealth The failure happens on the I can't download the artifacts to look at the log you're mentioning, the artifacts link doesn't show up any file? |
Do you know how to |
I would try doing a build locally to reproduce |
This is not occurring locally at all? Trying another env would give us maybe a bit more signal on "it's a general issue in CodeBuild", vs. "it's only in dev". @taro-bh I would start with trying to do a
Typically in these scenarios I will find myself doing single line edits in the scripts for a job in order to debug > commit > push. It's just kind of how these remote based platforms work. |
@kerr-bighealth @nmguse-bighealth As far as locally building the images, it's consistently successful for me and I see no network timeout issues (I do see all the same deprecation warnings, but not timeout/network connectivity issues). That's why I think I have to get on to the build container for investigating build environment specific issues.
In order to
In stage, I cannot initiate the build if I check the "Enable session connection" checkbox. EDIT: Potentially related ticket: npm/cli#3078 |
@taro-bh Add the |
UPDATE: It appears we are bitten by an ongoing issue with the
The infinite retries may be a very bad taste, but the install usually succeeds on the second retry, so this may be a good enough work around until the |
@taro-bh So I suppose the new setting you added Are we sure we want infinite retries instead of something which can fail faster in the case where it is not a Socket timeout type of issue? We would run the risk of obscuring other errors, flooding logs and having long-running builds for other types of issues. Perhaps set a fixed number of retries, e.g. 5? |
Apparently it's happening in a different layer than what we could control with the I've removed infinite loop and instead provided max retry as a configurable parameter. |
Dockerfile
Outdated
|
||
COPY . /var | ||
|
||
RUN cd /var && npm install | ||
# See the note on `npm install` above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to write all this code twice, could we instead extract it into a Bash script? Since a multi-line Docker RUN command of this level of complexity is rather tricky to manage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to spent that much energy to provide what is at best a patch solution to an issue that actually needs to be solved upstream? As have been noted, this is not at all a permanent solution, and as soon as the npm
package maintainers figure out how to properly fix the issue, we should simply remove the retry loop.
I'm not sure if keeping alive this blocker just for aesthetic reasons is necessary. We are already having to push all the blocking tickets to the next Sprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- the issue PR you linked seems closed
- duplication of logic & maintainability aren't really aesthetic reasons?
However, I'm not the owner of this repo so I'm just giving my advice here. As I'm also the only one weighing in on this, and the code review approvals are already in place, you can proceed with these changes if you're happy and they're working
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mentioned earlier that the npm
ticket is marked closed but the issue really hasn't been solved. Hence we are having to implement a workaround on our side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm just wondering how "temporary" this workaround really is given a closed Github issue.
Should we also reference npm/cli#4028 perhaps? To ensure the next person who comes along this code can find the information they need easily in order to potentially remove the workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think npm/cli#4028 is a related issue at all? Connection hanging is not the issue we have.
I'll update the inline note to provide more detail as to the anticipated life time of the retry flow and why and when to remove it. Meanwhile I'll use this branch to do integration tests in non-production environments for STREAM-437.
The integration tests (performed in dev) were performed and the image with the new package installed is working as intended. |
This PR adds
serverless-plugin-aws-alerts
plugin to the npm dependency, modifying the existing images.The need for the new dependency arose due to https://github.com/sleepio/flows-cli/pull/33 in the context of https://bighealth.atlassian.net/browse/STREAM-437, where we are adding baseline configuration for CloudWatch metrics alerts.
This implementation may be preferred over #29, if adding the dependency causes no issues in production.