fix(api-client): default consortium provider timeout test flake #656
Labels: bug
Comments
FYI @kikoncuo @jagpreetsinghsasan @AzaharaC @jordigiam I believe this will be fixed once the other PRs that resolve the "disk full" issue get merged. For now, my working theory is that this flake was also caused by that issue, so hopefully all the failing CI checks will turn green soon.
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 23, 2021

Fixes hyperledger-cacti#656
Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 24, 2021

…ledger-cacti#656

Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

This change makes it so that the pullImage(...) method of the Containers utility class will now - by default - retry 6 times if pulling the docker image has failed. The interval between retries increases exponentially (powers of two), starting from a one-second delay and growing to 2^6 seconds for the final retry (which, if it also fails, results in an AbortError being thrown by the underlying pRetry library that powers the retry mechanism).

For reference, here is a randomly failed CI test execution where the logs show that DockerHub is randomly inaccessible over the network; that is another thing that makes our tests flaky, hence this commit.
https://github.com/hyperledger/cactus/runs/2178802580?check_suite_focus=true#step:8:2448

In case that link goes dead in the future, here are the actual logs:

not ok 60 - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts # time=25389.665ms
  ---
  env:
    TS_NODE_COMPILER_OPTIONS: '{"jsx":"react"}'
  file: packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  timeout: 1800000
  command: /opt/hostedtoolcache/node/12.13.0/x64/bin/node
  args:
    - -r
    - /home/runner/work/cactus/cactus/node_modules/ts-node/register/index.js
    - --max-old-space-size=4096
    - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  stdio:
    - 0
    - pipe
    - 2
  cwd: /home/runner/work/cactus/cactus
  exitCode: 1
  ...
{
    # NodeJS API server + Rust plugin work together
    [2021-03-23T20:45:51.458Z] INFO (VaultTestServer): Created VaultTestServer OK. Image FQN: vault:1.6.1
    not ok 1 Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
      ---
        operator: error
        at: bound (/home/runner/work/cactus/cactus/node_modules/onetime/index.js:30:12)
        stack: |-
          Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
              at /home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:301:17
              at IncomingMessage.<anonymous> (/home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:328:9)
              at IncomingMessage.emit (events.js:215:7)
              at endReadableNT (_stream_readable.js:1183:12)
              at processTicksAndRejections (internal/process/task_queues.js:80:21)
      ...
    Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
}
Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
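For illustration only, here is a minimal sketch (not the actual Cactus Containers.pullImage code) of what such a retry wrapper could look like, assuming the p-retry library and a hypothetical single-attempt pull function passed in by the caller:

```typescript
import pRetry from "p-retry";

// Illustrative sketch: wrap a single-attempt image pull with p-retry so that
// transient DockerHub/network outages do not immediately fail the test run.
export async function pullImageWithRetry(
  imageFqn: string,
  pullOnce: (fqn: string) => Promise<void>, // hypothetical single-attempt pull
): Promise<void> {
  return pRetry(() => pullOnce(imageFqn), {
    retries: 6, // up to 6 retries after the first failed attempt
    factor: 2, // exponential back-off: each delay roughly doubles
    minTimeout: 1000, // the first delay is about one second
    onFailedAttempt: (error) => {
      console.warn(
        `Pull of ${imageFqn} failed (attempt ${error.attemptNumber}, ` +
          `${error.retriesLeft} retries left): ${error.message}`,
      );
    },
  });
}
```

With these (assumed) settings the final delay tops out around a minute, which is in the spirit of the 2^6-second figure quoted above while keeping the total added wait bounded.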
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 24, 2021

Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

An attempt to fix the mysterious error in the CI that can be seen at the bottom, based on the advice of a fellow internet user: https://stackoverflow.com/a/61789467

No idea whether this will fix the particular error we are trying to fix, but we have to try. The underlying issue seems to be a bug in npm itself, but knowing that does not remove the need for a workaround, so here we go...

Error logs and link:
----------------------------
Link: https://github.com/hyperledger/cactus/runs/2179881505?check_suite_focus=true#step:5:8

Logs:
Run npm ci
  npm ci
  shell: /usr/bin/bash -e {0}
  env:
    JAVA_HOME_8.0.275_x64: /opt/hostedtoolcache/jdk/8.0.275/x64
    JAVA_HOME: /opt/hostedtoolcache/jdk/8.0.275/x64
    JAVA_HOME_8_0_275_X64: /opt/hostedtoolcache/jdk/8.0.275/x64
npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error at:
npm ERR!     <https://npm.community>

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 24, 2021

Potentially fixing hyperledger-cacti#656. Definitely improves the situation, but it is impossible to tell in advance whether this will make all the otherwise non-reproducible issues go away. Fingers crossed.

An attempt to fix the mysterious issue with npm ci. Based on a true story: https://stackoverflow.com/a/15483897

CI failure logs: https://github.com/hyperledger/cactus/runs/2179881505?check_suite_focus=true#step:5:8

Logs
------
npm ci
  shell: /usr/bin/bash -e {0}
  env:
    JAVA_HOME_8.0.275_x64: /opt/hostedtoolcache/jdk/8.0.275/x64
    JAVA_HOME: /opt/hostedtoolcache/jdk/8.0.275/x64
    JAVA_HOME_8_0_275_X64: /opt/hostedtoolcache/jdk/8.0.275/x64
npm ERR! cb() never called!
npm ERR! This is an error with npm itself. Please report this error at:
npm ERR!     <https://npm.community>

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 24, 2021

…ger-cacti#656

This is yet another attempt at potentially fixing all the remaining CI flakes that only happen on the GitHub Actions runners but never on developer machines.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
This was referenced Mar 24, 2021
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 24, 2021

Bumping up the test timeouts to a full hour because under heavy load the GHA runner seems to be extremely slow, meaning that the fabric tests can take longer than half an hour each, despite the fact that these usually take about 5 minutes or less even on the slow GHA runners.

Fixes hyperledger-cacti#656
Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
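To illustrate the kind of change this describes (a sketch under assumptions, not the literal diff), a tape-based integration test can widen its own timeout via tape's t.timeoutAfter(ms); the one-hour constant below simply mirrors the figure mentioned in the commit message:

```typescript
import test from "tape";

// One hour in milliseconds, matching the "full hour" budget mentioned above.
const ONE_HOUR_IN_MS = 60 * 60 * 1000;

test("fabric connector integration test (illustrative)", (t) => {
  // Only fail the test if it has not ended within an hour, so that a heavily
  // loaded GHA runner does not turn a slow-but-passing run into a flake.
  t.timeoutAfter(ONE_HOUR_IN_MS);

  // ... the real test body (container start-up, transactions, etc.) goes here ...
  t.ok(true, "placeholder assertion so the sketch is runnable");
  t.end();
});
```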
petermetz added a commit that referenced this issue on Mar 25, 2021

Bumping up the test timeouts to a full hour because under heavy load the GHA runner seems to be extremely slow, meaning that the fabric tests can take longer than half an hour each, despite the fact that these usually take about 5 minutes or less even on the slow GHA runners.

Fixes #656
Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit that referenced this issue on Mar 25, 2021: Potentially fixing #656 by retrying failed docker image pulls with exponential back-off.

petermetz added a commit that referenced this issue on Mar 25, 2021: Potentially fixing #656 with a workaround for the "npm ERR! cb() never called!" failure in npm ci (https://stackoverflow.com/a/61789467).

petermetz added a commit that referenced this issue on Mar 25, 2021: Potentially fixing #656 with another workaround attempt for the npm ci failure (https://stackoverflow.com/a/15483897).
petermetz added a commit that referenced this issue on Mar 25, 2021

This is yet another attempt at potentially fixing all the remaining CI flakes that only happen on the GitHub Actions runners but never on developer machines.

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz added a commit to petermetz/cacti that referenced this issue on Mar 25, 2021: Bumping up the test timeouts to a full hour (Fixes hyperledger-cacti#656).

petermetz added a commit to petermetz/cacti that referenced this issue on Mar 26, 2021: Bumping up the test timeouts to a full hour (Fixes hyperledger-cacti#656).

jordigiam pushed a commit to kikoncuo/cactus that referenced this issue on Apr 8, 2021: Bumping up the test timeouts to a full hour (Fixes hyperledger-cacti#656).

jordigiam pushed a commit to kikoncuo/cactus that referenced this issue on Apr 8, 2021: Potentially fixing hyperledger-cacti#656 by retrying failed docker image pulls with exponential back-off.

jordigiam pushed a commit to kikoncuo/cactus that referenced this issue on Apr 8, 2021: Potentially fixing hyperledger-cacti#656 with a workaround for the "npm ERR! cb() never called!" failure in npm ci.

jordigiam pushed a commit to kikoncuo/cactus that referenced this issue on Apr 8, 2021: Potentially fixing hyperledger-cacti#656 with another workaround attempt for the npm ci failure.

jordigiam pushed a commit to kikoncuo/cactus that referenced this issue on Apr 8, 2021: This is yet another attempt at potentially fixing all the remaining CI flakes that only happen on the GitHub Actions runners but never on developer machines.
Describe the bug
The mentioned test case is flaky because we depend on an external service to simulate certain HTTP status codes for different failure modes.
We need to migrate over to something like this to simulate HTTP failure scenarios in-house, without having to go beyond the local network interface, so that the CI does not depend on external services that can experience downtime (well, we still get this with DockerHub sometimes, but it is what it is...). A sketch of the idea follows.
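For illustration only (an assumed design, not an agreed-upon one), a tiny fault-injection server bound to the loopback interface could stand in for the external service; the route shapes /status/<code> and /delay/<ms> below are made up for the example and only Node's built-in http module is used:

```typescript
import * as http from "http";
import { AddressInfo } from "net";

// Minimal fault-injection server: GET /status/503 responds with HTTP 503,
// GET /delay/5000 waits five seconds before answering, anything else is 200.
// It runs entirely on 127.0.0.1, so tests never leave the machine.
export async function startFaultInjectionServer(): Promise<http.Server> {
  const server = http.createServer((req, res) => {
    const [, mode, value] = (req.url || "/").split("/");
    if (mode === "status") {
      res.statusCode = Number(value) || 500;
      res.end(`simulated ${res.statusCode}`);
    } else if (mode === "delay") {
      setTimeout(() => res.end("slow response"), Number(value) || 1000);
    } else {
      res.statusCode = 200;
      res.end("ok");
    }
  });
  await new Promise<void>((resolve) => server.listen(0, "127.0.0.1", resolve));
  const { port } = server.address() as AddressInfo;
  console.log(`Fault injection server listening on http://127.0.0.1:${port}`);
  return server;
}
```

A test could then point the HTTP client under test at, say, http://127.0.0.1:<port>/status/500 and assert on the error and timeout handling without touching any external service.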
To Reproduce
It's a flaky test so it will happen randomly if you try hard enough and for long enough.
Expected behavior
No flaky tests, ever.
Logs/Stack traces
N/A
Screenshots
N/A
Cloud provider or hardware configuration:
Always happens on the GitHub Actions CI runner.
Operating system name, version, build:
Ubuntu LTS
Hyperledger Cactus release version or commit (git rev-parse --short HEAD):
main
Hyperledger Cactus Plugins/Connectors Used
N/A
Additional context
This has been worked on before, but it seems the earlier fix failed to deliver on its promises, so we have to get back to it again.
cc: @takeutak @sfuji822 @hartm @jonathan-m-hamilton @AzaharaC @jordigiam @kikoncuo