QA testing in CI/CD #35

Merged
merged 3 commits into from
Nov 21, 2021

Conversation

tegefaulkes
Contributor

@tegefaulkes tegefaulkes commented Nov 12, 2021

Description

This PR adds a QA stage to the GitLab CI/CD pipeline for testing the packaged executables in different environments.

Tasks

  1. Create a QA stage
  2. Create a job for testing if the program runs successfully.
  3. Test if we can provide the built executable as an artefact between jobs.
  4. Test if we can run the built executable in different environments using different runners.
     - Focus only on Windows and Linux runners
     - Remove runDocker.sh and embed that in .gitlab-ci.yml; figure out how to do command substitution
     - Make sure that nix-build works locally on NixOS/nix-shell environment
  5. Release new package

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@tegefaulkes tegefaulkes added the development Standard development label Nov 12, 2021
@tegefaulkes tegefaulkes self-assigned this Nov 12, 2021
@tegefaulkes
Contributor Author

There seems to be a problem with symlinks in the artifacts, so for now I'm setting things up in a way that avoids them.

@tegefaulkes
Contributor Author

Progress! It looks like we can share the built executables using artifacts and test them on different OSes, with a separate runner for each job. So far I've gotten the Windows, Linux and application builds working. I still need access to macOS, so I'll look into that.

We share the artifacts by specifying an artifact path for the build job with

  artifacts:
    paths:
      - builds

and obtaining them in the test job with

  needs:
    - job: nix
      artifacts: true

@tegefaulkes
Contributor Author

Just a note, I had to disable the version and .json test while working on this.

@CMCDragonkai
Member

I'm looking into this now.

@CMCDragonkai CMCDragonkai self-requested a review November 19, 2021 05:57
@CMCDragonkai
Member

I'm adding back the test and version utility methods. This should work in the final executable too since we are doing similar things in js-polykey so it should be tested.

@CMCDragonkai
Member

Due to MatrixAI/Polykey#4, we don't have access to macos in this repo. So for now we will just remove these CI/CD jobs.

@CMCDragonkai
Member

@tegefaulkes you also said that nix-build was failing here. I'll be trying this too, but do you have any extra details?

@CMCDragonkai
Member

Currently:

  1. Tests pass
  2. NPM build works (npm run build), and the dist result has the right structure
  3. Docs are built
  4. Lint fixed
  5. npm run typescript-demo-lib works
  6. The built JS works too: node ./dist/bin/typescript-demo-lib

So now to test nix-build.

@CMCDragonkai
Member

And so the problem is apparent.

> leveldown@6.1.0 install /nix/store/in5v5h6n20n39vxdr47gz23vdh4k9619-node-dependencies-_at_matrixai_slash_typescript-demo-lib-1.1.2/@matrixai/typescript-demo-lib/node_modules/leveldown
> node-gyp-build

unpacking source archive /nix/store/my8pdyprpfghacmp9ax1k9fqx8zfpvx2-istanbul-reports-3.0.1.tgz
unpacking source archive /nix/store/dy50b7wq2cpirc0xr4zwlm7dnyffjrsf-jest-27.0.2.tgz
unpacking source archive /nix/store/qa1jjbvf0f2w8a8kgrpx3j66l9fbgggn-json-schema-7.0.9.tgz
unpacking source archive /nix/store/4kls952yhg6af8jpn7k98wxsqz5ip7va-json5-0.0.29.tgz
unpacking source archive /nix/store/kj0900qn9r3kxmakns57n7bylch657vx-node-14.17.33.tgz
unpacking source archive /nix/store/6y0zx83wgm9ipkzwg5rk7r09hkmg68kx-prettier-2.4.2.tgz

> utp-native@2.5.3 install /nix/store/in5v5h6n20n39vxdr47gz23vdh4k9619-node-dependencies-_at_matrixai_slash_typescript-demo-lib-1.1.2/@matrixai/typescript-demo-lib/node_modules/utp-native
> node-gyp-build

unpacking source archive /nix/store/ygywvs8hil1axvgx089zxhpq8bdr7q57-stack-utils-2.0.1.tgz
sh: /nix/store/in5v5h6n20n39vxdr47gz23vdh4k9619-node-dependencies-_at_matrixai_slash_typescript-demo-lib-1.1.2/@matrixai/typescript-demo-lib/node_modules/.bin/node-gyp-build: /usr/bin/env: bad interpreter: No such file or directory
npm ERR! code ELIFECYCLE
npm ERR! errno 126
npm ERR! utp-native@2.5.3 install: `node-gyp-build`
npm ERR! Exit status 126
npm ERR!
npm ERR! Failed at the utp-native@2.5.3 install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /build/.npm/_logs/2021-11-19T06_55_59_711Z-debug.log
unpacking source archive /nix/store/v7y2sl233qhz586grmpgq2xzmm7s8l2q-yargs-16.0.4.tgz

builder for '/nix/store/brgx0z5xnwh5x2azjf9awlsqhrfrnk35-node-dependencies-_at_matrixai_slash_typescript-demo-lib-1.1.2.drv' failed with exit code 126
cannot build derivation '/nix/store/6xka7449a7y944qf2rlm0m2mw6x7y69g-typescript-demo-lib-1.1.2.drv': 1 dependencies couldn't be built
error: build of '/nix/store/6xka7449a7y944qf2rlm0m2mw6x7y69g-typescript-demo-lib-1.1.2.drv' failed

@CMCDragonkai
Member

Nix build now works.

It appears the problem is caused by a node-gyp-build version mismatch.

The pkgs.nix on master currently has 4.2.3.

»» ~/Projects/TypeScript-Demo-Lib
 ♖ nix repl ./pkgs.nix                                    (qa-testing) pts/3 18:03:07
Welcome to Nix version 2.3.11. Type :? for help.

Loading './pkgs.nix'...
Added 14584 variables.

nix-repl> pkgs.nodePackages.node-gyp-build.version
"4.2.3"

While the version required by the dependencies is:

[nix-shell:~/Projects/TypeScript-Demo-Lib]$ npm ls node-gyp-build
@matrixai/typescript-demo-lib@1.1.2 /home/cmcdragonkai/Projects/TypeScript-Demo-Lib
├─┬ fd-lock@1.2.0
│ └── node-gyp-build@4.3.0 
├─┬ level@7.0.1
│ └─┬ leveldown@6.1.0
│   └── node-gyp-build@4.3.0  deduped
└─┬ utp-native@2.5.3
  └── node-gyp-build@4.3.0  deduped

So the fix is to update our pkgs.nix to the latest hash in master, 8df865561fbc53922f1e801c3deeb53c12ce8c4f.

This ends up bringing in 4.3.0 of node-gyp-build.

Furthermore, we have to rm -r node_modules and rm package-lock.json, then exit the shell and re-enter it, which brings it all back in.

@CMCDragonkai
Member

CMCDragonkai commented Nov 19, 2021

Ok it appears it's because utp-native doesn't build with node-gyp-build when the version is 4.3.0.

So one solution is to pin the dev dependency of node-gyp-build to 4.2.3.

    "node-gyp-build": "4.2.3",

This goes in devDependencies. It still allows leveldown to use 4.3.0, but it's a bit unclean.

It may not be safe to downgrade level right now. So it's a matter of making sure we use node-gyp-build at 4.2.3 for utp-native.

@CMCDragonkai
Member

The long-term issue is that node-gyp-build and related native dependencies are flaky, so all of their dependencies should be pinned to a specific version.

The dev dependencies are now:

    "node-gyp-build": "4.2.3",
    "pkg": "5.3.0",

And this allows us to be careful with what is working and what is not.
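A minimal sketch of how such pinning could be checked in CI. The is_pinned helper is hypothetical and not part of this repository, and the demo package.json deliberately uses a caret range for pkg to show a failing case (the PR itself pins pkg to 5.3.0):

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the named dependency is declared
# with an exact x.y.z version in the given package.json, i.e. without ^, ~,
# ranges, or wildcards.
is_pinned() {
  file="$1"
  name="$2"
  grep -Eq "\"$name\"[[:space:]]*:[[:space:]]*\"[0-9]+\.[0-9]+\.[0-9]+\"" "$file"
}

# Demo on a throwaway package.json so the sketch runs anywhere.
demo_dir="$(mktemp -d)"
cat > "$demo_dir/package.json" <<'EOF'
{
  "devDependencies": {
    "node-gyp-build": "4.2.3",
    "pkg": "^5.3.0"
  }
}
EOF

for dep in node-gyp-build pkg; do
  if is_pinned "$demo_dir/package.json" "$dep"; then
    echo "$dep: pinned"
  else
    echo "$dep: not pinned"
  fi
done
```

This only inspects the declared version string. It does not verify what npm actually resolved, which npm ls node-gyp-build would show.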

Anyway, I've reverted the change to pkgs.nix. It's back to what it was before, and node-gyp-build is 4.2.3.

@CMCDragonkai
Member

CMCDragonkai commented Nov 19, 2021

Also note that because we now require package.json we added:

    # copy the package.json
    cp ${utils.node2nixDev}/lib/node_modules/${utils.node2nixDev.packageName}/package.json $out/lib/node_modules/${utils.node2nixDev.packageName}/

Into default.nix which enables the application target to work.

This ensures that package.json is there with the dist:

[nix-shell:~/Projects/TypeScript-Demo-Lib/result/lib/node_modules/@matrixai/typescript-demo-lib]$ ls
dist/  node_modules/  package.json
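A small sketch of how this layout could be asserted in a job script. The has_package_json helper is hypothetical, and the demo uses a temporary directory that mimics the listing above rather than a real build result:

```shell
#!/bin/sh
# Hypothetical check: the package directory inside a build result should
# contain package.json next to dist/.
has_package_json() {
  pkg_dir="$1"
  [ -f "$pkg_dir/package.json" ] && [ -d "$pkg_dir/dist" ]
}

# Demo against a temporary directory that mimics the result layout.
layout="$(mktemp -d)"
mkdir -p "$layout/dist" "$layout/node_modules"

if has_package_json "$layout"; then
  echo "before copy: ok"
else
  echo "before copy: package.json missing"
fi

# Stand-in for the cp step added to default.nix.
touch "$layout/package.json"

if has_package_json "$layout"; then
  echo "after copy: ok"
else
  echo "after copy: package.json missing"
fi
```

Against a real build, the directory to check would be ./result/lib/node_modules/@matrixai/typescript-demo-lib.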

This has to be done for js-polykey as well.

@CMCDragonkai
Member

Note that needs is used to create "DAG"-like jobs. These are different from "stage"-based jobs, in that they don't need to be grouped together into stages.

Because the nix job is currently in the build stage, the next stage, quality, will only start if it succeeds.

By default, all artifacts from previous stages are passed to each job, but we can select specific ones with dependencies.

Details are all here: https://docs.gitlab.com/ee/ci/yaml/#dependencies

Anyway, I believe we are currently using stages instead of DAG (https://docs.gitlab.com/ee/ci/directed_acyclic_graph/), so we can remove the needs specifier.

Plus the artifacts property under needs doesn't seem to be documented in the CI/CD docs, so I'll be removing it and using dependencies and artifacts instead.

@CMCDragonkai
Member

Recommend in the future we do something like this:

git remote add cicd git@gitlab.com:MatrixAI/open-source/typescript-demo-lib.git

This makes it easier to always push up to the cicd instead of waiting for the mirroring.

@CMCDragonkai
Member

The runDocker.sh script is not necessary because variables can be passed like this:

  script:
    - image="$(docker load --input ./builds/*docker* | cut -d' ' -f3)"
    - docker run "$image"

Note that awk isn't always available. It's best to use cut, which is part of coreutils, rather than awk or perl, as it's more portable.
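To illustrate what the cut call extracts, here is a sketch that runs the same pipeline on a sample line in the shape docker load prints for a tagged image. The image name and tag are made up for illustration, and no Docker daemon is needed to run it:

```shell
#!/bin/sh
# Sample line in the shape that `docker load` prints for a tagged image.
# The image name and tag here are assumptions for the demo.
sample_output='Loaded image: typescript-demo-lib:example-tag'

# Same extraction as in the job script: split on spaces, keep the third field.
image="$(printf '%s\n' "$sample_output" | cut -d' ' -f3)"
echo "$image"
```

The third space-separated field is the image reference, which is what docker run then receives.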

Note that the configuration for docker runs with TLS enabled. According to the docs https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#docker-in-docker-with-tls-disabled if we don't control the runner configuration we may need to use non-TLS. However the TLS version currently works with no errors. So I'll keep it as is. I asked a question about this here: https://stackoverflow.com/questions/39608736/docker-in-docker-with-gitlab-shared-runner-for-building-and-pushing-docker-image#comment123832664_39608823

@CMCDragonkai
Member

I have a build failure on application run.

The failure happens because the application build output is usually a link into /nix/store.

And the bin inside is a symlink into /nix/store as well.

lrwxrwxrwx 2 root root 91 Jan  1  1970 bin -> /nix/store/mywd0i4syvn6spdg1ssn9k8zg4g45x0h-typescript-demo-lib-1.1.2/lib/node_modules/.bin/

When this is copied into the builds directory and packaged up, the symlink is saved as is. When it is later unpacked in the application run image, the symlink's target doesn't exist there, so it cannot be executed.

This is why when we ask it to execute ./builds/*/bin/typescript-demo-lib we get a not found error.
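A sketch of how such links could be spotted before packaging. The list_absolute_symlinks helper is hypothetical, and the demo builds a temporary directory with one absolute link (shaped like the bin link above, with a made-up store path) and one relative link:

```shell
#!/bin/sh
# Hypothetical helper: print every symlink under a directory whose target is
# an absolute path. Such links break once the artifact is unpacked on a
# machine that lacks the target.
list_absolute_symlinks() {
  find "$1" -type l | while IFS= read -r link; do
    target="$(readlink "$link")"
    case "$target" in
      /*) printf '%s -> %s\n' "$link" "$target" ;;
    esac
  done
}

# Demo: one absolute link and one relative link in a throwaway directory.
builds="$(mktemp -d)"
mkdir -p "$builds/app/lib"
ln -s /nix/store/example-hash-typescript-demo-lib/lib/node_modules/.bin "$builds/app/bin"
ln -s lib "$builds/app/lib-link"

found="$(list_absolute_symlinks "$builds")"
echo "$found"
```

Only the absolute link is reported; the relative one would survive being unpacked elsewhere.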

We could change the bin symlink somehow to be relative. However, this is not a foolproof solution. The result of nix-build ./release.nix -A application is not a "portable" format, not even between Nix machines. This result is only intended to be used on the Nix system where it was built and can be installed. This is because a nix build produces a closure of outputs, and the result is just one part of that closure, which may point to other parts of the /nix/store.

Therefore if we want to test the resulting application build, we have to instead build the "closure" and export that closure to a Nix-capable system for them to run.

Example of this:

# put this into our artifacts
nix-store --export $(nix-store -qR ./result) > ./tmp/test.closure
# put this into the testing system...
# nix-store --import reads the closure from standard input
nix-store --import < ./tmp/test.closure

Note that nix-store -qR gets all the Nix store paths required:

[nix-shell:~/Projects/TypeScript-Demo-Lib]$ nix-store -qR ./result
  /nix/store/zqaqyidzsqc7z03g4ajgizy2lz1m19xz-libunistring-0.9.10
  /nix/store/xjjdyb66g3cxd5880zspazsp5f16lbxz-libidn2-2.3.1
  /nix/store/wvgyhnd3rn6dhxzbr5r71gx2q9mhgshj-glibc-2.32-48
  /nix/store/066y87dbiza1dygr5icfb306x5hagmv9-bzip2-1.0.6.0.2
  /nix/store/0ng6wv06zd6phmw16qxh0ay7b3gwvw1d-openssl-1.1.1k
  /nix/store/ip0pxdd49l1v3cmxsvw8ziwmqhyzg5pf-attr-2.4.48
  /nix/store/26vpasbj38nhj462kqclwp2i6s3hhdba-acl-2.3.1
  /nix/store/5kk54jcy2pwsns57d4iganj8k53hhvq5-libffi-3.3
  /nix/store/6kgfmzx90c1a6afqnbkz6qprkzss476k-mime-types-9
  /nix/store/h8hr442h0vgsjbhk87fbpgbnbf6v8kq6-gdbm-1.20
  /nix/store/vh5k4xa2zk5l5yrbppa4xjilqmb79ncp-zlib-1.2.11
  /nix/store/hmywamcj5bzfsy558zh074357q5dpqyq-sqlite-3.35.5
  /nix/store/m4x0hqp8amgh987v1862x9hpxpi5rsgj-ncurses-6.2
  /nix/store/sm3k327vqr5xdi4lflckzhn6vj0nkykz-xz-5.2.5
  /nix/store/vrdi9bb2rh2m8swjz59rqs0pl4llc3pp-readline-6.3p08
  /nix/store/x0dcb2rxlzf32g0ddfkqqz1sfcyx4yay-bash-4.4-p23
  /nix/store/y8dxgm8g9rxhxf26bxg45wj0yj7qn59k-expat-2.4.1
  /nix/store/6cfajs6lsy9b4wxp3jvyyl1g5x2pjmpr-python3-3.8.9
  /nix/store/937f5738d2frws07ixcpg5ip176pfss1-coreutils-8.32
  /nix/store/gmsv7hm0wd5siyhi4nsbn1aqpbcbi0cl-perl-5.32.1
  /nix/store/hhg53jz42w3f6lx06rf4bbvvlxd9svhn-openssl-1.1.1k-bin
  /nix/store/a6fls3b45il7g2xr3yv37j7nbi2gskfy-openssl-1.1.1k-dev
  /nix/store/axzis43gg7gprv4acbix3zi2lpx78rmx-zlib-1.2.11-dev
  /nix/store/g9mjpwkpix0j2ccjfhbq6hd9r8cgssa5-gcc-10.3.0-lib
  /nix/store/ks8j7bmpfyak4sgry4dc3p7bzcava670-icu4c-69.1
  /nix/store/nz1jwn5dbszimhdglcnwhxbyhbwcnwik-libuv-1.41.0
  /nix/store/wfwj6nikbs91l994hkhcbkand55xv3c3-icu4c-69.1-dev
  /nix/store/vx4kv0wl3n4xk5cq6l9ilvmhmvqz1fp9-nodejs-14.17.3
  /nix/store/mywd0i4syvn6spdg1ssn9k8zg4g45x0h-typescript-demo-lib-1.1.2

Alternatively one can just test the resulting build in the same job that builds them, because doing the above would only be testing if the nix copy closure is working properly. See: https://nixos.org/manual/nix/unstable/package-management/copy-closure.html

@CMCDragonkai
Member

So now I'm outputting the closure into the artifact. One must be careful with this, since the closure reaches 262 MiB for typescript-demo-lib, as it contains everything. Compression helps: the resulting size drops to 83 MiB. We can pipe the closure through compression when exporting it.
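A sketch of piping the export through a compressor. gzip is an assumption here, since the comment does not say which compressor produced the 83 MiB figure, and the function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: gzip is an assumed compressor, and both function names are
# made up for illustration.

# Export the full closure of a build result as a gzip-compressed file.
export_closure() {
  result="$1"
  out="$2"
  # nix-store -qR lists every store path the result depends on.
  # Word splitting of that list is intended, so it is left unquoted.
  nix-store --export $(nix-store -qR "$result") | gzip > "$out"
}

# Import a compressed closure on a Nix-capable machine.
# nix-store --import reads the closure from standard input.
import_closure() {
  gzip -dc "$1" | nix-store --import
}
```

In the build job this might be called as export_closure ./result ./builds/test.closure.gz, with import_closure ./builds/test.closure.gz in the job that runs on a Nix-capable image. The output path is only an example.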

@CMCDragonkai
Member

OK, the last problem is that the linux run is failing. I had changed it to use the alpine:latest image instead of the original linux tag. It complains that it cannot find the executable, even though the artifact exists. Could this mean that alpine:latest is missing some dependencies required to run the ELF executable?

@CMCDragonkai
Member

Turns out the official alpine image is too small. It doesn't even have bash (see https://stackoverflow.com/questions/40944479/docker-how-to-use-bash-with-an-alpine-based-docker-image). I wonder where it is attempting to run bash at all. I may try a more typical distribution like ubuntu.

@CMCDragonkai
Member

Ideally the execution on linux run, windows run, and macos run should be using:

for f in ./builds/*-linux-*; do "$f"; done

This ensures that it runs each executable one at a time.

Right now the glob might match multiple files.
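A slightly more defensive variant of that loop, as a sketch. The run_each helper is hypothetical; it stops at the first failing executable and also fails when the glob matched nothing, so an empty builds directory does not pass silently. The demo uses made-up stand-in executables in a temporary directory:

```shell
#!/bin/sh
# Hypothetical helper: run every file passed as an argument, one at a time.
# Returns non-zero on the first failure, or if nothing was run at all.
run_each() {
  ran=0
  for f in "$@"; do
    # If the glob matched nothing, the pattern itself is passed in; skip it.
    [ -e "$f" ] || continue
    echo "running $f"
    "$f" || return 1
    ran=$((ran + 1))
  done
  [ "$ran" -gt 0 ]
}

# Demo with two stand-in executables; the file names are made up.
builds="$(mktemp -d)"
for name in demo-linux-x64 demo-linux-arm64; do
  printf '#!/bin/sh\necho "hello from %s"\n' "$name" > "$builds/$name"
  chmod +x "$builds/$name"
done

output="$(run_each "$builds"/*-linux-*)"
status=$?
echo "$output"
```

In the job script the call would be run_each ./builds/*-linux-*.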

I would add this right now, but I need to find out how PowerShell does a similar for loop, as that's what the windows script uses. If someone has PowerShell available, they could work out a loop that globs a list of files and executes each one.

Apparently something like this could work:

Get-ChildItem -File ./builds/*-win32-* | ForEach-Object { & $_.FullName }

@CMCDragonkai
Member

CMCDragonkai commented Nov 21, 2021

Linux build is working finally on ubuntu latest. So it indeed seems that alpine doesn't have bash.

Also I'm not sure where "bash" is being required by the package. Maybe vercel/pkg expects there to be bash.

@CMCDragonkai
Member

Future windows related work should take note of what's available on the windows system: https://gitlab.com/gitlab-org/ci-cd/shared-runners/images/gcp/windows-containers/blob/main/cookbooks/preinstalled-software/README.md

@CMCDragonkai CMCDragonkai marked this pull request as ready for review November 21, 2021 07:40
@CMCDragonkai CMCDragonkai merged commit 3f80ef3 into master Nov 21, 2021
@CMCDragonkai CMCDragonkai deleted the qa-testing branch November 21, 2021 07:40
Labels
development Standard development

2 participants