Replacing machines from Joyent data center #2552

Closed
6 of 7 tasks
mhdawson opened this issue Feb 28, 2021 · 20 comments
Labels
ansible ci-change PSA of configuration changes platform:smartos

Comments

@mhdawson
Member

mhdawson commented Feb 28, 2021

We found out late Friday that the machines in the Joyent data center needed to be moved by next week. These have been moved:

| ip | hostname | changes | halted old? |
|---|---|---|---|
| 147.75.88.51 | test-joyent-ubuntu1804-x64-1 | updated firewall on ci | yes |
| 147.75.88.54 | test-joyent-smartos18-x64-3 | updated firewall on ci | yes |
| 147.75.88.55 | test-joyent-smartos18-x64-4 | updated firewall on ci | yes |
| 147.75.88.53 | release-joyent-smartos18-x64-2 | updated firewall on ci-release | yes |
| 147.75.88.56 | release-joyent-ubuntu1804_docker-x64-1 | updated firewall on ci-release | yes |
| 147.75.88.58 | test-joyent-ubuntu1604_arm_cross-x64-1 | updated firewall on ci | yes |
| 147.75.88.52 | release-joyent-ubuntu1604_arm_cross-x64-1 | updated firewall on ci-release | yes |
| 147.75.88.59 | test-joyent-smartos17-x64-3 | updated firewall on ci | yes |
| 147.75.88.60 | release-joyent-smartos17-x64-2 | updated firewall on ci-release | yes |
| 147.75.88.61 | test-joyent-smartos17-x64-4 | updated firewall on ci | yes |
| 147.75.88.57 | infra-joyent-ubuntu1604-x64-1-new/unencrypted | updated cloudflare | yes |
| 147.75.88.62 | infra-joyent-debian10-x64-1/grafana | updated cloudflare, updated telegraf configs on many machines | yes |
| 139.178.83.227 | infra-joyent-smartos15-x64-1-new | sniff test that looks ok | yes |

The smartos machines below 17 will not be moved over.

We'll need to:

  • Update the ansible inventory file with the new IPs
  • remove smartos16 from node-test-commit-smartos in test ci
  • remove smartos 16 from node-test-addon-api-new
  • remove smartos 15, 16 from the libuv tests - FYI to libuv team asking that they do that - Smartos 15 and 16 machines will no longer be available libuv/libuv#3121
  • remove the smartos 15,16 machines from the CIs and inventory
  • figure out how we can reboot/manage the new machines (ie is there a management system we'll have access to)
  • See if there is a good way to test the new release machines before the next 10, 12 releases (which are the only
    ones that we currently release on for smartos)
@jbergstroem
Member

Just chipping in: Michael and I have done most of the work so far, with lots of help from a person at Joyent. Tomorrow I will review the backup and www mirror, since these are very important to our operations. The final switch for these VMs is likely Monday.

@richardlau richardlau added ansible ci-change PSA of configuration changes platform:smartos labels Feb 28, 2021
@mhdawson
Member Author

mhdawson commented Mar 1, 2021

@richardlau can you take care of the libuv side - making sure team is aware and removing smartos 15,16 from the CI before Wednesday.

@richardlau
Member

> @richardlau can you take care of the libuv side - making sure team is aware and removing smartos 15,16 from the CI before Wednesday.

I've done https://ci.nodejs.org/view/libuv/job/libuv-test-commit-smartos/ (see libuv/libuv#3121). As noted in libuv/libuv#3121 (comment) the only other libuv job runs https://ci.nodejs.org/job/node-test-commit/ so ends up running https://ci.nodejs.org/job/node-test-commit-smartos/ which will need to be updated but that's a Node.js, not libuv, job.

FWIW these are the hits for smartos15/smartos16 in our CI configs after the libuv job was updated:

https://github.com/nodejs/jenkins-config-release

$ grep 'smartos\(15\|16\)' jobs/*.xml
jobs/iojs+release-mdawson-major.xml:        <string>smartos15-release</string>
jobs/iojs+release.xml:        <string>smartos15-release</string>
$

https://github.com/nodejs/jenkins-config-test

$ grep 'smartos\(15\|16\)' jobs/*.xml
jobs/llnode-continuous-integration.xml:alpine-latest-x64 debian8-64 debian9-64 fedora-latest-x64 centos7-ppcle ubuntu1404-64 ubuntu1604-64 rhel72-s390x smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/llnode-pipeline.xml:      debian8-64 fedora22 fedora23 centos7-ppcle ubuntu1404-64 ubuntu1604-64 rhel72-s390x aix71-ppc64 smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/node-inspect.xml:          <description>Space-separated list of nodes, e.g. debian8-64 centos7-ppcle ubuntu1404-64 ubuntu1604-64 rhel72-s390x aix71-ppc64 smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/node-inspect.xml:      description: 'Space-separated list of nodes, e.g. debian8-64 centos7-ppcle ubuntu1404-64 ubuntu1604-64 rhel72-s390x aix71-ppc64
smartos16-64 smartos15-64 win10 win2012r2',
jobs/node-stress-single-test.xml:        <string>smartos15-64</string>
jobs/node-stress-single-test.xml:        <string>smartos16-64</string>
jobs/node-test-commit-smartos.xml:        <string>smartos15-64</string>
jobs/node-test-node-addon-api-new.xml:ubuntu1804-64 smartos16-64 smartos15-64 win10 win2012r2 aix71-ppc64 centos7-ppcle</description>
jobs/node-test-node-addon-api.xml:smartos16-64 smartos15-64 win10 win2012r2 aix71-ppc64</description>
jobs/node-test-node-addon-api.xml:        <string>smartos16-64</string>
jobs/node-test-node-addon-api.xml:        <string>smartos15-64</string>
jobs/node-test-node-addon-api.xml:  <combinationFilter>(MACHINES=="all" || MACHINES.contains(MACHINE)) &amp;&amp; !((NODE_VERSION=="v4" || NODE_VERSION=="v5") &amp;&amp; (MACHINE.contains("s390") || MACHINE.contains("aix"))) &amp;&amp; !((NODE_VERSION=="v6" || NODE_VERSION=="v7") &amp;&amp; (MACHINE.contains("smartos15") || MACHINE.contains("smartos16")))</combinationFilter>
jobs/readable-stream-continuous-integration.xml:smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/readable-stream-pipeline.xml:smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/string_decoder-continuous-integration.xml:smartos16-64 smartos15-64 win10 win2012r2</description>
jobs/string_decoder-pipeline.xml:smartos16-64 smartos15-64 win10 win2012r2</description>
$

Anything in a description is benign (although we can tidy up for appearances) but any <string>...</string> instances need to be updated. Instances in <combinationFilter>...</combinationFilter> could also be tidied up but won't cause problems (they'd just be redundant filtering).
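
The triage rule above (hits inside `<string>` must be updated, hits in a `<combinationFilter>` are redundant but harmless, everything else is benign) could be automated roughly as follows. This is only a sketch: `classify_line` and `scan_jobs` are hypothetical helper names, the check is line-based like the `grep` above rather than a real XML parse, and treating every remaining hit as benign is an assumption based on the output shown here.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Same pattern as the grep above: smartos15 or smartos16.
RETIRED = re.compile(r"smartos(15|16)")
# A retired label that is the value of a <string> element selects a machine.
STRING_VALUE = re.compile(r"<string>\s*[^<]*smartos(?:15|16)[^<]*</string>")


def classify_line(line: str) -> Optional[str]:
    """Return 'must-update', 'filter', 'benign', or None if no retired label appears."""
    if not RETIRED.search(line):
        return None
    if STRING_VALUE.search(line):
        return "must-update"
    if "<combinationFilter>" in line:
        return "filter"
    # Assumption: anything else is descriptive text.
    return "benign"


def scan_jobs(jobs_dir: str) -> Dict[str, List[Tuple[str, int]]]:
    """Group (file name, line number) hits by category for every *.xml file in jobs_dir."""
    hits: Dict[str, List[Tuple[str, int]]] = {"must-update": [], "filter": [], "benign": []}
    for path in sorted(Path(jobs_dir).glob("*.xml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), 1):
            category = classify_line(line)
            if category is not None:
                hits[category].append((path.name, number))
    return hits
```

Running `scan_jobs("jobs")` in a checkout of either config repo would then list the `must-update` hits first, which are the only ones that block removing the machines.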

@mhdawson I think node-test-node-addon-api has been superseded... https://ci.nodejs.org/job/node-test-node-addon-api/ is currently disabled. Perhaps it can be removed entirely?

Let me know if you'd like/need me to update the Node.js release and test jobs.

@richardlau
Member

richardlau commented Mar 1, 2021

Also FWIW in terms of release machines we build Node.js 10 releases on smartos17 and Node.js 12 releases on smartos18 (i.e. we don't use smartos15/smartos16 in the release CI). The reference to smartos15 in iojs+release is something we could have cleaned up previously -- probably when Node.js 8 went End-of-Life according to

// SmartOS -----------------------------------------------
[ /^smartos15/, anyType, lt(8) ],
[ /^smartos15/, anyType, gte(10) ],
[ /^smartos16/, anyType, lt(8) ],
[ /^smartos16/, anyType, gte(12) ],
[ /^smartos17/, anyType, lt(10) ],
[ /^smartos17/, anyType, gte(12) ],
[ /^smartos18/, anyType, lt(12) ],
[ /^smartos18/, releaseType, gte(14) ],
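
To make the effect of these rules concrete, here is a small sketch of how such a table can be evaluated. It assumes each row is an exclusion (a label matching the pattern is skipped when the build type and Node.js major version both match), which is consistent with the statement above that 10 releases build on smartos17 and 12 releases on smartos18. The `can_build` function, the `"release"`/`"test"` type strings, and the example labels are illustrative assumptions, not the actual selector script.

```python
import re
from typing import Callable, List, Tuple

Rule = Tuple[str, Callable[[str], bool], Callable[[int], bool]]


def lt(major: int) -> Callable[[int], bool]:
    """Predicate: Node.js major version is lower than `major`."""
    return lambda version: version < major


def gte(major: int) -> Callable[[int], bool]:
    """Predicate: Node.js major version is `major` or higher."""
    return lambda version: version >= major


def any_type(build_type: str) -> bool:
    return True


def release_type(build_type: str) -> bool:
    # Assumption: release builds are identified by the string "release".
    return build_type == "release"


# Transcribed from the SmartOS rules quoted above.
RULES: List[Rule] = [
    (r"^smartos15", any_type, lt(8)),
    (r"^smartos15", any_type, gte(10)),
    (r"^smartos16", any_type, lt(8)),
    (r"^smartos16", any_type, gte(12)),
    (r"^smartos17", any_type, lt(10)),
    (r"^smartos17", any_type, gte(12)),
    (r"^smartos18", any_type, lt(12)),
    (r"^smartos18", release_type, gte(14)),
]


def can_build(label: str, build_type: str, node_major: int) -> bool:
    """Return False if any exclusion rule matches the label, type, and version."""
    for pattern, type_matches, version_matches in RULES:
        if re.search(pattern, label) and type_matches(build_type) and version_matches(node_major):
            return False
    return True
```

Under these assumptions, `can_build("smartos17-release", "release", 10)` and `can_build("smartos18-release", "release", 12)` are both true, while `can_build("smartos15-release", "release", 10)` is false, which is why the leftover `smartos15-release` entry in iojs+release was already dead configuration.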

@mhdawson
Member Author

mhdawson commented Mar 1, 2021

Yes, I figured out we don't use 15 and 16, which is why I asked that we bring over smartos17 but not those two. (It did not look like there were enough IPs to bring them all over.)

@mhdawson
Member Author

mhdawson commented Mar 1, 2021

> @mhdawson I think node-test-node-addon-api has been superseded... https://ci.nodejs.org/job/node-test-node-addon-api/ is currently disabled. Perhaps it can be removed entirely?

Deleted.

@mhdawson
Member Author

mhdawson commented Mar 1, 2021

@richardlau can you take care of removing the references from the rest of the places you mentioned.

@mhdawson
Member Author

mhdawson commented Mar 1, 2021

I deleted the job related to jobs/iojs+release-mdawson-major.xml: <string>smartos15-release</string>

@jbergstroem
Member

Small update based on last night's work: we only have one machine left to migrate, which is going to be done today.

@richardlau
Member

I've removed the release and test smartos 15 and 16 machines from secrets and opened a PR to remove from the ansible inventory: #2554

@richardlau
Member

I've also removed the release and test smartos 15 and 16 machines from the CI.

@jbergstroem
Member

Update: the backup machine completed the jobs successfully. I've had to remove the bits where we try to back up our MySQL for benchmarks because that machine is no longer available. This should conclude the migration. The next step is opening PR(s) to update our repo.

@mhdawson
Member Author

mhdawson commented Mar 3, 2021

We've turned off all of the old machines and the Joyent data-center is being de-commissioned today. We'll know soon if we updated/migrated everything that was needed :)

@richardlau
Member

Kicked off test builds on smartos only to validate the replacement release machines:
10.x: nodejs/node@ab8d3c5a12: https://ci-release.nodejs.org/job/iojs+release/6715/
12.x: nodejs/node@ce800870b4: https://ci-release.nodejs.org/job/iojs+release/6716/

@richardlau
Member

Did some basic tests (node -p process.versions, 2+2 in repl) with

No issues to report, so I'm happy the new release machines look okay.

@richardlau
Member

Based on the table in the description I've opened #2568 to update the ansible inventory.

@richardlau
Member

@mhdawson @jbergstroem the one remaining task here is

  • figure out how we can reboot/manage the new machines (ie is there a management system we'll have access to)

I'm guessing at this point that the current "process" if we need to do any of that for the Joyent machines is to contact @bahamat? I can update the outdated notes in the secrets repo and close this off, otherwise I'd suggest we open another issue if it's something we want to keep open.

@mhdawson
Member Author

@richardlau +1. Let's just document and close this out.

@richardlau
Member

I've updated the notes for the Joyent machines in the secrets repo.
