-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Keeping old binaries to allow resuming fanned jobs #1496
Comments
If we can get this working before Code + Learn, that could be immensely helpful. |
I've changed the jobs, will monitor the temporary repo. For now the changes were minimal (changed the delete parameter to false). If all goes well, in a couple of months will remove the parameter and the build step. |
https://ci.nodejs.org/job/node-test-commit-windows-fanned/20827/
This job started 6hrs ago. Looks like it did not resume successfully |
It did not work because the test branch was named using:
Which changes in the resumed job. I've changed in to Let's see if that works |
I reverted that because now |
@refack not a git job, but it may receive GIT_COMMIT as a parameter. Here's another try: https://ci.nodejs.org/job/node-test-commit-windows-fanned/jobConfigHistory/showDiffFiles?timestamp1=2018-09-18_22-40-44×tamp2=2018-09-18_22-44-06 |
Resuming at the level of (My test run yesterday: https://ci.nodejs.org/job/node-test-commit-windows-fanned/20840/ and https://ci.nodejs.org/job/node-test-commit-windows-fanned/20841/ , @Trott failed run now: https://ci.nodejs.org/job/node-test-pull-request/17320/ and https://ci.nodejs.org/job/node-test-pull-request/17327/ .) I'm not sure whether we should ignore the |
Let's try it: https://ci.nodejs.org/job/node-test-commit-windows-fanned/jobConfigHistory/showDiffFiles?timestamp1=2018-09-18_22-44-06×tamp2=2018-09-20_23-08-36 . This might cause subtle issues in CI, if the same commit has more than one job running simultaneously. Since the branch name includes the commit SHA, it should be safe, but we might end up testing outdated rebases and that can be confusing. |
I think that is a bug. We should strive to have "resume" use the exact same parameters (in this case a SHA and not a ref for the rebase point). I'll look at that... |
IMHO we can consider skipping compilation job altogether, if we find the binary branch... |
I don't think that's going to be possible (but would be great to find a way!). The only thing you have is "rebase against the base branch", so you have to find out what the "base branch" is somewhere. Resuming at that level will calculate the SHA again... unless we find a way to know the job is a resume.
For new builds or rebuilds, it should always rebase against the latest and compile again. It would be nice to have the resume feature working well, but it's not worth interfering with fresh builds because of it. I've changed the compile job to delete the binary branch if found, before starting to compile. This is to make sure that if the compile job fails the test job doesn't simply pick an old branch and tests that. If the current approach has issues, we could try to propagate a branch name from the top, from the test-PR job. Then, allow resuming only on the test-commit level and below. But this is one more bit of complexity, so I'd avoid it if possible. |
needed two more changes
|
@joaocgreis PTAL
|
Currently, we delete the temporary branch at the end of the fanned jobs. This means that when a fanned job is resumed, if the compile job succeeded only the test job will run again, and will not be able to find the temporary branch from last run.
I'd like to stop deleting temporary branches at the end of the runs, but this change is not urgent so it should be discussed with @nodejs/build first. Also, it would be good to do this when we have people available in the next few days to put out any fire.
At first, this sounds like the size of the binary repo could become a problem. However, git keeps references around long after the branches are deleted, so this might have a very small impact. Note that we have a job that runs daily and deletes branches for which the Jenkins job is no longer accessible, so branches will still get deleted after about one week.
Emergency cleanup of the binary repo
The text was updated successfully, but these errors were encountered: