Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CI] skip docs-only jobs take #2 #8853

Merged
merged 10 commits into from
Dec 1, 2020
Merged

Conversation

stas00
Copy link
Contributor

@stas00 stas00 commented Nov 30, 2020

So we discovered CircleCI has a problem and pipeline.git.base_revision is unreliable - not always set - breaking the test. #8826 (comment)

We had a few PRs incorrectly skipping the jobs, as in this example: https://app.circleci.com/pipelines/github/huggingface/transformers/16541/workflows/17b20230-8d7c-4b36-813c-2681f2c8a977/jobs/128232

It's missing << pipeline.git.base_revision >> in

if git diff --name-only << pipeline.git.base_revision >>...<< pipeline.git.revision >> | egrep -qv '\.(md|rst)$'

resulting in:

if git diff --name-only ...5170e5381b9fccdfb9405d665ecee0515efc6453 | egrep -qv '\.(md|rst)$'

and hence fails the test. (it's missing the first hash before ...).

This PR checks that the external variables pipeline.git.base_revision and pipeline.git.revision are set before we do the test. Should one of them be not set, the whole test is skipped and the job continues normally, regardless of whether it's docs only or not.

Meanwhile I filed a question about why pipeline.git.base_revision is not always set:
https://discuss.circleci.com/t/pipeline-git-base-revision-is-often-empty-which-reliable-variable-to-use/38301

Let's merge it at a time that one of us can monitor the next few PRs in case we need to back it out again.

If you have to back it out - you only need to comment out this line: circleci step halt and leave the invocations in place.

@sgugger, @LysandreJik

@stas00 stas00 changed the title [wip] debugging CI skip docs [CI] skip docs-only jobs take #2 Dec 1, 2020
Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for getting to the bottom of this @stas00 !

Copy link
Member

@LysandreJik LysandreJik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes! I'll monitor the jobs this afternoon.

@LysandreJik LysandreJik merged commit 21db560 into huggingface:master Dec 1, 2020
@stas00 stas00 deleted the ci-skip-docs2 branch December 1, 2020 18:28
@LysandreJik
Copy link
Member

Github actions decided to be down right at the moment where I wanted to monitor 😞

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

Github actions? We are only doing this on circleCI - so far all seems to be working fine.

@LysandreJik
Copy link
Member

Hah, was so disappointed I let it blind me and think it was all CI. Will continue.

@LysandreJik
Copy link
Member

I'm pretty sure the CI should have run on the latest pipeline, as it did in the ones preceding it: https://app.circleci.com/pipelines/github/huggingface/transformers?branch=conda-ci

Changes were done to .yml files.

@LysandreJik
Copy link
Member

I believe the issue might come from the fact that it is looking at the build commit (d26ca66) and comparing it to the previous build's commit (4780c80). It doesn't find any files, as the previous build commit (4780c8...) doesn't exist anymore, as I force pushed the branch, overwriting that commit's data.

Maybe checking for empty results in the diff would help in that regard?

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

OK, I found when we don't have pipeline.git.base_revision defined - it happens when PR is opened via github file edit. As I just did here: #8884
You can see then what happens: https://app.circleci.com/pipelines/github/huggingface/transformers/16617/workflows/fea80cc9-2093-4053-b3c3-f315632ab3a6/jobs/129069

#!/bin/bash -eo pipefail
# pipeline.git.base_revision is not always defined, so only proceed if all external vars are defined
if test -n "" && test -n "32f03035ce5b23abd8a1659f24f04b298319ae78"
then
    if git diff --name-only ...32f03035ce5b23abd8a1659f24f04b298319ae78 | egrep -qv '\.(md|rst)$'
    then
        echo "Non-docs were modified in this PR, proceeding normally"
    else
        echo "Only docs were modified in this PR, quitting this job"
        circleci step halt
    fi
else
    echo "Can't perform skipping check w/o base_revision defined, continuing the job"
fi

Can't perform skipping check w/o base_revision defined, continuing the job

CircleCI received exit code 0

So the workaround worked. The job continued normally.

@sgugger
Copy link
Collaborator

sgugger commented Dec 1, 2020

Oh I always open PRs like that, so it was indeed the culprit for the two other times we saw that happen.
That does not make sense since the I open the PR on GitHub but the CI runs on a commit. So forget I said anything!

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

I believe the issue might come from the fact that it is looking at the build commit (d26ca66) and comparing it to the previous build's commit (4780c80). It doesn't find any files, as the previous build commit (4780c8...) doesn't exist anymore, as I force pushed the branch, overwriting that commit's data.

Maybe checking for empty results in the diff would help in that regard?

OK, so this is another edge case. So this pipeline thing is totally borked :( Why can't it give us a normal commit range of the PR.

So let's comment out circleci step halt and I will work on take #3 that will be much more elaborate. Should I do it or will you? I'm not sure if it's ok to commit directly.

@LysandreJik
Copy link
Member

You can comment it out! Thanks!

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

So the proposed logic for take 3 will be:

  1. if pipeline.git.base_revision and pipeline.git.revision are defined
  2. if git diff --name-only range returns anything
  3. if what it returned in 2 is just docs
  4. then skip

stas00 added a commit that referenced this pull request Dec 1, 2020
@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

You can comment it out! Thanks!

Done.

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

Oh I always open PRs like that, so it was indeed the culprit for the two other times we saw that happen.
That does not make sense since the I open the PR on GitHub but the CI runs on a commit. So forget I said anything!

I'm not sure what you're saying - I think the point is that CircleCI can't find the branching point when the change is done via github file edit.

Note that git diff --name-only $(git merge-base --fork-point master) doesn't work on CirlceCI - otherwise we would have figured out the range ourselves.

@sgugger
Copy link
Collaborator

sgugger commented Dec 1, 2020

Yes I don't do commit by editing files on GitHub, just the PR part, that's why I scratched what I was saying.

@stas00
Copy link
Contributor Author

stas00 commented Dec 1, 2020

Thank you for clarifying that, @sgugger. Then this lack of pipeline.git.base_revision appears to be random then.

stas00 added a commit to stas00/transformers that referenced this pull request Dec 5, 2020
* restore skip

* Revert "Remove deprecated `evalutate_during_training` (huggingface#8852)"

This reverts commit 5530299.

* check that pipeline.git.base_revision is defined before proceeding

* Revert "Revert "Remove deprecated `evalutate_during_training` (huggingface#8852)""

This reverts commit dfec84d.

* check that pipeline.git.base_revision is defined before proceeding

* doc only

* doc + code

* restore

* restore

* typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants