Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add check_mode in localhost binary cache task #431

Merged
merged 1 commit into from
Oct 18, 2024

Conversation

santilococo
Copy link
Contributor

@santilococo santilococo commented Oct 18, 2024

This PR adds the check_mode: false in the localhost binary cache task.

This is needed because the next task, which downloads the prometheus binary, expects the folder to already be created. Without check_mode: false, the task only runs in "simulation" mode and doesn't create the folder.

This code was introduced in the most recent PR #425: https://github.com/prometheus-community/ansible/pull/425/files#diff-c9a8a7fbd7ee9c40d13e4be2e88801e621b943a2246b78ad46b93dc0f7c33971R38

You can see that this task did not exist before: https://github.com/prometheus-community/ansible/pull/425/files#diff-6f9278158452a81913bff59d794daa00e142796b04e75efd873c4a5dbb7f2899L38

Fixes #430

Signed-off-by: Santiago Lo Coco <santilococo.01@gmail.com>
@erikgb
Copy link

erikgb commented Oct 18, 2024

LGTM! Thanks @santilococo!

@erikgb
Copy link

erikgb commented Oct 18, 2024

Can we get this important bugfix reviewed, merged, and released ASAP?

@voidquark
Copy link

Same problem on our side. LGTM 🚀

@SuperQ SuperQ requested a review from gardar October 18, 2024 12:51
Copy link
Contributor

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@SuperQ
Copy link
Contributor

SuperQ commented Oct 18, 2024

Test errors are due to github rate limits. Will retry them.

@SuperQ
Copy link
Contributor

SuperQ commented Oct 18, 2024

@gardar do we really need to run all the role tests when updating _common?

@gardar
Copy link
Member

gardar commented Oct 18, 2024

@gardar do we really need to run all the role tests when updating _common?

Yeah since it's affecting all the roles I'm afraid it's the only way to catch potential errors.
But I agree, the number of tests is getting pretty ridiculous 😅, I think we need to tone it down a bit.

@SuperQ
Copy link
Contributor

SuperQ commented Oct 18, 2024

I think for _common we can get enough coverage by just testing the prometheus role.

@oc
Copy link

oc commented Oct 18, 2024

Great, thanks! LGTM!

@erikgb
Copy link

erikgb commented Oct 18, 2024

Maybe in a follow-up PR, but could it make sense to mark all these download-related tasks as changed: false. In our container-based Ansible controller setup, these tasks are always marked as changed - which is correct. But pretty annoying when looking at the summary of the play.

It also appears that Prometheus is always restarted (by running the restart handler). I think this is caused by this unnecessary "changed" tasks, but not 100% sure. I will monitor this bad behavior when a new release is out.

@gardar
Copy link
Member

gardar commented Oct 18, 2024

Maybe in a follow-up PR, but could it make sense to mark all these download-related tasks as changed: false. In our container-based Ansible controller setup, these tasks are always marked as changed - which is correct. But pretty annoying when looking at the summary of the play.

No I don't think that's a good idea to fake the "changed" status like that, you should be able to see when the download is actually being executed.
I'd suggest you use a persistent volume for the local cache path instead, it should save you some download time and you would avoid being rate limited by github.

It also appears that Prometheus is always restarted (by running the restart handler). I think this is caused by this unnecessary "changed" tasks, but not 100% sure. I will monitor this bad behavior when a new release is out.

No the download tasks don't notify the handlers, so there must be some other task that's returning the changed status.

@erikgb
Copy link

erikgb commented Oct 18, 2024

Maybe in a follow-up PR, but could it make sense to mark all these download-related tasks as changed: false. In our container-based Ansible controller setup, these tasks are always marked as changed - which is correct. But pretty annoying when looking at the summary of the play.

No I don't think that's a good idea to fake the "changed" status like that, you should be able to see when the download is actually being executed. I'd suggest you use a persistent volume for the local cache path instead, it should save you some download time and you would avoid being rate limited by github.

We have a proxy in-between, since GitHub is not directly available. Semi air-gapped environment. I think changes to the Ansible controller should be safe to ignore. But point taken, we can agree to disagree. 😉

A persistent disk is not a feasible alternative for us. Try getting that on GitLab Docker CI/CD runners!

It also appears that Prometheus is always restarted (by running the restart handler). I think this is caused by this unnecessary "changed" tasks, but not 100% sure. I will monitor this bad behavior when a new release is out.

No the download tasks don't notify the handlers, so there must be some other task that's returning the changed status.

Thanks! I will debug this when a new release is available.

@gardar
Copy link
Member

gardar commented Oct 18, 2024

A persistent disk is not a feasible alternative for us. Try getting that on GitLab Docker CI/CD runners!

We are getting way off topic here, but both github/gitlab offer cache features which you could probably use for that. If you figure out how to do it then it might be useful to add it do the docs 😄
https://docs.gitlab.com/ee/ci/caching/
https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows#using-the-cache-action

@Frazew
Copy link

Frazew commented Oct 18, 2024

No I don't think that's a good idea to fake the "changed" status like that, you should be able to see when the download is actually being executed.

I'm jumping in because I wanted to suggest the exact same check_mode: false fix and found this PR. However I would also argue that changed: false should be put on this folder creation and download tasks. Otherwise the role is not very clearly idempotent: running it twice still shows tasks as changed even though nothing changed on the remote host.

imo these downloads are anyway local prerequisites for the role to run correctly, they're not indicating any changed action on the remote host, that's the job of the Propagate binaries task

@gardar gardar merged commit 0e58f43 into prometheus-community:main Oct 18, 2024
611 of 619 checks passed
Copy link
Contributor

Docs Build 📝

Thank you for contribution!✨

This PR has been merged and the docs are now incorporated into main:
https://prometheus-community.github.io/ansible/branch/main

@Frazew
Copy link

Frazew commented Oct 18, 2024

Just a quick comment here to say that I opened #433 to move the changed_when discussion there, I've tried to summarize the various arguments for/against, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment