Ark doesn't handle failed Chef run correctly #96
Comments
Without the the Taking action every time is not very "test and repair" friendly. But what you point out is absolutely correct. If there is a failure during any of the steps, the whole system is broken unless:
I am sure that I could capture the state of the resources and look to see if they are not successful. If one were to fail I imagine that I could:
Neither of them is great. The first one seems messy. The second requires querying and reacting to issues with resources. Do you have any additional thoughts? This is similar to issue #100.
It turns out that Chef makes this problem really hard to fix. Based on the idea of using marker files from https://github.com/mkocher/chef_deploy/blob/master/chef/cookbooks/joy_of_cooking/libraries/marker.rb, something like this almost works:
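The snippet from this comment is not preserved here. As a rough illustration of the marker-file idea only, here is a plain-Ruby sketch (not Chef DSL, and not taken from the linked marker.rb). The helper name `run_once_with_marker` and the marker path are hypothetical. The work is skipped when the marker exists, and the marker is written only after every step has succeeded, so a halted run gets retried on the next pass.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: run the block unless the marker file already exists.
# The marker is written only after the block returns without raising.
def run_once_with_marker(marker_path)
  return :skipped if File.exist?(marker_path)

  yield
  FileUtils.mkdir_p(File.dirname(marker_path))
  FileUtils.touch(marker_path)
  :ran
end

Dir.mktmpdir do |dir|
  marker = File.join(dir, "ark_dump.done")

  # First run halts partway through, so no marker is left behind.
  begin
    run_once_with_marker(marker) { raise "simulated halt during extract" }
  rescue RuntimeError
    # The run failed; the marker file must not exist at this point.
  end

  run_once_with_marker(marker) { } # retried, because the marker is absent
  run_once_with_marker(marker) { } # skipped, because the marker now exists
end
```

The trade-off is that the marker, not the real state of the files, decides whether anything runs, so a change made by hand after a successful run goes unnoticed.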
"Test and repair" friendly:
But there are problems. The ideal solution would be a "before" notification, which has been discussed but is not included in Chef. With a before notification, once
While trying to fix #93 I discovered that ark can leave a server in an unconfigured state if Chef halts during convergence. Here is the abbreviated code for the "dump" provider that shows why:
If the Chef run halts immediately after `remote_file`, then the next time Chef runs it will detect that the file is already downloaded and skip the `execute` and `set owner` steps. Also, in the event a bad sysop manually sets the file owner, the owner won't get reset during the Chef run as it should.
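To make the failure mode concrete, here is a small plain-Ruby simulation. It is not Chef code and not the provider's actual implementation; `STEPS`, `HaltedRun`, `converge_chained`, and `converge_independent` are hypothetical names. The chained version models later steps that fire only when the download changed something. The independent version models every step checking its own state on every run.

```ruby
STEPS = [:downloaded, :extracted, :owner_set].freeze

class HaltedRun < StandardError; end

# Chained style: if the first step is already up to date, nothing is
# notified, so the remaining steps never run.
def converge_chained(state, halt_after: nil)
  return state if state[:downloaded]

  STEPS.each do |step|
    state[step] = true
    raise HaltedRun, "halted after #{step}" if halt_after == step
  end
  state
end

# Independent style: each step tests and repairs its own state every run.
def converge_independent(state, halt_after: nil)
  STEPS.each do |step|
    state[step] = true unless state[step]
    raise HaltedRun, "halted after #{step}" if halt_after == step
  end
  state
end

def fresh_state
  { downloaded: false, extracted: false, owner_set: false }
end

chained = fresh_state
begin
  converge_chained(chained, halt_after: :downloaded)
rescue HaltedRun
  # The run stopped right after the download.
end
converge_chained(chained) # second run: extract and owner steps stay undone

independent = fresh_state
begin
  converge_independent(independent, halt_after: :downloaded)
rescue HaltedRun
  # Same halt point as above.
end
converge_independent(independent) # second run: remaining steps are repaired
```

The same simulation covers the manual-change case: after a full converge, setting `state[:owner_set] = false` by hand is repaired by `converge_independent` but ignored by `converge_chained`.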
So I guess the underlying question is why use `action :nothing` and notify the next resource in the chain instead of simply using `action :<default>` for each action?