Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #40, move threader shutdown after all work has completed #42

Closed
wants to merge 1 commit into from

Conversation

dannyk81
Copy link
Collaborator

@pryorda @Jc2k this fixes #40 for me

please take a look

@Jc2k
Copy link
Collaborator

Jc2k commented Dec 21, 2018

The threader shutdown is where it was because there is a dependency between the vm metrics and the vm perf counters (it assumes self._labels is populated already). So this is probably introducing a non deterministic race 😞

@dannyk81
Copy link
Collaborator Author

Ohhh... I see it now.

Well, looks like using the threader after it was already shutdown is causing some issues.

Perhaps the dependency can be more explicitly defined? maybe using separate execute contexts?

@Jc2k
Copy link
Collaborator

Jc2k commented Dec 21, 2018

Ahh! Because you can’t submit jobs after shutdown is called! Of course!

Well I can probably get rid of the nested threads all together with a property collector (it can fetch all data for all hosts and data stores in one call). Will try and look at doing that.

@dannyk81
Copy link
Collaborator Author

That'd be great @Jc2k, I'm going to close this PR on account that this fix is not safe.

@pryorda FYI though that v0.3.0 is currently broken for hosts and datastores, but with @Jc2k's help we should be able to fix this.

@dannyk81 dannyk81 closed this Dec 21, 2018
@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

@Jc2k @dannyk81 Good finds. I'm surprised we didn't notice this in the PR. I think pushing the shutdown lower down should be fine. @Jc2k Can you elaborate on the Race Condition you're worried about?

@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

@dannyk81 I don't want to close this just yet. This might be the immediate fix, but not the log term solution.

@pryorda pryorda reopened this Dec 21, 2018
@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

I think it will be beneficial to add #41 and #42 to a test branch before merging to v0.3.1 and i'll get that done asap for testing. 1-2hr

@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

Release updated: https://github.com/pryorda/vmware_exporter/releases/tag/v0.3.0

@Jc2k
Copy link
Collaborator

Jc2k commented Dec 21, 2018

I think that this PR is unsafe, and is trading one race for another. After this PR is merged the extra code that runs early depends on self._labels. That structure is generated on a thread. The current way ensures that structure is populated before the code that uses it runs. The new way allows it to run before it exists.

@dannyk81
Copy link
Collaborator Author

@pryorda I didn't see the issue that pushing down the shutdown was creating, but this fix as-is creates another problem.

I suggest to close this and work on a fix which will reconcile both races properly.

@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

We should able to force a thread completion before moving on. I think you're right in saying that maybe we should remove some of the nested threading

@pryorda
Copy link
Owner

pryorda commented Dec 21, 2018

Do we want to continue the conversation here or move it to #41 and #40?

@dannyk81
Copy link
Collaborator Author

I suggest we close this one and discuss this over in #40

@pryorda pryorda closed this Dec 21, 2018
@pryorda pryorda deleted the fix_40 branch January 22, 2019 05:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants