Project Specific Test Probes #3009

Closed
lfield opened this issue Feb 7, 2019 · 4 comments

Comments

@lfield
Contributor

lfield commented Feb 7, 2019

What I would like to be able to do is write a test probe that can be run to check the sanity of the host before a task is downloaded. For example, testing that VirtualBox is available. If this probe fails, no task will be downloaded. These probes would be project specific. The aim is to avoid situations where tasks fail due to known and detectable host issues that are application specific. I would be happy to provide a PR if we can agree on the solution.
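
To make the request concrete, here is a minimal sketch of what such a probe could look like. It is not an existing BOINC interface: the function names and the idea of gating the work request on a list of probes are hypothetical, and checking for `VBoxManage` on `PATH` is only an assumed stand-in for "VirtualBox is available" (an installation may exist without being on `PATH`).

```python
import shutil
from typing import Callable, Iterable, Optional


def virtualbox_available(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Hypothetical probe: pass if a VBoxManage executable is found on PATH.

    The lookup function is injectable so the probe can be exercised
    without VirtualBox installed.
    """
    return which("VBoxManage") is not None


def should_request_tasks(probes: Iterable[Callable[[], bool]]) -> bool:
    """Hypothetical gate: request work only if every project probe passes."""
    return all(probe() for probe in probes)


if __name__ == "__main__":
    if should_request_tasks([virtualbox_available]):
        print("host looks sane, tasks may be downloaded")
    else:
        print("probe failed, no task will be downloaded")
```

Keeping each probe a plain function that returns a boolean would let a project ship several independent checks and fail fast on the first one that does not pass.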

@davidpanderson
Contributor

Another approach would be to have the validator check for specific error messages or exit codes, and update the host record (or some new table) in a way that prevents more jobs of that type from being sent to the host.

@lfield
Contributor Author

lfield commented Feb 8, 2019

Is the validator run against results that returned an error?

@davidpanderson
Contributor

No; we'd have to add an option.

The other question is where to store the info.
The natural place is the host_app_version table.
This has a field max_jobs_per_day that could be used;
e.g. -1 means don't use this app version for this host.
I'd have to make some small changes for this.
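
A rough sketch of how the -1 convention could be read when deciding whether to send a job. This is a simplified in-memory model, not the actual BOINC scheduler code: only `max_jobs_per_day` is named above, and the other field names, the `DISABLED` constant, and `can_send_job` are assumptions made for illustration.

```python
from dataclasses import dataclass

# Sentinel proposed in this thread: -1 means "don't use this app version
# for this host". The constant name is hypothetical.
DISABLED = -1


@dataclass
class HostAppVersion:
    """Simplified stand-in for one row of the host_app_version table."""
    host_id: int
    app_version_id: int
    max_jobs_per_day: int
    n_jobs_today: int = 0  # assumed per-day counter, for illustration only


def can_send_job(hav: HostAppVersion) -> bool:
    """Return False if the app version is disabled or the daily quota is used up."""
    if hav.max_jobs_per_day == DISABLED:
        return False
    return hav.n_jobs_today < hav.max_jobs_per_day
```

Checking the sentinel before the quota comparison matters here: without that early return, a limit of -1 would still block the host, but only by accident of the `<` comparison rather than by explicit intent.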

@lfield
Contributor Author

lfield commented Feb 14, 2019

Yes, reducing this to one per day should be fine. Relatedly, I am using the wrapper, and my script returns 206 if there is a problem. However, it looks like this value is not being passed through: the wrapper returns 195 (EXIT_CHILD_FAILED) instead. Another useful feature would be the ability to reset max_jobs_per_day for a specific host. That way a misconfigured node could be excluded for a day, but a person trying to fix the problem could still get new tasks when they need to test.

So yes, something that lets us parse stderr for error messages and set max_jobs_per_day to 1 would be sufficient.
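
As an illustration of that last point, a minimal sketch of the stderr check. The error patterns, the function name, and the idea of returning the new limit are all hypothetical; a real project would supply its own messages, and the value would have to be written back to host_app_version by the server-side code.

```python
import re
from typing import Sequence

# Hypothetical, project-specific messages that indicate a host problem.
KNOWN_HOST_ERRORS = [
    re.compile(r"virtualbox is not installed", re.IGNORECASE),
    re.compile(r"vm environment needs to be cleaned up", re.IGNORECASE),
]

THROTTLED_LIMIT = 1  # "one per day", as discussed above


def limit_after_result(
    stderr_text: str,
    current_limit: int,
    patterns: Sequence[re.Pattern] = KNOWN_HOST_ERRORS,
) -> int:
    """Return the max_jobs_per_day to store after inspecting a result's stderr.

    If any known host error appears, throttle to one job per day;
    otherwise leave the current limit unchanged.
    """
    if any(p.search(stderr_text) for p in patterns):
        return THROTTLED_LIMIT
    return current_limit
```

Matching on stderr rather than on the exit code sidesteps the 206 versus 195 problem described above, at the cost of depending on the exact wording of the error messages.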
