Agent job refactor #371

Draft
wants to merge 9 commits into
base: main

boukeas
Contributor

@boukeas boukeas commented Oct 4, 2024

Description

This PR is a refactoring of the main Testflinger agent loop, i.e. the TestflingerAgent.process_jobs method which controls how an agent picks up a job and controls the execution of its individual phases.

A lot of functionality has been abstracted away from the TestflingerAgent.process_jobs method into the TestflingerJob class and the newly introduced JobPhase class and its derived classes.
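The PR's job.py is not reproduced in this conversation, so the following is only a rough sketch of the general shape such a refactoring could take: one object per phase, with the loop reduced to iterating over them. The `run` interface, the `phase_id` attribute, the two example subclasses, and the `run_phases` helper are all hypothetical and are not taken from the PR.

```python
from abc import ABC, abstractmethod


class JobPhase(ABC):
    """Hypothetical base class: one object per phase of a job."""

    phase_id = "base"

    @abstractmethod
    def run(self, job_data: dict) -> int:
        """Execute the phase and return an exit code (0 means success)."""


class SetupPhase(JobPhase):
    """Illustrative phase that always succeeds."""

    phase_id = "setup"

    def run(self, job_data: dict) -> int:
        return 0


class ProvisionPhase(JobPhase):
    """Illustrative phase that fails when the job carries no provision data."""

    phase_id = "provision"

    def run(self, job_data: dict) -> int:
        return 0 if job_data.get("provision_data") else 1


def run_phases(phases, job_data):
    """Run phases in order and stop at the first non-zero exit code.

    Returns a list of (phase_id, exit_code) pairs for the phases that ran.
    """
    results = []
    for phase in phases:
        exit_code = phase.run(job_data)
        results.append((phase.phase_id, exit_code))
        if exit_code != 0:
            break
    return results
```

With this shape, the agent loop only needs to know the order of the phases; what each phase does lives in its own class.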

Note to reviewers

The structural changes are significant and focusing narrowly on the diff in order to review this sort of refactoring would be disorientating and make little sense. Please read and review TestflingerAgent.process_jobs and the job.py module in their entirety.

Resolved issues

Documentation

Web service API changes

Tests

Collaborator

@plars plars left a comment

This is one spot that's been begging for a refactor for a while now. I really like your idea and where you are taking it, thanks for taking this on!
I know it's still a draft and some obvious things are unfinished. I was actually able to get it running through some phases in local testing: I set up some fake commands for it to run, had it attach to a server I was running locally in Docker, then submitted a job to that server for it to pick up. I did find one corner case: because it was a new agent, it didn't have the directory structure in place (results, logs, run...). There's an os.makedirs() you removed that I think we'll need to either keep or handle somewhere else, to ensure that the run dir exists before we try to drop the job file into it.
Looking good so far!
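One way to make the directory concern above moot, wherever the call ends up living, is to create the run directory idempotently. This is a minimal sketch, not code from the PR: `ensure_rundir` is a hypothetical helper name, and only the `execution_basedir` setting and the per-job subdirectory come from the snippet under review.

```python
import os
import tempfile


def ensure_rundir(execution_basedir: str, job_id: str) -> str:
    """Create <execution_basedir>/<job_id> if needed and return its path.

    exist_ok=True makes the call safe to repeat, and os.makedirs also
    creates any missing parent directories (e.g. on a brand-new agent).
    """
    rundir = os.path.join(execution_basedir, job_id)
    os.makedirs(rundir, exist_ok=True)
    return rundir


# Usage: the base directory does not exist yet, and calling twice is harmless.
base = tempfile.mkdtemp()
first = ensure_rundir(os.path.join(base, "run"), "job-1")
second = ensure_rundir(os.path.join(base, "run"), "job-1")
```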

rundir = os.path.join(
    self.client.config.get("execution_basedir"), job.job_id
)
os.makedirs(rundir)
Collaborator

Not really done testing this, but wanted to get started on it... I did see this so far. Because it never forces creation of this dir, if it's missing, then it'll result in problems later:

[24-10-08 19:01:02]   ERROR: (cmd.py:32)| cannot access local variable 'job' where it is not associated with a value
Traceback (most recent call last):
  File "/home/plars/work/cert/testflinger/agent/testflinger_agent/cmd.py", line 27, in main
    start_agent()
  File "/home/plars/work/cert/testflinger/agent/testflinger_agent/__init__.py", line 139, in start_agent
    agent.process_jobs()
  File "/home/plars/work/cert/testflinger/agent/testflinger_agent/agent.py", line 159, in process_jobs
    if job.go(phase):
       ^^^
UnboundLocalError: cannot access local variable 'job' where it is not associated with a value

Contributor Author

Because it never forces creation of this dir, if it's missing

The rundir is created here, when the job object is __init__ed.

That error is raised in the finally clause because the job object was never properly created, so the agent log should contain an indication of what the actual error was. I just don't think the cause is the rundir not being created.
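The failure mode described here can be reproduced in isolation. The sketch below is not the agent's actual loop; `FakeJob`, `fragile`, `robust`, `failing_factory`, and `failure_of` are hypothetical names used only to show how a `finally` clause that touches a never-assigned local raises an UnboundLocalError on top of the real error, and how initialising the variable first avoids that.

```python
class FakeJob:
    """Stand-in for a job object; only tracks whether cleanup ran."""

    def __init__(self):
        self.cleaned = False

    def cleanup(self):
        self.cleaned = True


def failing_factory():
    """Simulates job creation failing before the object exists."""
    raise OSError("could not create job")


def fragile(make_job):
    try:
        job = make_job()
    finally:
        # If make_job() raised, 'job' was never bound, so this line raises
        # UnboundLocalError and the original error is only kept as __context__.
        job.cleanup()
    return job


def robust(make_job):
    job = None  # bind the name before the try block
    try:
        job = make_job()
    finally:
        if job is not None:
            job.cleanup()
    return job


def failure_of(func, factory):
    """Return the exception raised by func(factory), or None if it succeeds."""
    try:
        func(factory)
    except Exception as exc:
        return exc
    return None
```

In the fragile variant the original exception is still reachable through the `__context__` attribute of the UnboundLocalError, which is why the underlying cause should still be discoverable in the log or traceback chain.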
