Fix randomly failing test test_enqueue_once_after_enqueue #17950

Closed

jeremystretch opened this issue Nov 7, 2024 · 2 comments · Fixed by #18062
Labels: netbox, status: accepted, type: housekeeping

Comments

@jeremystretch
Member

Proposed Changes

The test netbox.tests.test_jobs.EnqueueTest.test_enqueue_once_after_enqueue occasionally fails for an unknown reason (see this example). This needs to be investigated and resolved.

Justification

CI tests should always pass reliably.

@jeremystretch added the type: housekeeping and status: needs owner labels on Nov 7, 2024
@bctiemann
Contributor

The exception is being raised from here, in rq/job.py:

        if refresh:
            status = self.connection.hget(self.key, 'status')
            if not status:
                raise InvalidJobOperation(f"Failed to retrieve status for job: {self.id}")
            self._status = JobStatus(as_text(status))

self.connection at that point when running locally is a redis.client.Redis instance. What is the setup in CI? Does it have Redis available? Or does this connection need to be mocked?
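
As a self-contained illustration of where that status lives (this assumes a local Redis server and plain rq, not whatever backend CI uses):

    from redis import Redis
    from rq import Queue

    conn = Redis()                        # assumes a local Redis server, as in my environment
    q = Queue('default', connection=conn)
    job = q.enqueue(print, 'hello')

    print(job.key)                        # the Redis hash key for the job, e.g. b'rq:job:<uuid>'
    print(conn.hget(job.key, 'status'))   # e.g. b'queued'; a None here is what triggers the exception above
    print(job.get_status(refresh=True))   # same field, read through the rq code quoted above

In other words, the status is just a field in a Redis hash keyed by job.key; if that hash is missing or has no status field, get_status(refresh=True) raises InvalidJobOperation.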

It seems like whatever caching backend is present in CI is intermittently failing to return a result for the key of the job being enqueue_once'd:

        job1 = TestJobRunner.enqueue(instance, schedule_at=self.get_schedule_at())
        job2 = TestJobRunner.enqueue_once(instance, schedule_at=self.get_schedule_at(2))
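        # ... inside enqueue_once(), reached by the job2 call above: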
            # If the job parameters haven't changed, don't schedule a new job and keep the current schedule. Otherwise,
            # delete the existing job and schedule a new job instead.
            if (schedule_at and job.scheduled == schedule_at) and (job.interval == interval):
                return job
            job.delete()

We are trying to delete the job (because enqueue_once was called with new parameters), but its key is not in the cache when the above rq code is called, so it raises the exception. Maybe this is because the first instance of the job has already completed by the time the second one is enqueued? But I would think the key would still be present with a status of finished, rather than not being there at all.
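
As a rough sketch of the kind of guard that would rule that out (illustrative only, not the actual test code; the helper name and timings are made up), the test could wait until job1's hash is actually visible in Redis before calling enqueue_once:

    import time

    def wait_for_job_in_redis(job, timeout=5.0, interval=0.1):
        """Poll Redis until the job's hash has a 'status' field, or give up."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if job.connection.hget(job.key, 'status') is not None:
                return True
            time.sleep(interval)
        return False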

Since I have Redis in my local environment, this always works properly, but I suspect the setup in CI is different.

@jsenecal
Contributor

HA! It was driving me mad yesterday, thanks for flagging this @jeremystretch

@jeremystretch added the netbox label on Nov 19, 2024 (via Linear)
@jeremystretch added the status: accepted label and removed the status: needs owner label on Nov 21, 2024
jeremystretch pushed a commit that referenced this issue on Nov 21, 2024: …st (#18062)

* Wait until job1 exists in Redis before enqueueing job2

* Job can exist but not have status

* Catch InvalidJobOperation and use as trigger for retry

* Catch InvalidJobOperation when deleting/canceling job

* Remove testing code
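
For anyone reading along, a minimal sketch of the retry idea described in those commit messages (the helper name and retry parameters are illustrative, not the code actually merged in #18062):

    import time
    from rq.exceptions import InvalidJobOperation

    def delete_job_with_retry(job, attempts=3, delay=0.1):
        """Treat a missing status hash as transient: retry the delete a few times before giving up."""
        for attempt in range(attempts):
            try:
                job.delete()
                return
            except InvalidJobOperation:
                if attempt == attempts - 1:
                    raise
                time.sleep(delay)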