tests/integration: data store mgr tests unstable #4175

Open
oliver-sanders opened this issue Apr 15, 2021 · 10 comments
@oliver-sanders (Member)

The data store mgr tests (or a subset thereof) sometimes fail and sometimes pass.

An example traceback:

    def test_update_data_structure(harness):
        """Test update_data_structure. This method will generate and
        apply adeltas/updates given."""
        schd, data = harness
        w_id = schd.data_store_mgr.workflow_id
        schd.data_store_mgr.data[w_id] = data
        assert TASK_STATUS_FAILED not in set(collect_states(data, TASK_PROXIES))
        assert TASK_STATUS_FAILED not in set(collect_states(data, FAMILY_PROXIES))
        assert TASK_STATUS_FAILED not in data[WORKFLOW].state_totals
>       assert len({t.is_held for t in data[TASK_PROXIES].values()}) == 2
E       assert 0 == 2
E         +0
E         -2

The test file:

tests/integration/test_data_store_mgr.py
@oliver-sanders oliver-sanders added this to the cylc-8.0.0 milestone Apr 15, 2021
@dwsutherland dwsutherland self-assigned this Apr 19, 2021
@dwsutherland (Member)

I've run it twenty times locally with no failure... I'll see if there's a more robust way to test the update.

@dwsutherland (Member) commented Apr 21, 2021

I'd have to understand how the test harness works at a deeper level to know why it's inconsistent..

i.e. why does this:

@pytest.mark.asyncio
@pytest.fixture(scope='module')
async def harness(mod_flow, mod_scheduler, mod_run):
    flow_def = {
        'scheduler': {
            'allow implicit tasks': True
        },
        'scheduling': {
            'graph': {
                'R1': 'foo => bar'
            }
        }
    }
    reg: str = mod_flow(flow_def)
    schd: 'Scheduler' = mod_scheduler(reg)
    async with mod_run(schd):
        schd.pool.hold_tasks('*')
        schd.resume_workflow()
        # Think this is needed to save the data state at first start (?)
        # Fails without it.. and a test needs to overwrite schd data with this.
        data = schd.data_store_mgr.data[schd.data_store_mgr.workflow_id]
        yield schd, data

Sometimes yield the scheduler with no task proxies? Did the tasks not exist when the hold command was issued?

Hard to tell unless I can reproduce the issue somehow.. (the test in question still worked when I commented out the hold)

@dwsutherland (Member) commented Apr 21, 2021

I think the same harness yield gets shared amongst the tests: whenever I change the order of the tests, it appears to affect what data the subsequent tests are working with.

So perhaps create another harness object (if possible), as in the sketch below, and see if it occurs again... Again, this would be easier if I could reproduce the failure.
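For example (just a sketch, assuming the function-scoped flow, scheduler and run counterparts of the module-scoped fixtures exist), each test could get its own harness instead of sharing one:

import pytest


@pytest.fixture
async def fresh_harness(flow, scheduler, run):
    """Function-scoped harness: a new scheduler and data snapshot per test."""
    schd = scheduler(flow({
        'scheduler': {'allow implicit tasks': True},
        'scheduling': {'graph': {'R1': 'foo => bar'}},
    }))
    async with run(schd):
        schd.pool.hold_tasks('*')
        schd.resume_workflow()
        data = schd.data_store_mgr.data[schd.data_store_mgr.workflow_id]
        yield schd, data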

@dwsutherland (Member) commented Apr 21, 2021

On a side note, this also works:

(flow) sutherlander@cortex-vbox:cylc-flow$ git diff
diff --git a/tests/integration/test_data_store_mgr.py b/tests/integration/test_data_store_mgr.py
index 2d6d584cc..533c30a60 100644
--- a/tests/integration/test_data_store_mgr.py
+++ b/tests/integration/test_data_store_mgr.py
@@ -259,8 +259,8 @@ def test_update_data_structure(harness):
     schd.data_store_mgr.data[w_id] = data
     assert TASK_STATUS_FAILED not in set(collect_states(data, TASK_PROXIES))
     assert TASK_STATUS_FAILED not in set(collect_states(data, FAMILY_PROXIES))
-    assert TASK_STATUS_FAILED not in data[WORKFLOW].state_totals
-    assert len({t.is_held for t in data[TASK_PROXIES].values()}) == 2
+    assert data[WORKFLOW].state_totals[TASK_STATUS_FAILED] == 0
+    assert sum(data[WORKFLOW].state_totals.values()) == 2
     for itask in schd.pool.get_all_tasks():
         itask.state.reset(TASK_STATUS_FAILED)
         schd.data_store_mgr.delta_task_state(itask)

So I don't think the key lookup is working...
I made this change because the manager was changed to reset the state totals for families being updated:

            # Use all states to clean up pruned counts
            for state in TASK_STATUSES_ORDERED:
                fp_delta.state_totals[state] = state_counter.get(state, 0)
            fp_updated.setdefault(fp_id, PbFamilyProxy()).MergeFrom(fp_delta)

@oliver-sanders (Member, Author)

Sometimes yield the scheduler with no task proxies?

Ah, perhaps we have to explicitly get the pool to release tasks, otherwise there is a race condition between the async with and the async def main_loop?
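A minimal sketch of that idea, flushing things explicitly before the fixture yields (release_runahead_tasks and update_data_structure are assumptions about the TaskPool/Scheduler API here, not verified calls):

import pytest


@pytest.fixture(scope='module')
async def harness(mod_flow, mod_scheduler, mod_run):
    schd = mod_scheduler(mod_flow({
        'scheduler': {'allow implicit tasks': True},
        'scheduling': {'graph': {'R1': 'foo => bar'}},
    }))
    async with mod_run(schd):
        # Hypothetical: make sure the task proxies exist before holding them,
        # rather than racing the main loop to create them.
        schd.pool.release_runahead_tasks()
        schd.pool.hold_tasks('*')
        schd.resume_workflow()
        # Hypothetical: flush the pending deltas into the data store now, so
        # the snapshot below is complete before the tests start.
        await schd.update_data_structure()
        data = schd.data_store_mgr.data[schd.data_store_mgr.workflow_id]
        yield schd, data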

@dwsutherland (Member)

I assume this is still an issue?

@oliver-sanders (Member, Author)

I still run into this one; it seems to happen more often with a higher level of parallelism in the tests. I still have no idea why it's happening.

@oliver-sanders (Member, Author)

I was playing with a small change to the integration tests which caused these tests to break badly, so I took a look.

I think the problem is:

  1. The main loop is running while the tests are running, which means the scheduler can do things in between tests (or not, depending on how asyncio schedules its tasks), so the data store may or may not be updated between tests. This would explain why I see these failures a lot whereas we don't hit them much on GHA (because I parallelise the hell out of the tests!).
  2. The tests all share the same workflow, but they also modify it and don't undo their changes, which means the tests can interact.

I've addressed (1) on my branch by adding a new fixture that lets us start the scheduler without running the main loop, to give these tests a fighting chance (sketch below).
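For illustration, a minimal sketch of what such a fixture might look like, assuming a mod_start counterpart to mod_run that starts the scheduler without launching the main loop (the name and exact behaviour are assumptions, not the code on my branch):

import pytest


@pytest.fixture(scope='module')
async def harness(mod_flow, mod_scheduler, mod_start):
    schd = mod_scheduler(mod_flow({
        'scheduling': {'graph': {'R1': 'foo => bar'}},
    }))
    # mod_start (assumed) initialises the scheduler and the data store but
    # never schedules the main loop coroutine, so nothing can mutate the
    # data store in between test functions.
    async with mod_start(schd):
        data = schd.data_store_mgr.data[schd.data_store_mgr.workflow_id]
        yield schd, data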

I've fixed the resulting failures bar one, in test_update_data_structure, which I don't understand:

# Shows pruning worked
assert len({t.is_held for t in data[TASK_PROXIES].values()}) == 1

The test resets a task state to failed, then expects the number of is_held tasks to change as a result? Commented out for now.

@oliver-sanders (Member, Author) commented Jan 31, 2022

Done (1) in #4620, which may have helped; however, I still get flaky failures ~50% of the time for:

  • test_delta_task_held
  • test_update_data_structure

With high parallelism.

@dwsutherland (Member)

Well, I'm not sure what to do about (2) other than have the tests operate on different workflows, as it's not something that would happen in the wild..

Also, I'm not sure the tests could "undo" their changes:
a) wouldn't parallelism still cause issues while undoing?
b) the cheapest way to undo would be to reload/reinitialize the data-store, assuming other tests aren't running at the time (see the sketch after this list).
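If (b) were the route taken, a hedged sketch might look like this; it leans on initiate_data_model, which I'm assuming can safely rebuild the store from the task pool once a test has finished:

import pytest


@pytest.fixture(autouse=True)
def reset_data_store(harness):
    """Rebuild the data store after each test so tests can't interact."""
    schd, _ = harness
    yield
    # Assumption: re-running the data model generation restores a clean
    # store (and no other test is touching it at this point).
    schd.data_store_mgr.initiate_data_model(reloaded=True)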
