Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove activity definition cache #118

Merged
merged 2 commits into from
Jul 25, 2023

Conversation

cretz
Copy link
Member

@cretz cretz commented Jul 25, 2023

What was changed

Removed the activity definition cache. This cache was only useful in the past, before we used lambda expressions, to be able to repeatedly get the name and types at the call site. But now with lambda expressions we just get the name and assume that a compiled lambda means arg/result types are accurate. Also, the cache wasn't changed when we started storing custom invokers so the instance from the first activity definition was inadvertently reused.

Checklist

  1. Closes [Bug] Activity definition cache reuses instance #117

@cretz cretz requested a review from a team July 25, 2023 11:51
Copy link
Contributor

@dandavison dandavison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of non-blocking questions/comments but LGTM! (I believe you confirmed in Slack that this test failed with the cache in place.)

},
new TemporalWorkerOptions().AddAllActivities(new DuplicateActivities("instance1")));
}

Copy link
Contributor

@dandavison dandavison Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explaining this test to myself:

  1. We define a workflow that returns a result containing the results of two activity calls.

  2. The two activities are calls to the same activity method but on distinct instances (and using distinct task queues)

  3. The utility ExecuteWorkerAsync allows us to start a worker and, while it's running, execute an arbitrary function that has access to the worker instance. We use it twice:

  4. The first usage starts the first worker and uses the function to make the second nested call to ExecuteWorkerAsync

  5. The second usage starts a second worker and uses the function to execute our workflow and check that the result shows that distinct activity methods were invoked.

Copy link
Contributor

@dandavison dandavison Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're using the lambda passed to ExecuteWorkerAsync to essentially play the role of customer application (non-Worker) code, since it starts a workflow. Therefore it's a bit artificial that the function has access to an actual instance of the worker (this confused me initially). But it looks like the pass-a-function-that-receives-a-worker-instance is a deeper part of the non-test-code SDK design.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bingo. Before I removed the cache, this test would fail because the result of the workflow would be ["instance1","instance1"] because no matter which activity instance you gave to a worker, it used the first cached one based on method which was bad. Now it returns ["instance1","instance2"] as expected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're using the lambda passed to ExecuteWorkerAsync to essentially play the role of customer application (non-Worker) code, since it starts a workflow.

Yes

Therefore it's a bit artificial that the function has access to an actual instance of the worker (this confused me initially).

Mostly. It's just so I can get it to create that random task queue which I can use

But it looks like the pass-a-function-that-receives-a-worker-instance is a deeper part of the non-test-code SDK design.

Basically you have a worker that can run for some amount of time. In non-test code we support a worker running forever, until cancellation token is triggered, and/or until the completion of an async function. In test code, I made ExecuteWorkerAsync which is just a shortcut for the latter that also adds a workflow and makes a random task queue and does a couple of other things and gives the worker as the parameter just in case the test function wants to do something with it (most just use the task queue).

}),
};
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this test use two workers with different task queues?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only needs the activities on different workers not really the workflow (but I just reuse the execute worker helper). We use two workers to confirm that the same activity definition but with different instances for different workers returns different values.

So basically this workflow calls the same activity on different task queues and the test confirms that the activity called was the different instances on different task queues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels to me that the assertion being made is about what happens when we have distinct activity method instances and so it's not obvious that testing it requires multiple workers and task queues.

Copy link
Member Author

@cretz cretz Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is specifically "what happens when a workflow calls the same activity method but on different task queues where the different activity instances may have different state". This requires multiple task queues (task queues are 1:1 with workers, so it requires multiple workers).

},
new TemporalWorkerOptions().AddAllActivities(new DuplicateActivities("instance2")));
},
new TemporalWorkerOptions().AddAllActivities(new DuplicateActivities("instance1")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this test is written in a point-free style makes it a bit hard to read because the test assertion makes reference to "instance1" and "instance2" before the reader has encountered them.

Copy link
Member Author

@cretz cretz Jul 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the consequence of lambdas. Parameters to methods accepting lambdas are invoked and often used before the lambda is invoked. We make the parameters after the lambda because they are optional and many don't use them. This is test code so it is an acceptable tradeoff IMO.

@cretz cretz merged commit 0cded9b into temporalio:main Jul 25, 2023
@cretz cretz deleted the activity-definition-cache-fix branch July 25, 2023 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug] Activity definition cache reuses instance
2 participants