
User type centric dispatcher #2568

Closed · wants to merge 11 commits

Conversation

mgor (Contributor) commented Jan 29, 2024

This adds a new dispatcher that, where applicable, is compatible with the "old" UserDispatcher when fixed_count is used. The old dispatcher has been renamed WeightedUserDispatcher and is still used as the default.

The main difference with the new FixedUsersDispatcher is that the target user count is a dict with the number of users of each type, rather than a total target user count.

Another feature is the ability to assign user types a sticky tag, which is used to pin certain user types to a set of workers when running distributed.

The tests in test_dispatch.py have been modified to test both implementations, unless the test case was explicitly testing weights.

For now, there's no way to select which dispatcher to use other than programmatically.

We have been running with the FixedUsersDispatcher in our internal project for almost a month without any major issues.
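
A rough sketch, based only on the description above, of how this might look in a locustfile (the sticky_tag attribute and the dict-based target come from this PR; the exact attribute names and the dispatcher constructor shown in the comments are assumptions, not the final API):

from locust import HttpUser, task

class ApiUser(HttpUser):
    fixed_count = 10       # per-user-type count instead of a share of a single total
    sticky_tag = "api"     # pin this user type to a subset of workers (per the description above)

    @task
    def get_root(self):
        self.client.get("/")

class ReportUser(HttpUser):
    fixed_count = 2
    sticky_tag = "reports"

    @task
    def get_report(self):
        self.client.get("/report")

# Programmatic selection is currently the only way to pick the dispatcher; the target
# is a dict of counts per user class rather than a single total, e.g. (assumed API):
#   dispatcher = FixedUsersDispatcher(worker_nodes=workers, user_classes=[ApiUser, ReportUser])
#   dispatcher.new_dispatch(target_user_count={"ApiUser": 10, "ReportUser": 2}, spawn_rate=5)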

mgor added 11 commits December 29, 2023 08:21
more compatibility with (Weighted)UsersDispatcher, as proven by the existing unit tests.
mostly comments / doc-strings. more test coverage.
also support [target_]_user_count as either an int or dict.

fixed typing errors in fasthttp
# Conflicts:
#	locust/contrib/fasthttp.py
#	locust/dispatch.py
#	locust/env.py
#	locust/runners.py
#	locust/shape.py
#	locust/test/test_dispatch.py
#	locust/user/users.py
fix ruff and mypy errors after merge
formatted files with ruff and fixed additional mypy errors
based on number of users per sticky tag, at least 1 worker per sticky tag.
cyberw (Collaborator) commented Jan 31, 2024

I'll get back to you, I don't have time to review right now, but I haven't forgotten :)

-self._state = None
-self._greenlet: greenlet.Greenlet = None
+self._state: str | None = None
+self._greenlet: greenlet.Greenlet
Collaborator:

Are you sure this is a safe change? I haven't checked, but usually things are there for a reason...

@@ -1206,7 +1230,7 @@ def on_report_to_master(client_id: str, data: dict[str, Any]):
self.environment.events.report_to_master.add_listener(on_report_to_master)

# register listener that sends quit message to master
-def on_quitting(environment: Environment, **kw) -> None:
+def on_quitting(environment: Environment, **kw: Any) -> None:
Collaborator:

Is the Any needed? Type hinting stuff that the method never uses is kinda meaningless..

worker_nodes=list(self.clients.values()), user_classes=self.user_classes
)

_user_count = self._users_dispatcher.get_target_user_count() if isinstance(user_count, dict) else user_count
Collaborator:

can you think of a better name than _user_count? This code is very hard to read.
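
One way to make it read better might be to normalize the value into an explicitly named total first (a sketch; total_target_user_count is just an illustrative name, not from the PR):

total_target_user_count = (
    self._users_dispatcher.get_target_user_count()
    if isinstance(user_count, dict)
    else user_count
)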

@@ -84,7 +84,7 @@ class StatsErrorDict(StatsBaseDict):

class StatsHolder(Protocol):
name: str
-method: str
+method: str | None
Collaborator:

Why did this become optional?

Contributor Author:

The total/aggregated statistics entry is created with method set to None:

self.total = StatsEntry(self, "Aggregated", None, use_response_times_cache=self.use_response_times_cache)

It was needed to solve a type issue for some of the sorting logic.

However, I see now that it should probably be changed in StatsEntry as well:

def __init__(self, stats: RequestStats | None, name: str, method: str, use_response_times_cache: bool = False):
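
Presumably the corresponding StatsEntry change would then look something like this (a sketch, not part of the current diff):

def __init__(self, stats: RequestStats | None, name: str, method: str | None, use_response_times_cache: bool = False):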

@@ -481,7 +493,9 @@ def _start(self, user_count: int, spawn_rate: float, wait: bool = False, user_cl
self.environment._filter_tasks_by_tags()
self.environment.events.test_start.fire(environment=self.environment)

if wait and user_count - self.user_count > spawn_rate:
_user_count = sum(user_count.values()) if isinstance(user_count, dict) else user_count
Collaborator:

Same issue here with the variable naming and readability.

@@ -589,7 +594,8 @@ def start_automatic_run():
elif not options.worker and not environment.shape_class:
logger.info("No run time limit set, use CTRL+C to interrupt")

if options.csv_prefix:
# `stats_csv_writer` is initialized with a StatsCSVFileWriter object if options.csv_prefix is "True"
Collaborator:

type hinting this makes it clear how confusing it is. maybe add a comment that it should be fixed some time?

@@ -356,7 +360,7 @@ def kill_workers(children):
locustfile_path = None if not locustfile else os.path.basename(locustfile)

environment = create_environment(
-user_classes,
+selected_user_classes,
Collaborator:

why the rename?

user_class.fixed_count,
)


+@parameterized_class(PARAMETER_DISPATCHERS)
class TestRampUpDifferentUsers(unittest.TestCase):
    def test_ramp_up_different_users_for_each_dispatch(self):
Collaborator:

I think it makes more sense to duplicate this test rather than parameterizing it. Too many if statements for it to make sense as parameterized.
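
For context, a minimal sketch of how parameterized_class works (a generic example, not the PR's actual test; PARAMETER_DISPATCHERS in the diff above presumably follows the same pattern):

import unittest
from parameterized import parameterized_class

# Each dict becomes a set of class attributes, so the same test class is run once
# per parameter set, i.e. once per dispatcher implementation in the PR.
@parameterized_class([
    {"impl_name": "weighted"},
    {"impl_name": "fixed"},
])
class TestDispatcherBehaviour(unittest.TestCase):
    impl_name: str

    def test_something(self):
        # The objection above: the shared test body ends up full of
        # "if self.impl_name == ..." branches, which duplicating the test would avoid.
        self.assertIn(self.impl_name, ("weighted", "fixed"))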


# user count = 1
-users_dispatcher.new_dispatch(target_user_count=1, spawn_rate=1)
+users_dispatcher.new_dispatch(target_user_count=target_user_count(1), spawn_rate=1)
Collaborator:

Uh. What? target_user_count(1) instead of just 1? I don't understand it and I'm pretty sure if I did make the effort to understand it I wouldn't like it :)
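
Presumably the helper adapts a plain integer to whatever form the dispatcher under test expects; a sketch of what such a helper could look like (this is a guess, not code from the PR):

from __future__ import annotations

def make_target_user_count(use_fixed_dispatcher: bool, user_class_names: list[str]):
    # For the weighted dispatcher the tests keep passing a plain total; for the
    # fixed dispatcher the same total has to be expressed per user class instead.
    def target_user_count(total: int) -> int | dict[str, int]:
        if not use_fixed_dispatcher:
            return total
        base, remainder = divmod(total, len(user_class_names))
        return {
            name: base + (1 if i < remainder else 0)
            for i, name in enumerate(user_class_names)
        }
    return target_user_count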

cyberw (Collaborator) commented Feb 3, 2024

Hmm.... I'd like to see an example locustfile using this as well as a mention of this in the documentation. And could you separate the type checks that you have added into a separate PR? (I like them, but not as part of a functional change)

Would this work together with the newly added --config-users/weight setting in the UI? (#2559)

mgor (Contributor Author) commented Feb 3, 2024

I'll be gone on vacation for a week; I'll get back to your comments after that 🙂

cyberw (Collaborator) commented Feb 3, 2024

Hmm. I only now got around to reviewing the actual dispatch code. Can you do something about the diff? It is kinda unreadable as is. When I think about it - does this dispatcher actually change anything if all users have fixed_count instead of weights? (because that is what I meant could be an improvement, in the discussion we had before)

I think it is unlikely that we'll be able to merge this. It's just too much code for a niche case...

mgor (Contributor Author) commented Feb 12, 2024

Before spending time on resolving the comments, I think we should resolve this first :)

When I think about it - does this dispatcher actually change anything if all users have fixed_count instead of weights? (because that is what I meant could be an improvement, in the discussion we had before)

Dispatch-logic-wise, without using the new sticky_tag there's not really a big difference. Usage-wise, the big difference is that you don't specify a total number of users to spawn; rather, you specify how many of each user type you want to spawn, which is what I thought you meant in #2522:

Not based on specifying weights and a total user count but on per-User type-numbers (like most other load test tools do it).
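
To make that usage difference concrete (a sketch; the weighted side is standard Locust, while the dict-based target reflects this PR's FixedUsersDispatcher):

from locust import HttpUser, task

class Browser(HttpUser):
    weight = 3   # weighted model: 3 of every 4 spawned users, given a separate total

    @task
    def browse(self):
        self.client.get("/")

class Purchaser(HttpUser):
    weight = 1

    @task
    def buy(self):
        self.client.post("/buy")

# Weighted dispatch: "--users 100" plus the 3:1 weights decide the split (75/25).
# Per-user-type dispatch (this PR): the target is a dict such as
#   {"Browser": 75, "Purchaser": 25}
# so each type's count is stated directly and no separate total is needed.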

cyberw (Collaborator) commented Feb 12, 2024

Tbh, I don't have time to review this again, sorry. The diff is just too big.

I did have a short look and noticed that it solves things in a way I wouldn't want anyway: Tagging workers at startup (due to different types of workers etc.) might make sense, but having the master just assign tags arbitrarily is an even more rare use case.

Perhaps your underlying need could be solved by something else, like passing custom messages?

Not based on specifying weights and a total user count but on per-User type-numbers (like most other load test tools do it).

Now I'm thinking this is probably solved best by just using fixed_count on Users and maybe adding a special value for --users that spawns the total fixed count automatically, so it doesn't need a special dispatcher. This was all I really wanted (except for the fact that the current dispatcher is a little slow with really large numbers of users, user types and worker count at the same time).

I'm sorry if I wasted your time.

mgor (Contributor Author) commented Feb 12, 2024

Hm, ok.

The reason for the large diff is that I tried to keep duplicated logic to a minimum, but sure, I could probably rearrange it a bit.

Not sure what you mean by this:

but having the master just assign tags arbitrarily is an even more rare use case.

We'll see how we can solve it then... since we need a way to separate/isolate user types.

cyberw (Collaborator) commented Feb 12, 2024

We'll see how we can solve it then... since we need a way to separate/isolate user types.

A somewhat ugly way of working around that could be by checking something like self.runner.worker_id in the task and having the workers behave differently (possibly just sleeping) depending on which worker we are on.
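
A minimal sketch of that workaround, assuming the worker identity is available as the runner's client_id ("self.runner.worker_id" above is paraphrased, and the matching criterion below is purely illustrative):

import gevent
from locust import HttpUser, task
from locust.runners import WorkerRunner

class IsolatedUser(HttpUser):
    @task
    def maybe_do_work(self):
        runner = self.environment.runner
        # On workers that should not run this user type, idle instead of generating
        # load; on the "right" workers, do the real work.
        if isinstance(runner, WorkerRunner) and "group-a" not in runner.client_id:
            gevent.sleep(10)
            return
        self.client.get("/")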

mgor (Contributor Author) commented Feb 13, 2024

I'd rather not :)

Would you accept a PR that makes the dispatcher logic more pluggable? E.g. specifying a dispatcher class in Environment?
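
Roughly along these lines, presumably (a sketch; a dispatcher_class hook on Environment is hypothetical at this point in the thread, and the class name below is made up for illustration):

from locust.dispatch import UsersDispatcher

# Hypothetical hook: let the caller pick the dispatcher implementation, defaulting to
# the built-in weighted one, so custom dispatchers can live outside locust core.
class EnvironmentWithPluggableDispatcher:
    def __init__(self, dispatcher_class=UsersDispatcher):
        self.dispatcher_class = dispatcher_class

    def create_dispatcher(self, worker_nodes, user_classes):
        # Same call shape as the existing dispatcher construction shown in the diff above.
        return self.dispatcher_class(worker_nodes=worker_nodes, user_classes=user_classes)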

cyberw (Collaborator) commented Feb 13, 2024

Sure, having it pluggable makes a lot of sense, and that way I don't have to support it :)

If you want, you could then contribute the implementation to locust-plugins (which is more of an "at your own risk" place :)

@mgor mgor mentioned this pull request Feb 16, 2024
mgor (Contributor Author) commented Feb 16, 2024

Superseded by #2606.

@mgor mgor closed this Feb 16, 2024