
Fixed seed, non-random chance #127

Closed
lawik opened this issue Jan 28, 2014 · 12 comments

Comments

lawik commented Jan 28, 2014

I couldn't find any issues or documentation on this, and from a cursory browse of the code I can't see any use of random.seed, which would give us the ability to reproduce tests.

Maybe there is something else making this impractical, but it would be a fantastic help in load testing, where you want to reproduce nearly the exact same test.

Is this something that has been considered, or should I look at making a patch when I've got my next test suite built?

Jahaja commented Jan 29, 2014

Hi, that sounds like a good idea indeed. We haven't really considered it apart from creating deterministic tests. I'd appreciate a patch :)

lawik commented Jan 29, 2014

I hope I'll get into it. I've been busy writing a custom request type and getting that rolling. It was fairly straightforward, so thanks for that :)

I don't think exactly identical load tests are actually possible, considering multi-threading, concurrency, and the practically random events going on in any given server. It would, however, be possible to get at least the same sequence and rough distribution of requests if all threads run from a fixed seed.

I'll report back if I go into it.

heyman commented Jan 29, 2014

Hmm, wouldn't this be solved simply by putting a random.seed() call at the top of your test file, or perhaps in the on_start function of your base TaskSet? I'm not sure how Python's random.seed and other random functions behave in a gevent/greenlet environment...
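
That suggestion amounts to something like the following (TaskSet and on_start are Locust's names; a plain class stands in here so the sketch runs on its own):

```python
import random

class BaseTaskSet:
    # Stand-in for locust.TaskSet; in a real locustfile you would subclass
    # TaskSet, and Locust would call on_start() when each user spawns.
    def on_start(self):
        # Fix the global PRNG so every run draws the same number stream.
        random.seed(42)

ts = BaseTaskSet()
ts.on_start()
first_run = [random.randint(0, 100) for _ in range(5)]

ts.on_start()  # re-seeding rewinds the stream
second_run = [random.randint(0, 100) for _ in range(5)]
print(first_run == second_run)  # True: same seed, same sequence
```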

Jahaja commented Jan 29, 2014

Not sure that would work properly across the greenlets and still be random anyway.

I think the cleanest approach would be to give each locust an instance of Random, with perhaps a user ID as the seed, and then use that instance for all subsequent calls to random functions. This would ensure that each locust chooses the same tasks each time. The sequence of tasks would also differ between locusts, since the seed (the user ID) would be different for each Random instance. Not sure if this could create a performance penalty, though.

>>> from random import Random
>>> r1 = Random(100)
>>> r2 = Random(100)
>>> r1.randint(0, 100)
14
>>> r1.randint(0, 100)
45
>>> r2.randint(0, 100)
14
>>> r2.randint(0, 100)
45
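
A minimal sketch of that per-locust idea (a plain class stands in for locust.TaskSet, and the user_id seed is an assumption, since Locust didn't expose a user ID at the time):

```python
from random import Random

class UserTasks:
    # Plain stand-in for a Locust TaskSet; user_id is hypothetical.
    def __init__(self, user_id):
        # Private PRNG per simulated user, seeded with its ID, so each
        # user replays the same task choices on every run.
        self.rng = Random(user_id)

    def next_task_index(self, n_tasks):
        return self.rng.randint(0, n_tasks - 1)

u1 = UserTasks(user_id=7)
u2 = UserTasks(user_id=7)  # same seed: a rerun of the same user

seq1 = [u1.next_task_index(4) for _ in range(10)]
seq2 = [u2.next_task_index(4) for _ in range(10)]
print(seq1 == seq2)  # True: independent instances, identical streams
```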

heyman commented Jan 29, 2014

I ran a test, and random.seed() will make all subsequent calls to random/randint deterministic, whether or not they are made within greenlets. However, since the execution order of greenlets depends on I/O (a greenlet that calls socket.recv() won't be scheduled again until a response has been returned or a timeout occurs), it's not possible to make Locust's task execution deterministic this way (contrary to what I suggested in my previous comment).

In fact, because greenlet scheduling is highly dependent on I/O, I think it would be very hard to make Locust truly deterministic. The closest one could get is probably to make the task execution order deterministic, but the order of outgoing requests to the system under load could still vary.
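
The effect can be sketched without gevent by making the scheduling order explicit (the schedule list below is a stand-in for gevent's I/O-driven scheduling, which is exactly the part we cannot control):

```python
import random

def simulate(schedule):
    # Two simulated greenlets, "a" and "b", share the seeded global PRNG.
    # `schedule` fixes who draws next; under gevent that order is decided
    # by I/O readiness.
    random.seed(1)
    flat, per = [], {"a": [], "b": []}
    for g in schedule:
        v = random.randint(0, 100)
        flat.append(v)
        per[g].append(v)
    return flat, per

flat1, per1 = simulate(["a", "b", "a", "b"])
flat2, per2 = simulate(["a", "a", "b", "b"])  # same seed, different schedule

# The shared stream itself is reproducible...
print(flat1 == flat2)  # True
# ...but which greenlet receives which value depends on the schedule,
# so per-greenlet sequences are generally not reproducible.
```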

Jahaja commented Jan 29, 2014

Yes, that's my point. Hence the suggestion to create an instance per locust.

heyman commented Jan 31, 2014

Yep, got that, just wanted to retract my previous post, and add some more info :).

However, since the generated load won't be deterministic anyway (for most tests), I'm not sure how useful such a feature would be?

Jahaja commented Jan 31, 2014

Ah, fair enough :)

In what way do you mean it wouldn't be deterministic?

heyman commented Feb 1, 2014

If you had two Locust users running the following two tasks:

from locust import TaskSet, task

class MyTaskSet(TaskSet):
    min_wait = 1000
    max_wait = 1000

    @task
    def t1(self):
        self.client.get("/task1/request1")
        self.client.get("/task1/request2")

    @task
    def t2(self):
        self.client.get("/task2/request1")
        self.client.get("/task2/request2")

Even if we ensured that the execution order of these tasks was the same for both locust users across two separate test sessions, the outgoing requests could still differ.

For example, if user1 executed t1 first and then t2, while user2 did the opposite, the outgoing request order could be either of these:

/task1/request1
/task2/request1
/task1/request2
/task2/request2

or

/task1/request1
/task2/request1
/task2/request2
/task1/request2

Even if we made the task execution order deterministic, we would still have network I/O, which wouldn't be deterministic. So when running more than one user in parallel, the order of outgoing requests to the target system would not be deterministic.

Jahaja commented Feb 1, 2014

Yeah, making it completely deterministic would not be realistic as we are using gevent. But I think making the task sequence fixed would be sufficiently deterministic.

amatai commented Feb 13, 2015

Is this "coming soon"? We're looking to make our locust run more deterministic (in the sense of making it finish after it's gone through all the API endpoints at least once). Or is there an alternative suggestion?

justiniso commented

This is definitely a really interesting idea; something that never occurred to me but now seems extremely obvious. I'm a little suspicious that the perceived determinism is higher than what it would be in practice. I'd definitely like to see an actual difference in behavior across an app (I'm not sure anyone has anything simple and realistic they can demonstrate with). Cache-heavy systems come to mind.

That said, I think the best path forward (for now) is the more flexible option of exposing the slave IDs to the locustfile, e.g. #283, which will let you generate your own random seed per locust.
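
A sketch of what that could look like; everything concrete here is an assumption for illustration (the environment variable name and the per-locust counter are made up, and only #283's general idea of an externally supplied ID is taken from the thread):

```python
import hashlib
import os
from random import Random

# Illustrative only: assume some external identifier becomes available
# (e.g. a slave ID once #283 lands). The env var name is made up here.
slave_id = os.environ.get("LOCUST_SLAVE_ID", "slave-0")
locust_index = 3  # hypothetical per-locust counter within one slave

# Derive a stable seed from (slave, locust). hashlib is used instead of
# hash(), whose string hashing is randomized per Python process.
key = f"{slave_id}:{locust_index}".encode()
seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

rng = Random(seed)
replay = Random(seed)  # a second run with the same identifiers
draws_a = [rng.randint(0, 100) for _ in range(3)]
draws_b = [replay.randint(0, 100) for _ in range(3)]
print(draws_a == draws_b)  # True: reruns reproduce the same stream
```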
