variable-looping-vus bugs #1296

Closed · na-- opened this issue Jan 7, 2020 · 10 comments
Assignee: imiric
Labels: bug, evaluation needed, executors, high prio
Milestone: v0.27.0
na-- (Member) commented Jan 7, 2020

Noticed a few bugs when manually testing the variable-looping-vus (i.e. stages) executor. The following script can demonstrate both issues:

```js
import { sleep } from "k6";

export default function () {
    sleep(0.5 + Math.random() * 0.5);
}

export let options = {
    //executionSegment: "2/4:3/4",
    execution: {
        stages_up_down: {
            type: "variable-looping-vus",
            startVUs: 0,
            stages: [
                { duration: "10s", target: 20 },
                { duration: "10s", target: 0 },
            ],
            gracefulRampDown: "5s",
            gracefulStop: "0s",
        },
    },
};
```

Here's a test run of that script, first as it is, and then after uncommenting the executionSegment option: asciicast

Notice how at t=23s (in the cast, ~16s in the first script execution), we start seeing Start events for VUs that should have been stopped 😕 We're ramping down at that point, but it seems as if there's a zero-duration sudden jump in VUs... This then repeats a few times until the duration is done 😕

The second bug I found happened when I enabled the commented-out execution segment option. Notice the Graceful stop event at t=43s (in the cast, ~2s into the second script execution), when we should be ramping up VUs, not stopping them 😕 This then repeats a few times; notice how the value of X in the currently X active looping VUs message wobbles, when it should consistently and gradually increment from 0 to 5 in the first 10 seconds and then gracefully decrement...

na-- added the bug, evaluation needed, executors and high prio labels Jan 7, 2020
imiric self-assigned this Feb 11, 2020
This was referenced Feb 11, 2020
imiric (Contributor) commented Feb 14, 2020

As mentioned in #1331, the second bug related to the "wobbling" of VUs is caused by VariableLoopingVUsConfig.getRawExecutionSteps when given a particular segment like 2/4:3/4. For some reason it generates execution plans like:

```
[]github.com/loadimpact/k6/lib.ExecutionStep len: 32, cap: 32, [
        {TimeOffset: 0, PlannedVUs: 0, MaxUnplannedVUs: 0},
        {TimeOffset: 2000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 3000000000, PlannedVUs: 0, MaxUnplannedVUs: 0},
        {TimeOffset: 4000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 6000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 7000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 8000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 10000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 11000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 12000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 14000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 15000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 16000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 18000000000, PlannedVUs: 5, MaxUnplannedVUs: 0},
        {TimeOffset: 19000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 20000000000, PlannedVUs: 5, MaxUnplannedVUs: 0},
        {TimeOffset: 21000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 22000000000, PlannedVUs: 5, MaxUnplannedVUs: 0},
        {TimeOffset: 23000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 25000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 26000000000, PlannedVUs: 4, MaxUnplannedVUs: 0},
        {TimeOffset: 27000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 29000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 30000000000, PlannedVUs: 3, MaxUnplannedVUs: 0},
        {TimeOffset: 31000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 33000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 34000000000, PlannedVUs: 2, MaxUnplannedVUs: 0},
        {TimeOffset: 35000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 37000000000, PlannedVUs: 0, MaxUnplannedVUs: 0},
        {TimeOffset: 38000000000, PlannedVUs: 1, MaxUnplannedVUs: 0},
        {TimeOffset: 39000000000, PlannedVUs: 0, MaxUnplannedVUs: 0},
        {TimeOffset: 40000000000, PlannedVUs: 0, MaxUnplannedVUs: 0},
]
```

So we should either disregard the "wobbliness" during execution, or preferably ensure such execution plans are not returned. I'm partial to the latter solution, so will look into this next.

na-- (Member, Author) commented Feb 14, 2020

Ah, this may actually be very problematic... 😕 I really hope the solution to this doesn't turn out to be the striping that @mstoykov is currently working on for the variable-arrival-rate executor... 😞

na-- (Member, Author) commented Feb 17, 2020

So, yeah, unfortunately this wobbling seems to be an inherent property of the "more accurate algorithm" I proposed in the second part of #997 (comment). That algorithm does give us a slightly more accurate distribution for small numbers, but it's very inconsistent. See what happens when I incrementally try to distribute 1, 2, 3 and 4 VUs over 4 instances:

1 VU:

| InstID | segment | A = to * value | B = round(from * value) | A - B | round(A - B) |
|--------|---------|----------------|-------------------------|-------|--------------|
| 1 | (0 - ¼] | ¼ * 1 = ¼ | 0 | ¼ | 0 |
| 2 | (¼ - ½] | ½ * 1 = ½ | round(¼ * 1) = 0 | ½ | 1 |
| 3 | (½ - ¾] | ¾ * 1 = ¾ | round(½ * 1) = 1 | -¼ | 0 |
| 4 | (¾ - 1] | 1 * 1 = 1 | round(¾ * 1) = 1 | 0 | 0 |

2 VUs:

| InstID | segment | A = to * value | B = round(from * value) | A - B | round(A - B) |
|--------|---------|----------------|-------------------------|-------|--------------|
| 1 | (0 - ¼] | ¼ * 2 = ½ | 0 | ½ | 1 |
| 2 | (¼ - ½] | ½ * 2 = 1 | round(¼ * 2) = 1 | 0 | 0 |
| 3 | (½ - ¾] | ¾ * 2 = 1½ | round(½ * 2) = 1 | ½ | 1 |
| 4 | (¾ - 1] | 1 * 2 = 2 | round(¾ * 2) = 2 | 0 | 0 |

3 VUs:

| InstID | segment | A = to * value | B = round(from * value) | A - B | round(A - B) |
|--------|---------|----------------|-------------------------|-------|--------------|
| 1 | (0 - ¼] | ¼ * 3 = ¾ | 0 | ¾ | 1 |
| 2 | (¼ - ½] | ½ * 3 = 1½ | round(¼ * 3) = 1 | ½ | 1 |
| 3 | (½ - ¾] | ¾ * 3 = 2¼ | round(½ * 3) = 2 | ¼ | 0 |
| 4 | (¾ - 1] | 1 * 3 = 3 | round(¾ * 3) = 2 | 1 | 1 |

4 VUs:

| InstID | segment | A = to * value | B = round(from * value) | A - B | round(A - B) |
|--------|---------|----------------|-------------------------|-------|--------------|
| 1 | (0 - ¼] | ¼ * 4 = 1 | 0 | 1 | 1 |
| 2 | (¼ - ½] | ½ * 4 = 2 | round(¼ * 4) = 1 | 1 | 1 |
| 3 | (½ - ¾] | ¾ * 4 = 3 | round(½ * 4) = 2 | 1 | 1 |
| 4 | (¾ - 1] | 1 * 4 = 4 | round(¾ * 4) = 3 | 1 | 1 |

See how instance 2 is allotted the single VU in the beginning, but "loses" it when there are 2 VUs (which go to instances 1 and 3). Then instance 3 "loses" that VU when there are 3 VUs in total...
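
To make the comparison easier to reproduce, here's a minimal Go sketch (not the actual k6 Scale() code) of the rule the tables above follow: each instance is given round(to * value - round(from * value)) VUs, with round() meaning "round half up". The quarter boundaries and the 1-4 VU range come from the example; everything else is illustrative:

```go
package main

import (
	"fmt"
	"math/big"
)

// floorRat returns floor(r). big.Rat denominators are always positive, so
// Euclidean DivMod on numerator and denominator yields the mathematical floor.
func floorRat(r *big.Rat) int64 {
	q, m := new(big.Int), new(big.Int)
	q.DivMod(r.Num(), r.Denom(), m)
	return q.Int64()
}

// roundHalfUp returns round(r) with halves rounded up, i.e. floor(r + ½),
// which is the rounding the tables above appear to use.
func roundHalfUp(r *big.Rat) int64 {
	return floorRat(new(big.Rat).Add(r, big.NewRat(1, 2)))
}

func main() {
	for value := int64(1); value <= 4; value++ {
		fmt.Printf("%d VU(s):", value)
		for inst := int64(1); inst <= 4; inst++ {
			from := big.NewRat(inst-1, 4) // segment start, e.g. ¼ for instance 2
			to := big.NewRat(inst, 4)     // segment end, e.g. ½ for instance 2
			v := big.NewRat(value, 1)
			a := new(big.Rat).Mul(to, v)                // A = to * value
			b := roundHalfUp(new(big.Rat).Mul(from, v)) // B = round(from * value)
			got := roundHalfUp(new(big.Rat).Sub(a, big.NewRat(b, 1)))
			fmt.Printf("  inst%d=%d", inst, got)
		}
		fmt.Println()
	}
	// Prints the same wobble as the tables: instance 2 gets the single VU at
	// value=1, but drops to 0 at value=2 while instances 1 and 3 get one each.
}
```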

The original algorithm (#997 (comment)) seems much more stable in this scenario, but it needs to be thoroughly tested, so we can see if that was just a fluke or if it's actually a property of the algorithm:

1 VU:

| InstID | segment | rem(prevSegment) | segmentLength * VU | sum | floor(sum) |
|--------|---------|------------------|--------------------|-----|------------|
| 1 | (0 - ¼] | 0 | ¼ * 1 = ¼ | ¼ | 0 |
| 2 | (¼ - ½] | rem(¼ * 1) = ¼ | ¼ * 1 = ¼ | ½ | 0 |
| 3 | (½ - ¾] | rem(½ * 1) = ½ | ¼ * 1 = ¼ | ¾ | 0 |
| 4 | (¾ - 1] | rem(¾ * 1) = ¾ | ¼ * 1 = ¼ | 1 | 1 |

2 VUs:

| InstID | segment | rem(prevSegment) | segmentLength * VU | sum | floor(sum) |
|--------|---------|------------------|--------------------|-----|------------|
| 1 | (0 - ¼] | 0 | ¼ * 2 = ½ | ½ | 0 |
| 2 | (¼ - ½] | rem(¼ * 2) = ½ | ¼ * 2 = ½ | 1 | 1 |
| 3 | (½ - ¾] | rem(½ * 2) = 0 | ¼ * 2 = ½ | ½ | 0 |
| 4 | (¾ - 1] | rem(¾ * 2) = ½ | ¼ * 2 = ½ | 1 | 1 |

3 VUs:

| InstID | segment | rem(prevSegment) | segmentLength * VU | sum | floor(sum) |
|--------|---------|------------------|--------------------|-----|------------|
| 1 | (0 - ¼] | 0 | ¼ * 3 = ¾ | ¾ | 0 |
| 2 | (¼ - ½] | rem(¼ * 3) = ¾ | ¼ * 3 = ¾ | 1½ | 1 |
| 3 | (½ - ¾] | rem(½ * 3) = ½ | ¼ * 3 = ¾ | 1¼ | 1 |
| 4 | (¾ - 1] | rem(¾ * 3) = ¼ | ¼ * 3 = ¾ | 1 | 1 |

4 VUs:

| InstID | segment | rem(prevSegment) | segmentLength * VU | sum | floor(sum) |
|--------|---------|------------------|--------------------|-----|------------|
| 1 | (0 - ¼] | 0 | ¼ * 4 = 1 | 1 | 1 |
| 2 | (¼ - ½] | rem(¼ * 4) = 0 | ¼ * 4 = 1 | 1 | 1 |
| 3 | (½ - ¾] | rem(½ * 4) = 0 | ¼ * 4 = 1 | 1 | 1 |
| 4 | (¾ - 1] | rem(¾ * 4) = 0 | ¼ * 4 = 1 | 1 | 1 |

5 VUs:

| InstID | segment | rem(prevSegment) | segmentLength * VU | sum | floor(sum) |
|--------|---------|------------------|--------------------|-----|------------|
| 1 | (0 - ¼] | 0 | ¼ * 5 = 1¼ | 1¼ | 1 |
| 2 | (¼ - ½] | rem(¼ * 5) = ¼ | ¼ * 5 = 1¼ | 1½ | 1 |
| 3 | (½ - ¾] | rem(½ * 5) = ½ | ¼ * 5 = 1¼ | 1¾ | 1 |
| 4 | (¾ - 1] | rem(¾ * 5) = ¾ | ¼ * 5 = 1¼ | 2 | 2 |

See how, once an instance has been allotted a VU, it's not "lost" when there are more VUs. If that proves to be universal, we probably should switch to the old algorithm. If not, we should handle the VU increases with the same optimal striping algorithm I proposed for the variable-arrival-rate iterations.
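
For comparison, here's a similar hedged Go sketch of the remainder-based rule from these tables (again, not the exact #997 code): each segment gets floor(frac(from * value) + segmentLength * value), which works out to floor(to * value) - floor(from * value):

```go
package main

import (
	"fmt"
	"math/big"
)

// floorRat returns floor(r); big.Rat denominators are positive, so Euclidean
// DivMod gives the mathematical floor.
func floorRat(r *big.Rat) int64 {
	q, m := new(big.Int), new(big.Int)
	q.DivMod(r.Num(), r.Denom(), m)
	return q.Int64()
}

// scaleRemainder applies the remainder-based rule to the segment (from, to]:
// floor(frac(from*value) + (to-from)*value), i.e. floor(to*value) - floor(from*value).
func scaleRemainder(from, to *big.Rat, value int64) int64 {
	v := big.NewRat(value, 1)
	return floorRat(new(big.Rat).Mul(to, v)) - floorRat(new(big.Rat).Mul(from, v))
}

func main() {
	for value := int64(1); value <= 5; value++ {
		fmt.Printf("%d VU(s):", value)
		for inst := int64(1); inst <= 4; inst++ {
			got := scaleRemainder(big.NewRat(inst-1, 4), big.NewRat(inst, 4), value)
			fmt.Printf("  inst%d=%d", inst, got)
		}
		fmt.Println()
	}
	// In this equal-quarters example an instance never "loses" a VU as the
	// total grows - though, as the later comments show, that does not hold
	// for every segment.
}
```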

imiric (Contributor) commented Mar 2, 2020

@na-- Unfortunately the original #997 algorithm doesn't seem to fix the wobble. See the tests in #1339, specifically TestExecutionSegmentScaleNoWobble. If I implemented it correctly according to our chat, it fails both on current new-schedulers and on test/1296-segment-scale.

See some failure examples from the test-prev-golang run. It only seems to happen with "middle" segments.

> If not, we should handle the VU increases with the same optimal striping algorithm I proposed for the variable-arrival-rate iterations.

You mean the algorithm used in #1323? Can I reuse parts of that PR for this or would it be a separate implementation?

na-- (Member, Author) commented Mar 2, 2020

> @na-- Unfortunately the original #997 algorithm doesn't seem to fix the wobble. See the tests in #1339, specifically TestExecutionSegmentScaleNoWobble. If I implemented it correctly according to our chat, it fails both on current new-schedulers and on test/1296-segment-scale.
>
> See some failure examples from the [test-prev-golang run](https://circleci.com/gh/loadimpact

😞

Then we probably have to go the harder route... Just to see if this is fixable and we can salvage the easier option, can you try to find some simple example of the Scale() wobble happening with small numbers and lay out a calculation that's easy for humans to follow (like the tables I gave above) and that demonstrates the issue? I doubt we'll be able to mitigate the issue, but it doesn't hurt to have an understanding of precisely why it doesn't work.

> If not, we should handle the VU increases with the same optimal striping algorithm I proposed for the variable-arrival-rate iterations.
>
> You mean the algorithm used in #1323? Can I reuse parts of that PR for this or would it be a separate implementation?

Yes, we can think of the VU increases and decreases in variable-looping-vus in a very similar way to the iterations in the variable-arrival-rate executor. Not quite the same, because of the ramp-downs, but the distributed variable-looping-vus algorithm can essentially be done like this:

  • each k6 instance can calculate the execution plan for a variable-looping-vus executor as if there's no execution segment specified - it calculates how many VUs need to work at any time globally, for the whole test run
  • using the "striped offsets" algorithm, each instance, depending on its own execution segment (and the execution segment sequence), can see which specific VUs "belong" to it and only handle their starts, graceful stops and hard stops

For example, if you have to ramp up from 0 to 10 VUs over 30 seconds and you want to equally distribute that test run between 3 instances, the algorithm can work somewhat like this (see the sketch after this list):

  1. each instance calculates the offsets (of VUs, iterations, data, etc.) that belong to it via the striped offsets algorithm:
     • this probably should happen in some global Init() place, considering that this needs to be done once and potentially reused across multiple executors (cc @mstoykov)
     • this may even be done lazily, in some sync.Once() logic, so that we don't use it if we don't need it (for example, if we only have a mix of constant-looping-vus / shared-iterations / per-vu-iterations executors)
     • the results in this example would be like this:
       • indexes 0, 3, 6, 9, ..., 30, 33, ... belong to instance 1
       • indexes 1, 4, 7, 10, ..., 31, 34, ... belong to instance 2
       • indexes 2, 5, 8, 11, ..., 32, 35, ... belong to instance 3
  2. each instance calculates the global execution plan, probably something like this: VU1 starts at t=3s, VU2 starts at t=6s, VU3 starts at t=9s, ..., VU9 starts at t=27s, VU10 starts at t=30s
  3. each instance, knowing its offsets, only deals with the VUs it's responsible for

Of course, the above isn't optimized; we don't need to keep the intermediate result between steps 2 and 3 in memory, we can filter on the fly. So it's a bit more complicated than the current implementation, but not terribly so. The mistake here was mine, in assuming that the Scale() calls have the property of "non-wobbliness" (for lack of a better term 😅) for incremental numbers...
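
Here's a toy sketch of that filter-on-the-fly idea, using the 10-VUs-over-30s / 3-equal-instances example from above. The simple round-robin striping stands in for the real striped-offsets algorithm (which also has to handle unequal segments), so treat it as an illustration, not the actual implementation:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const instances = 3 // equal thirds, as in the example above
	const totalVUs = 10
	rampUp := 30 * time.Second

	for instance := 0; instance < instances; instance++ {
		fmt.Printf("instance %d handles:", instance+1)
		for vu := 1; vu <= totalVUs; vu++ {
			// Filter on the fly: only handle the VUs whose 0-based index is
			// striped onto this instance (0,3,6,... -> 1; 1,4,7,... -> 2; ...).
			if (vu-1)%instances != instance {
				continue
			}
			// Global ramp-up plan (the same on every instance): VU i starts at t = i*3s.
			start := time.Duration(vu) * rampUp / totalVUs
			fmt.Printf(" VU%d@%v", vu, start)
		}
		fmt.Println()
	}
}
```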

Btw with some refactoring this might even turn out to be simpler than the current implementation, since this would probably also require us to only have a single stream of "events"... So we'd probably need to fix your pet peeve from #1331 (comment) 😅

mstoykov (Contributor) commented Mar 2, 2020

While I was trying to use it for the constant-arrival-rate executor, I came to two conclusions (even without us having to use it for what we currently use ExecutionSegment.Scale for):

  1. The current Executor/ExecutorConfig interface doesn't have the needed info at the correct moments to calculate it: it needs the sequence in some places where it just isn't there.
  2. In some cases I literally need to run the algorithm a couple of times; if it is not cached that is probably not a big deal, but caching it will be MUCH better.

From the current discussion, we would also need to run the algorithm for all segments, because in the constant-arrival-rate executor I first need to split MaxVUs, and only then, based on that, create new segments and a sequence... So I will first need to run the algorithm for every segment to get the correct MaxVUs, make segments from that, then a sequence, and then use that...

Given all of that, I propose that in #1323 the ExecutionSegmentSequence calculates the offsets for all segments. This, as shown by benchmarks, should not take much time for easy cases, and even for very complicated ones it will probably be under a second (we can add some info-level logging). Also, unless we have some pretty strange segment lengths, this shouldn't use all that much more memory, especially given that it only happens once. Maybe we can only do it once it is actually needed, but to be honest that will probably always be the case, so...

After that, instead of passing the ExecutionSegment as an argument everywhere, I propose we just pass Options directly, and then the Executors (and their Configs) can use the ExecutionSegment and ExecutionSegmentSequence to calculate what they want...

Obviously, the ExecutionSegmentSequence will get a new method:

```go
func ScaleInt64(es *ExecutionSegment, value int64) int64
```

and possibly

```go
func ScaleRat(es *ExecutionSegment, value big.Rat) big.Rat
```

na-- (Member, Author) commented Mar 2, 2020

Hmm, I don't like that we have to pass the current ExecutionSegment every time we deal with the ExecutionSegmentSequence. This means we have to linearly look it up every time we have to perform some operation, and it is cumbersome in general...

Maybe it makes sense to have a new internal type that encapsulates both the ExecutionSegment and an ExecutionSegmentSequence, whose constructor calculates the index once, and which has a lazy (sync.Once() + internal cache) method to calculate the striped offsets?

We can construct this new struct after we parse the options (thus, nicely handling any errors, like mismatching segment and sequence). Then we can pass it to the Executor/ExecutorConfig instances, instead of the current ExecutionSegment, or the full lib.Options, like you propose.

I have no idea what to call that type, but let's use ExecutionSegmentAndSequence for now. To handle your constant-arrival-rate issue with the MaxVUs, that new type can have a helper method, something like func (ess ExecutionSegmentAndSequence) DeriveNewSegmentAndSequence(value int64) ExecutionSegmentAndSequence, that returns another instance of the same type, but correctly re-partitioned based on the given value.
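
For illustration only, a rough Go skeleton of what such a wrapper might look like; every name here (including the placeholder segment/sequence types and the computeStripedOffsets helper) is made up, since none of this existed at the time:

```go
package lib

import "sync"

// Placeholder stand-ins for the real k6 types (which use big.Rat internally).
type ExecutionSegment struct{ from, to float64 }
type ExecutionSegmentSequence []*ExecutionSegment

// ExecutionSegmentAndSequence bundles an instance's own segment with the full
// sequence: the constructor would locate the segment's index in the sequence
// once, and the striped offsets are computed lazily and cached.
type ExecutionSegmentAndSequence struct {
	Segment  *ExecutionSegment
	Sequence ExecutionSegmentSequence

	index int // position of Segment within Sequence, found at construction time

	offsetsOnce sync.Once
	offsets     []int64 // cached striped offsets belonging to this segment
}

// StripedOffsets computes this segment's offsets on first use and caches them.
func (ess *ExecutionSegmentAndSequence) StripedOffsets() []int64 {
	ess.offsetsOnce.Do(func() {
		ess.offsets = computeStripedOffsets(ess.Sequence, ess.index)
	})
	return ess.offsets
}

// computeStripedOffsets is a stand-in for the striped offsets algorithm from #1323.
func computeStripedOffsets(seq ExecutionSegmentSequence, index int) []int64 {
	_ = seq
	return []int64{int64(index)} // not the real algorithm; just keeps the sketch compiling
}
```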

imiric (Contributor) commented Mar 3, 2020

> Just to see if this is fixable and we can salvage the easier option, can you try to find some simple example of the Scale() wobble happening with small numbers and lay out a calculation that's easy for humans to follow (like the tables I gave above) and that demonstrates the issue?

Sure, here's the scaling sequence for (½ - 3/5] i.e. 0.5:0.6:

| segment | scale | rem(prevSegment) | segmentLength * scale | sum | floor(sum) |
|---------|-------|------------------|-----------------------|-----|------------|
| (½ - 3/5] | 1 | rem(½ * 1) = ½ | 1/10 * 1 = 1/10 | 3/5 | 0 |
| | 2 | rem(½ * 2) = 0 | 1/10 * 2 = 1/5 | 1/5 | 0 |
| | 3 | rem(½ * 3) = ½ | 1/10 * 3 = 3/10 | 4/5 | 0 |
| | 4 | rem(½ * 4) = 0 | 1/10 * 4 = 2/5 | 2/5 | 0 |
| | 5 | rem(½ * 5) = ½ | 1/10 * 5 = ½ | 1 | 1 |
| | 6 | rem(½ * 6) = 0 | 1/10 * 6 = 3/5 | 3/5 | 0 |
| | 7 | rem(½ * 7) = ½ | 1/10 * 7 = 7/10 | 6/5 | 1 |
| | 8 | rem(½ * 8) = 0 | 1/10 * 8 = 4/5 | 4/5 | 0 |
| | 9 | rem(½ * 9) = ½ | 1/10 * 9 = 9/10 | 7/5 | 1 |
| | 10 | rem(½ * 10) = 0 | 1/10 * 10 = 1 | 1 | 1 |

Notice that at scale=6 the result goes back to 0. It seems larger previous segments
can skew the result by a lot, especially in the flip-flopping case of 1/2, so some
kind of "dampening" might be needed there.
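
A quick way to reproduce that example (a hedged sketch, not actual k6 code): the remainder rule for a segment is equivalent to floor(to * scale) - floor(from * scale), so for (½ - 3/5] the result indeed drops back to 0 at scale=6:

```go
package main

import (
	"fmt"
	"math/big"
)

// floorRat returns floor(r); big.Rat denominators are positive, so Euclidean
// DivMod gives the mathematical floor.
func floorRat(r *big.Rat) int64 {
	q, m := new(big.Int), new(big.Int)
	q.DivMod(r.Num(), r.Denom(), m)
	return q.Int64()
}

func main() {
	from, to := big.NewRat(1, 2), big.NewRat(3, 5) // the (½ - 3/5] segment
	for scale := int64(1); scale <= 10; scale++ {
		v := big.NewRat(scale, 1)
		got := floorRat(new(big.Rat).Mul(to, v)) - floorRat(new(big.Rat).Mul(from, v))
		fmt.Printf("scale=%-2d -> %d\n", scale, got)
	}
	// scale=5 yields 1 (floor(3) - floor(2½)), but scale=6 falls back to 0
	// (floor(3.6) - floor(3)) - the wobble shown in the table above.
}
```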

Here's another more complex example that starts wobbling at scale=14:

| segment | scale | rem(prevSegment) | segmentLength * scale | sum | floor(sum) |
|---------|-------|------------------|-----------------------|-----|------------|
| (½ - 853/1000] | 1 | rem(½ * 1) = ½ | 353/1000 * 1 = 353/1000 | 853/1000 | 0 |
| | 2 | rem(½ * 2) = 0 | 353/1000 * 2 = 706/1000 | 706/1000 | 0 |
| | 3 | rem(½ * 3) = ½ | 353/1000 * 3 = 1059/1000 | 1559/1000 | 1 |
| | 4 | rem(½ * 4) = 0 | 353/1000 * 4 = 1412/1000 | 1412/1000 | 1 |
| | 5 | rem(½ * 5) = ½ | 353/1000 * 5 = 1765/1000 | 2265/1000 | 2 |
| | 6 | rem(½ * 6) = 0 | 353/1000 * 6 = 2118/1000 | 2118/1000 | 2 |
| | 7 | rem(½ * 7) = ½ | 353/1000 * 7 = 2471/1000 | 2971/1000 | 2 |
| | 8 | rem(½ * 8) = 0 | 353/1000 * 8 = 2824/1000 | 2824/1000 | 2 |
| | 9 | rem(½ * 9) = ½ | 353/1000 * 9 = 3177/1000 | 3677/1000 | 3 |
| | 10 | rem(½ * 10) = 0 | 353/1000 * 10 = 3530/1000 | 3530/1000 | 3 |
| | 11 | rem(½ * 11) = ½ | 353/1000 * 11 = 3883/1000 | 4383/1000 | 4 |
| | 12 | rem(½ * 12) = 0 | 353/1000 * 12 = 4236/1000 | 4236/1000 | 4 |
| | 13 | rem(½ * 13) = ½ | 353/1000 * 13 = 4589/1000 | 5089/1000 | 5 |
| | 14 | rem(½ * 14) = 0 | 353/1000 * 14 = 4942/1000 | 4942/1000 | 4 |
| | 15 | rem(½ * 15) = ½ | 353/1000 * 15 = 5295/1000 | 5795/1000 | 5 |
| | 16 | rem(½ * 16) = 0 | 353/1000 * 16 = 5648/1000 | 5648/1000 | 5 |
| | 17 | rem(½ * 17) = ½ | 353/1000 * 17 = 6001/1000 | 6501/1000 | 6 |

I haven't given much thought to the implementation yet, so I can't comment on your discussion.

I'll look into implementing the striped offsets approach, and let you know if I get stuck. :)

na-- (Member, Author) commented Mar 3, 2020

Hmm yeah... Unfortunately we'd have to go with the more complicated approach 😞

I propose the following things:

  1. don't use the old scaling algorithm in #1339 (Add random scaling tests, original #997 scaling algorithm); let's keep the current one based on round()-ing, since the old one doesn't solve the wobble and the new one is slightly more accurate 😞
  2. maybe comment out the code for the old remainder-based algorithm, or keep it as a separate RemainderScale() method, in addition to having a text comment describing it, since you recreated the code anyway... 😅 who knows when it can come in handy again
  3. keep the scaling test TestExecutionSegmentScaleConsistency you wrote, it's very useful
  4. comment out or delete TestExecutionSegmentScaleNoWobble for now
  5. wait for @mstoykov to be done with #1323 (Implement GetStripedOffsets), so you can reuse the striped offsets algorithm
  6. either you or he can fix the variable-arrival-rate executor with the striped offsets when that's available
  7. recreate something like TestExecutionSegmentScaleNoWobble, but based on the striped offsets, to ensure we'll never have a wobble again

thoughts?

@imiric, until 5. (i.e. #1323) is ready, you should probably look at something else, maybe #1301 or #1283 (comment)

mstoykov (Contributor) commented Mar 4, 2020

I am probably going to fix other problems with the constant arrival rate in #1323, including #1299 and #1308, and then go back to #1285 and do 6. as well :)

na-- added this to the v0.27.0 milestone Apr 22, 2020