"Status code" for non-HTTP errors #66
Comments
I feel like throwing an exception is better, because it means any dependent work isn't run unless you catch the error explicitly. Imagine you were writing a test for a web shop: if you can't load the shopping cart, you can't grab a CSRF token to submit it, and the failure cascades.
To me, that sounds like a 500 response should also result in an exception? I think one operation should only have one way of generating errors. It will be messy to have to check for two completely different types of errors in your code: you would both have to check that the response code is OK and that you got something useful back from your HTTP request, and also have code to catch any exceptions generated by e.g. a failure to create a TCP connection.
To me, no response whatsoever from requests should throw an exception. From an execution point of view they are all valid responses; what we do with them is another thing. The question is whether we can generate response codes (like curl does) for events lower in the stack (such as timeouts, failure to connect, SSL negotiation failure and so on) so that they, in turn, do not generate exceptions.
@micsjo Read the thread again :) The initial question was what you wrote above: @robingustafsson asked if we can do what libcurl and rload do and generate unique response codes for lower-level errors. @uppfinnarn thought exceptions were better for lower-level errors. I countered that it will be painful to have to deal with two different types of errors (i.e. non-200 response codes, and exceptions) when you issue HTTP requests: you have to both wrap your http.get() call in a try/catch statement, in order to handle e.g. connection timeouts, and also check HTTP response codes to figure out if your HTTP transaction did what it was supposed to do. @uppfinnarn is right that these are two separate types of errors (on different levels), but as long as we don't let the user do TCP connection handling themselves, I think we should go with error codes rather than exceptions.
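Concretely, the double bookkeeping would look something like this (a sketch; the URL is illustrative, and it assumes connection-level failures surface as thrown exceptions):

```js
import http from "k6/http";

export default function() {
  let res;
  try {
    // If connection-level failures (DNS, TCP, TLS, timeouts) throw,
    // every request that matters needs its own try/catch...
    res = http.get("http://example.com/cart");
  } catch (e) {
    console.log("connection-level error: " + e);
    return;
  }
  // ...and HTTP-level failures still need a separate status check.
  if (res.status != 200) {
    console.log("unexpected status: " + res.status);
  }
}
```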
I don't really like that approach. What are you going to do with that error code? By treating hard failures to perform an action (e.g. connection failures) as errors, VU code can assume all actions succeed: if it asks for an HTTP response, it will have an HTTP response. If not, an error is thrown and logged, and the VU is restarted. It completely frees the user from the burden of error handling and lets them focus on describing their target system's desired behavior, not ours. If they want to handle errors themselves (e.g. for cleanup or whatever), they can use try/catch, but this way it's not a default expectation for the user to do that.
I'm not sure about this one. Trying to explain how I am thinking here; sorry for writing a novel. Would be interesting to hear what @micsjo thinks about all this.

I think this may in one sense be a disconnect between those of us thinking mainly about load testing and @uppfinnarn thinking mainly about functional testing. In a load test I often expect failed TCP connections, and it is convenient not to have to add a lot of try/catch statements to handle that. I don't see failed connections as hard errors so much as data points to be plotted in a graph.

For functional testing I agree that a failed connection is more of a "hard error" that should perhaps stop code execution. But if you have e.g. a complex user scenario that spans many minutes and hundreds or thousands of HTTP requests, is halted execution really what you want when/if a TCP connection fails? Maybe it's no big deal? If we assume TCP connections fail randomly throughout script execution, it may not matter [for a load test] if script execution is occasionally stopped and restarted in the midst of things. People will still see all their script code exercised, although code at the end of a script will be executed less often than code at the beginning, which may or may not be an issue.

Anyway, we want to optimize for the common use case. Let's say the common use case is people who don't bother with try/catch clauses but just issue an HTTP request and check the status code. How likely are they to get weird test executions (and results) because of failed connections that are not handled properly?

Perhaps a bigger issue is code complexity. How will our script examples look, for instance? If the "right" way to do things involves substantially more lines of code, with try/catch clauses, should we promote that in our examples, at the expense of readability, or should we promote the "simple" way of doing things (without try/catch), at the expense of setting people up for potential trouble? (Perhaps the risk of trouble is slim enough?)
Agree with Ragnar that this is very much about a load testing vs functional testing point of view. No matter which type of test we primarily optimize for, my view is that our defaults should help avoid situations that can cause damage due to knowledge gaps. Running the following script in a large test against my own failing server will result in a DoS attack on the first URL in the script with our current implementation. That is not what I expect when I run this:

```js
import { group, sleep } from "k6";
import http from "k6/http";

export default function() {
  group("group", function() {
    http.get("http://latest.cool.3rd.party.widget/");
    http.get("http://my-own-broken-server/");
    sleep(1);
  });
}
```

Note: removing the sleep will have the same effect, and is another behavior of our current load generator I don't like. 99.9% of people omitting the sleep aren't after testing how many requests the load generator can squeeze out; they just don't understand how tests are run.

On the topic of how to present the errors: instead of introducing magical HTTP status codes, what about setting an error object on the response? I can then do something like the sketch below. Disclaimer: I still know very little about our current API design.
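Something along these lines (a sketch; res.error is the proposed field, not an existing API, and the URL is reused from the example above):

```js
import http from "k6/http";

export default function() {
  let res = http.get("http://my-own-broken-server/");
  if (res.error) {
    // Transport-level failure: no real status code, but an error
    // describing what went wrong (timeout, refused connection, ...).
    console.log("request failed: " + res.error);
  } else if (res.status != 200) {
    // HTTP-level "failure": the request completed, the response
    // just wasn't what we wanted.
    console.log("unexpected status: " + res.status);
  }
}
```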
Okay, so, I think this test pretty well illustrates this entire problem:

```js
import { check, group } from "k6";
import http from "k6/http";

export default function() {
  check(http.get(""), {
    "status is 200": (res) => res.status == 200,
  });
  group("login", function() {
    let res = http.post("http://example.com/login", ...);
    check(res, {
      "status is 200": (res) => res.status == 200,
      "contains a token": (res) => res.json().token,
    });
    let token = res.json().token;
    group("logout", function() {
      check(http.post("http://example.com/logout", { "token": token }), {
        "status is 200": (res) => res.status == 200,
      });
    });
  });
  group("articles", function() {
    // other, unrelated stuff
  });
}
```

Right now, there's no need for error handling boilerplate to check if e.g. the login endpoint has a connection error before running the dependent logout group, because it will throw an exception and terminate the script. But this also means that the completely unrelated "articles" group is never run, nor is anything else further down. In effect, the further towards the top something is, the more statistically likely it is to actually run, so load is put disproportionately onto earlier endpoints.

One way we could handle this without introducing more boilerplate for the user would be to make each group() call contain errors thrown inside it: the group fails, and execution continues after it. To handle errors manually, you could pass in a second function argument, e.g.:

```js
group("my group", function() {
  throw new Error("test error");
}, (e) => { throw e; });
```
Allowing tests to continue when failing does not make sense for any functional test unless you actively choose/code to do so, and I have a hard time seeing how the suggested solution actually solves our problem.

PROPOSAL: All normal HTTP responses (regardless of 100/200/300/400/500 and so on) are always considered a successful request. As it should be: the request completed, but the response might not be what you wanted. Home-brew response codes (à la curl) for timeout, failed SSL negotiation, dropped connection, inability to connect etc. are returned as a special HTTP response code.

To control execution behavior, the default is to fail the request, just as you would expect in a functional test. It would return the error property and also the special HTTP response code. When the "performance test context" flag is set, either on the command line or as an option in the script, the behavior is to not fail the request.

It is fully understood this will add more code to the handling of requests, but the opposite (to just let it always fail for all incomplete HTTP requests) will lead to large amounts of code being needed in every performance test, to handle all possible errors that should not be considered errors in that context. The only other option I see is to never fail HTTP requests for any transport-related issues whatsoever, and let the tester add code to validate/verify as needed.
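In script code, the proposal might look roughly like this (a sketch; the >= 1000 code range and the res.error property are illustrative assumptions, not an existing API):

```js
import http from "k6/http";

export default function() {
  let res = http.get("http://latest.cool.3rd.party.widget/");
  if (res.status >= 1000) {
    // Made-up code range for transport-level problems: timeout,
    // dropped connection, failed SSL negotiation, ...
    console.log("transport error " + res.status + ": " + res.error);
  } else {
    // A real HTTP response; any status 1xx-5xx counts as a completed
    // request, whether or not it's what we wanted.
    console.log("completed with status " + res.status);
  }
}
```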
I'm ideologically opposed to flags or configuration changing script behavior. A function called with the same parameters in two different places should do the same thing, always; otherwise you sacrifice your ability to reason about the code. It may start innocuously enough, with little convenience heuristics or environmental stuff, but then someone will say "you already made this function do two different things, why not add a flag that changes this function too?", and it snowballs from there. In the end, you end up where PHP is now: an unholy mess of unpredictable global state mutations and all-powerful configuration files, where the same script run on two different machines produces completely different, likely incorrect, results.
That leaves us with another option: one HTTP request function for functional tests and one for performance tests. Having all requests that don't complete as expected fail is useless in a performance testing context.
I agree that changing script behaviour depending on config doesn't feel right. I also don't like the idea of forcing people to write one type of code for functional testing and another for load testing; that would effectively kill any chance of re-using code between the two types of tests, something that is a very strong selling point if we can pull it off for even a portion of the market. I think Martin's ideas are worth considering more. Can we return an HTTP response code under "normal" circumstances, and then, if the TCP connection fails, return an HTTP response code of 'nil' (or whatever) and set an error object that the user can query if they are interested in knowing exactly what happened?
Hi @uppfinnarn. Here is a simple example of what happens. Make a k6 script with a request for https://static.hotjar.com/c/hotjar-235820.js?sv=5. It fails with "stream error: REFUSED STREAM". A corresponding request for https://www.google-analytics.com/plugins/ua/ecommerce.js fails with "stream error: PROTOCOL_ERROR". There is actually nothing wrong with the above requests: add a User-Agent header for a proper browser and they will return a well-formatted response. But I strongly oppose the fact that they fail. To me they didn't fail, they just didn't complete the way I expected.
They most certainly fail: they are rejected by the server and produce a protocol error, just like a timeout or a broken pipe would. With this proposal implemented, that failure would be contained within a group().
You are assuming they are in a group at all, which isn't necessarily true. Also, the behavior of failing one group and then continuing with the next is plain bad in functional testing. It doesn't make sense in performance testing either, since you have dependencies between requests. Sample of effects [pseudo code]:

```
group("g1") {
    request_1   // e.g. log in and acquire a session
}
group("g2") {
    request_2   // depends on the session from request_1
}
```

The above code will produce a hornets' nest of debugging issues if request_1 fails and thus fails the entire group: when you get to the next group you will be unauthenticated and unauthorized, and subsequent requests will fail. This is but one example. It's simply a bad idea; then it is better to fail hard and eject execution. It will just lead to people manually implementing workarounds like [pseudo code]:

```
// if in performance test run, wrap my entire script
#IFDEF PERF_TEST_RUN
    // my entire script
#ENDIF
```
This is what conventions are for. We don't have to assume anything; we can tell people what the correct way of doing things is, and make that way as convenient as possible. If they for whatever reason don't want to use groups, then their script will blow up when there's an error, which may be exactly what they want.

If you really want to structure it like that, you can make the script explode regardless:

```js
group("g1", function() {
  // ...
}, (e) => { throw e; });
```

Or you can do some good ol' try/catch-ing:

```js
group("g1", function() {
  try {
    // ...
  } catch (e) {
    // ...
  }
});
```

The idiomatic way to do it would be to design your script so that groups are hierarchic, and there is one or at most a few related requests in each group. If you don't wanna do that, you can use the error callback to set a "loggedIn" flag to false, or propagate it upwards, or do any number of other things, but this establishes a sane default.
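For instance, propagating a login failure via such a flag could look like this (a sketch using the error-callback signature proposed above; URLs and payloads are illustrative):

```js
import { group } from "k6";
import http from "k6/http";

export default function() {
  let loggedIn = true;

  group("login", function() {
    http.post("http://example.com/login", { user: "bob", password: "hunter2" });
  }, (e) => { loggedIn = false; });  // proposed error callback: set a flag instead of aborting

  // Skip the dependent group if login failed.
  if (loggedIn) {
    group("profile", function() {
      http.get("http://example.com/profile");
    });
  }

  // Unrelated work still runs even if login failed.
  group("articles", function() {
    http.get("http://example.com/articles");
  });
}
```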
To summarize the two main use cases:

MAKES SENSE FOR FUNCTIONAL TESTS

MAKES SENSE FOR NON-FUNCTIONAL TESTS

To go the middle road as suggested and fail groups is insane and makes sense for neither functional nor non-functional tests. If we can't find a better solution or implementation, I strongly suggest we select ONE of them, not the in-between, which solves nothing for either use case but makes both harder to implement.
Okay so, what do we want to do here, add Error objects to responses?

```js
check(http.get("http://example.com/"), {
  "no error": (r) => !r.error,
  "status is 200": (r) => r.status == 200,
});
```
@liclac I think the error object approach sounds like the best way forward. When will it be set, though: only on non-HTTP errors, or also for HTTP status codes above 399? Also, how will these errors be reported to collectors?
I think that'd be redundant; if you're testing an endpoint that's supposed to return a certain error status for whatever reason, there's no reason to set an error. Collectors would still get errors via the "error" tag on samples, same as always.
I've not seen how k6 exposes errors that are not HTTP errors in a way that makes it possible to keep a test going in the face of such errors, and to use them for setting up checks, thresholds or custom metrics. Load Impact uses 1000 + the error code of libcurl (https://curl.haxx.se/libcurl/c/libcurl-errors.html) in the request.status field of response objects when DNS/TCP etc. fails; see http://support.loadimpact.com/knowledgebase/articles/174137-what-status-codes-are-there.

Right now, when I execute a simple script with an http.get followed by a sleep(1), the http.get fails with an exception, causing the VU to start over and bypassing my sleep(1). This is bad IMO. I think we should instead treat it like a failed request and give it a made-up status code, similar to how we handle it in the current generation of Load Impact, and then let JS execution continue.
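For illustration, a check against such made-up status codes could look like this sketch (1006 here assumes Load Impact's 1000 + libcurl convention, where CURLE_COULDNT_RESOLVE_HOST is libcurl error 6; the URL is illustrative):

```js
import { check, sleep } from "k6";
import http from "k6/http";

export default function() {
  let res = http.get("http://this-host-does-not-resolve.example/");
  check(res, {
    // 1006 = 1000 + CURLE_COULDNT_RESOLVE_HOST (libcurl error 6)
    "no DNS failure": (r) => r.status != 1006,
    "status is 200": (r) => r.status == 200,
  });
  sleep(1);  // would still run, since the failed request no longer throws
}
```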