error receiving function response after update to latest version #1470
Comments
I've experienced exactly the same issue here. |
thanks for the reports. we're looking into this asap. |
Hi! Can you please check whether it happens only with Node.js functions? I would also like to know what your setup is. |
We're using the unmodified docker fnserver image in a nomad cluster. That is nothing fancy, and it has been working in production for months. I cannot reproduce this with a Java-based function. |
Just to make sure, can you please try go, Python as well? |
@denismakogon seems like it's a node fdk issue if they can't repro with java, no need to try other fdks. any info on fdk version? just double checking that was updated, as well. I think we can try to repro from this info and debug, thanks. it looks like the container is exiting and being removed for whatever reason - it could be a race [in fn], but it would probably also happen with other fdks if that was the case. have not seen the 'unknown container' bug yet... hopefully we can repro, we might need to get more logs about what's going on. |
i can't reproduce locally with go fwiw (invoke...wait 22s.... invoke... both work) |
Hi, again. Unfortunately I can't reproduce it either. But that doesn’t mean we stop working on this problem, so here is what we need:
|
There are no logs. I installed a syslog server since you removed fn get logs...but that didn't work very well either. I exec'd into the running fnserver docker image: /app # docker version Server: I switched debug level on. Here is the log from fnserver: And here is the output of docker events: I googled and docker exit code 137 seems to have something to do with the OOM killer. I updated func.yaml to memory:1024. My fnserver has 12 GB RAM (and there is nothing in dmesg). I had @fnproject/fdk 0.0.11 and have now tried 0.0.13. How can I produce traces? |
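A note on reading exit code 137: by the usual shell convention, an exit status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9, SIGKILL. That is what the OOM killer sends, but any other SIGKILL (for example a `docker kill`) yields the same code, so 137 alone does not prove an out-of-memory condition. A minimal sketch of the arithmetic, where `describeExitCode` is a hypothetical helper name and not part of Fn or Docker:

```javascript
const os = require('os');

// Hypothetical helper: decode a container/process exit status using the
// "128 + signal number" convention.
function describeExitCode(code) {
  if (code <= 128) {
    // Ordinary exit status, not a signal.
    return { code, signalNumber: null, signal: null };
  }
  const signalNumber = code - 128;
  // Look up the signal name for this number on the current platform.
  const entry = Object.entries(os.constants.signals)
    .find(([, number]) => number === signalNumber);
  return { code, signalNumber, signal: entry ? entry[0] : null };
}

console.log(describeExitCode(137).signal); // SIGKILL
```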
According to your logs
Since Fn supports OpenTracing through the Jaeger binding, you can collect traces by setting the following environment configuration options: https://github.com/jaegertracing/jaeger-client-go#environment-variables You can find more about Jaeger here: https://www.jaegertracing.io/docs/1.6/ |
$ fn invoke helloworld helloworld && date $ fn invoke helloworld helloworld && date Do you need anything else? |
Hm, nothing unusual there, which means the issue is in the FDK itself (probably in its underlying HTTP server). We need to investigate it. |
But it did not happen with an old version of fnserver. I do not understand why you cannot reproduce it. It happens on my local dev machine too (Ubuntu 18.04). I'm going to test it with a clean Ubuntu VM. |
Ok...I can reproduce it on a vanilla ubuntu 18.04...not that hard...
new console:
new console:
Updating function helloworld using image $username/helloworld:0.0.15... I can give you a ssh access on this server. You can send me your public key. |
is that the exact way you run Fn server? |
root@fnserver-test:~# history | grep docker |
Okay, I see what's going on. You've done an upgrade but probably didn't notice that Fn as a whole has changed a lot (hopefully all changes are documented).
Short story: with the following command you get a dead Fn server that basically doesn't work. from operation docs https://github.com/fnproject/docs/blob/master/fn/operate/options.md :
that's the bare minimum command to run Fn server as a container. |
that's why for a single instance of the Fn we strongly recommend to use |
Ok, but that did not change anything. I reinstalled Ubuntu 18.04, just to be sure.
root@fnserver-test: But the log changed: |
Can you make sure that SELinux is disabled on your host? |
Still, can't reproduce on local Fn, ubuntu VM, k8s Fn deployment:
|
What version do you have? I narrowed it down to version 0.3.690. v0.3.689: v0.3.690 test.sh: |
thanks for all the info here
this is an interesting wrinkle. it's also interesting that it's happening on 0.3.690 and not 0.3.689. the traces shouldn't be propagated into the container even when they're turned on, so i'd be surprised if it was that, though it would make sense as those headers could be pretty large - I've tested this and confirmed that I'm not getting the headers. I'm not sure what else would be affected there. are we working off the theory that this is related to the function hitting oom from the node fdk? I see earlier that memory was raised, but it's not clear to me from the comments whether this fixed anything or not. I'm not sure that traces will prove very useful for this case, i don't think we need to get into that here. the logs are pretty useful (especially with I am yet to try to repro with nodejs, I can give this a whirl, also, however my machine notably doesn't throw off 137 when it should (the tests on master fail for me locally), so I am not expecting much if that's what's going on here, which may/may not be useful to figure out. |
i got a repro with the node fdk from the cli hello world function and fn 0.3.690 just now after doing the wait thing:
well, good news to confirm at least. need to get some node logs I think, can turn on debug mode on fdk I think... usually this error is from the container exiting, which looks like what's going on here, we just need to figure out why the node fdk is exiting |
I got container stats out of here and only see about 9-10MB of usage after 1 invocation (I wish we made this easier to do... alas). my docker kill event looks like this:
^ is when the function invocation fails. if I invoke 'quickly' this also doesn't happen, ie I can run the function in the same container multiple times in a row until I wait. now that I think about it, I think changing idle to 2 minutes is what did the trick. we're expecting to re-use the connection but the fdk server* has closed it. the node fdk needs to respect the idle timeout.
|
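The diagnosis above (the Fn server expects to reuse an idle connection, while the FDK's HTTP server has already closed it) can be illustrated with a minimal sketch. Node's `http.Server` exposes a `keepAliveTimeout` property, 5000 ms by default, after which it closes idle keep-alive sockets. `createFunctionServer` below is a hypothetical stand-in for the FDK's internal server, not the real fdk-node code, and setting the timeout to 0 is only one assumed way to stop the function side from closing idle connections; the actual change is in the FDK pull request referenced in this thread.

```javascript
const http = require('http');

// Hypothetical stand-in for the HTTP server inside a Node FDK.
function createFunctionServer(handler) {
  const server = http.createServer((req, res) => {
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const raw = Buffer.concat(chunks).toString();
      let input;
      try {
        input = raw ? JSON.parse(raw) : {};
      } catch (err) {
        res.writeHead(400);
        res.end();
        return;
      }
      const body = JSON.stringify(handler(input));
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(body);
    });
  });

  // Assumption for this sketch: disable Node's idle keep-alive timeout so
  // the caller, not the function server, decides when the connection is
  // dropped.
  server.keepAliveTimeout = 0;
  return server;
}

const server = createFunctionServer((input) => ({
  message: 'Hello ' + (input.name || 'World'),
}));
console.log(server.keepAliveTimeout); // 0
```

The server is created but not started here; calling `server.listen(...)` on a port or socket path of your choice would make it accept requests.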
@tuempeltaucher @denismakogon I believe I have posted a fix for this fnproject/fdk-node#26 - see PR for github links to issues, seems like node made a baddie. |
I confirmed this is fixed with 0.0.14 of the node fdk. thanks everyone! |
Today I updated to the latest version (docker fnproject/fnserver).
Everything is working fine when I'm invoking the function every few seconds, but if I wait roughly 10 seconds I'm getting an error for the first request.
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 12:52:16
$ fn invoke helloworld helloworld && date
Error invoking function. status: 502 message: error receiving function response
Fri Apr 12 12:52:25
Logs:
https://pastebin.com/pEefeNAs
Another example:
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 14:58:39 CEST 2019
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 14:58:40 CEST 2019
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 14:58:40 CEST 2019
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 14:58:43 CEST 2019
$ fn invoke helloworld helloworld && date
Error invoking function. status: 502 message: error receiving function response
Fri Apr 12 14:59:05 CEST 2019
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
Fri Apr 12 14:59:08 CEST 2019
$ fn invoke helloworld helloworld && date
{"message":"Hello World"}
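The timing dependence in the transcripts above can be probed with a small script: invoke once, stay idle, invoke again, and record whether each call printed the error. This is a sketch under assumptions: `probeIdleGap` and `invokeWithCli` are hypothetical names, the `fn invoke helloworld helloworld` command is taken from the transcripts, and failure is detected from the output text because the transcripts show `date` still running after a failed invoke.

```javascript
const { spawnSync } = require('child_process');

// Run the same CLI command as in the transcripts and return its combined output.
function invokeWithCli() {
  const result = spawnSync('fn', ['invoke', 'helloworld', 'helloworld'], {
    encoding: 'utf8',
  });
  if (result.error) throw result.error; // e.g. the fn binary was not found
  return `${result.stdout}${result.stderr}`;
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Invoke, wait idleMs milliseconds, invoke again; report each outcome.
async function probeIdleGap(idleMs, invoke = invokeWithCli) {
  const results = [];
  for (let attempt = 0; attempt < 2; attempt++) {
    if (attempt > 0) await sleep(idleMs);
    const output = String(invoke()).trim();
    results.push({
      ok: !output.includes('Error invoking function'),
      output,
    });
  }
  return results;
}
```

For example, `probeIdleGap(10000).then(console.log)` waits roughly the 10 seconds mentioned in the report between the two calls.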
func.js:
const fdk = require('@fnproject/fdk');

fdk.handle(function (input) {
  let name = 'World';
  if (input.name) {
    name = input.name;
  }
  console.log("ljadslfjlsadkfjklsdj");
  return {'message': 'Hello ' + name};
});
$ fn version
Client version is latest version: 0.5.74
Server version: 0.3.693