trap invalid opcode ip:fea6f9 #3898
Comments
Does it happen right away at startup? Does your application use native add-ons, directly or indirectly? If so, which ones?
Yesterday, I was getting similar errors intermittently when running the tests from …
It's a …
Yes, I am using some native add-ons, but the error appears not at startup but under load, with a few hundred connections.
In fact, it appears a few times a day. I am using weak, bufferutil, kerberos, utf-8-validate, segfault-handler and some other native modules. Maybe it's a bad idea to use segfault-handler in release, but I am doing it because of #3715.
Can you turn on core dumps with … and post a backtrace?
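A minimal sketch of enabling core dumps and pulling a full backtrace out of the resulting core file (the entry point and paths are illustrative; where the core file lands depends on the system's core_pattern):

```sh
# Allow core dumps in this shell, then start node from the same shell.
ulimit -c unlimited
node server.js                     # illustrative entry point

# After the crash, dump backtraces for every thread from the core file.
# The core file location depends on /proc/sys/kernel/core_pattern.
gdb -ex 'thread apply all backtrace full' -ex quit /usr/bin/node ./core
```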
The problem is that the error happens only in release, which is an EC2 auto-scaled instance, so I have no time to run "thread apply all backtrace full" before the instance is terminated.
But I'll try it on one instance, hope it'll be useful. |
I don't know if it's the same issue or an issue at all, but running …
@santigimeno That's expected. The test starts a child process that is expected to abort. The 'invalid opcode' is the UD2 instruction that the child process uses to terminate itself. |
@bnoordhuis I understand. Thanks a lot for the info. I didn't know about it |
@bnoordhuis, I got it, it's pretty short...
…
Did you run "thread apply all backtrace full"?
Sorry, @bnoordhuis, my fault. This one:
…
Was the output clipped? 9 threads were started but the backtrace only shows 3. |
@bnoordhuis, my fault again. Seems like I am terrible at core dumping. Here is the full 9-thread dump:
…
Thanks. It looks like you're hitting a V8 bug which unfortunately is private, but it appears to have been fixed in more recent versions of V8. It's not clear to me in what commit or commits it was fixed, so I'm not sure if we can back-port it, but it seems to be a 64-bit-only issue. Perhaps as a workaround you can use the 32-bit binary for now? It would be interesting to see if node still aborts after that. (Note to self: …)
Okay, I'll try. How do fixes from V8 get into Node.js releases? Should I wait for the next major node version (6.0.0)?
Yes, v6.x will ship with a newer V8 version. That's still a few months out, though. |
Okay. Then I'd better try the 32-bit binary. Thank you, Ben.
@bnoordhuis, can it happen only when using --expose-gc? Is it related? Because there is the same call stack: #3715 (comment) |
It could be a bug in a native add-on that gets exacerbated by --expose-gc.
Now that we have native weak maps, we no longer need the "weak" module, am I right?
Correct. |
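A minimal sketch of the kind of replacement that makes possible, assuming the weak module was only used to associate per-object data without keeping the objects alive (the names here are made up for illustration; note that WeakMap does not provide the GC callbacks that weak offered):

```js
// Native ES6 WeakMap (available in node 4.x and later).
// Entries do not prevent their keys from being garbage collected.
const perSocketState = new WeakMap();

function track(socket, state) {
  perSocketState.set(socket, state);   // no strong reference to `socket` is retained
}

function stateOf(socket) {
  return perSocketState.get(socket);   // lookup by object identity
}
```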
Okay. It seems to be a 64-bit-only issue.
@bnoordhuis will the fix for this problem be released in the December security update? See: https://nodejs.org/en/blog/vulnerability/december-2015-security-releases/
@Unterdrucker No, those were different issues. Can I suggest we close this issue? There has been no way to reproduce it so far and it isn't clear whether the bug is in node or V8 (because of the add-ons).
We still see this issue with Node.js 4.4.4 running on Amazon EC2:
Jun 26 03:16:34 ip-10-51-3-38 kernel: [2929369.891294] traps: WebServer[18668] trap invalid opcode ip:fda799 sp:7ffc07602b38 error:0 in nodejs[400000+1390000]
@rupamkhaitan Have you tried the things listed above? What was the output? |
You mean taking a core dump on the EC2 box? No, we haven't done that. But we see our web tier (express) getting killed with the above error, and it shows up in /var/log/syslog.
Without more information there is not much we can do. |
Can you guide me on what I should run on the box to get more data for you to find the issue?
See #3898 (comment). Also, please check if you are using native add-ons (see #3898 (comment)) and disable them if you do. |
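One quick way to check for native add-ons is to look for compiled .node binaries in the dependency tree; a minimal sketch (the path is illustrative):

```sh
# Native add-ons ship as compiled .node files somewhere under node_modules.
find node_modules -name '*.node'
```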
Hi. We don't use any native add-ons in our code, only packages from the public npm registry.
I see a crash file under /var/crash/_usr_bin_nodejs.110.crash; I will try to get the gdb output and paste it here.
Can you please help us? If I run gdb node _usr_bin_nodejs.110.crash, I don't see anything.
Here is the output. Thread 1 is where I see the Abort:
…
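Regarding the earlier point that running gdb directly on the .crash file shows nothing: if the file under /var/crash is an apport report (as the .crash extension on Ubuntu suggests), gdb cannot read it directly. A hedged sketch of extracting the core dump first:

```sh
# Unpack the apport report, then point gdb at the extracted CoreDump file.
apport-unpack /var/crash/_usr_bin_nodejs.110.crash /tmp/nodejs-crash
gdb -ex 'thread apply all backtrace full' -ex quit /usr/bin/nodejs /tmp/nodejs-crash/CoreDump
```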
The --expose-gc flag …
@bnoordhuis I use this flag too, and it provokes the same error as for @rupamkhaitan. But without it I have a slow memory leak (even when there are no requests to the server). Will this flag be deprecated, or is it just bad practice?
That suggests the server is still doing something. Have you tried taking heap snapshots and comparing them over time? |
I tried, but I think I didn't spend enough time on it.
We are using PM2 and yes, we pass --expose-gc:
PM2_OPTIONS=' --node-args="--expose-gc" --merge-logs --error /dev/null --output /dev/null'
As @Unterdrucker suggested, without it memory leaks and garbage collection is slow.
Also, in our node code we call global.gc() every 5-10 minutes to reclaim memory, because we saw earlier that node's built-in garbage collector was not running often enough and our memory was getting exhausted, so we decided to call it on an interval.
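As a point of reference, a minimal sketch of that kind of forced-GC interval (the interval length and logging are illustrative; global.gc only exists when node is started with --expose-gc):

```js
// global.gc only exists when node is started with --expose-gc, e.g.:
//   node --expose-gc server.js
const FORCED_GC_INTERVAL_MS = 5 * 60 * 1000; // illustrative: every 5 minutes

if (typeof global.gc === 'function') {
  setInterval(() => {
    const before = process.memoryUsage().heapUsed;
    global.gc();                                    // force a full garbage collection
    const after = process.memoryUsage().heapUsed;
    console.log('forced gc freed %d bytes', before - after);
  }, FORCED_GC_INTERVAL_MS);
} else {
  console.warn('global.gc is not exposed; start node with --expose-gc');
}
```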
Fix the memory leak. :-) In all seriousness, what makes you think you have a memory leak? Does the process eventually abort with a fatal out-of-memory error? If not, you are probably focusing on RSS too much; it's not a good indicator of a memory leak in node.js. Even if it does die with an OOM error, that doesn't necessarily mean …
In general, if you think you have a memory leak, analyze it with a tool like node-heapdump or node-inspector or (soon - already available in master) the built-in inspector.
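A minimal sketch of taking comparable snapshots with node-heapdump, assuming the heapdump npm package is installed (filenames and the interval are illustrative); load two snapshots in Chrome DevTools and use the comparison view to see what grows between them:

```js
// npm install heapdump    (third-party module, not part of node core)
const heapdump = require('heapdump');

function takeSnapshot() {
  const file = '/tmp/' + Date.now() + '.heapsnapshot';
  heapdump.writeSnapshot(file, function (err, filename) {
    if (err) console.error('heap snapshot failed:', err);
    else console.log('heap snapshot written to', filename);
  });
}

takeSnapshot();                            // baseline snapshot
setInterval(takeSnapshot, 30 * 60 * 1000); // illustrative: another one every 30 minutes
```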
@bnoordhuis yes, it dies with a 'cannot allocate' error. With forced GC it runs for months, but without it, it dies after 3 days of uptime. But thanks for the advice, I am going to work on it :)
Strange error crashes all node processes on an AWS EC2 instance without a chance to catch it. And, of course, it forces the instance to restart. Previously (in node 0.6) I saw the error "invalid opcode 0000", but this one seems different (node 5.0.0).
The full error text is: …