assert disabling and enabling network card on linux #1399

Closed
mdionisio opened this issue May 14, 2015 · 8 comments

Comments

@mdionisio
Contributor

I have the following stack trace:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x44670450 (LWP 243)]
0x40406460 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
(gdb) bt
#0  0x40406460 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x40409e88 in __GI_abort () at abort.c:89
#2  0x403ad728 in zmq::zmq_abort (errmsg_=<optimized out>) at err.cpp:76
#3  0x403c0000 in read (this=<optimized out>, data_=<optimized out>,
    size_=<optimized out>) at stream_engine.cpp:566
#4  zmq::stream_engine_t::read (this=<optimized out>, data_=<optimized out>,
    size_=<optimized out>) at stream_engine.cpp:523
#5  0x403c070c in zmq::stream_engine_t::in_event (this=0x41b14000)
    at stream_engine.cpp:197
#6  0x403ad1fc in zmq::epoll_t::loop (this=0x166370) at epoll.cpp:154
#7  0x403c41ec in thread_routine (arg_=0x1663b8) at thread.cpp:83
#8  0x4004efc4 in start_thread (arg=0x44670450) at pthread_create.c:314
#9  0x404a7b30 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/clone.S:97
   from /lib/libc.so.6
#10 0x404a7b30 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/clone.S:97
   from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Simply removing the errno_assert in

int zmq::stream_engine_t::read (void *data_, size_t size_)

solves the problem.

@hintjens
Member

It'd be better to catch specific errors. Can you find out what the errno is
at that point?

@mdionisio
Contributor Author

I'm testing on an embedded device running Linux 2.6.37. I have no access to stderr, so I have to build a special version of zmq that logs the error to a file; I'll need some time to answer your question.

Would it be possible to add compile options to:

  1. print to syslog on any assert?
  2. remove the assert on any error check?

@hintjens
Member

You can't remove asserts, these are always for cases where the code does
not know how to continue. What you're hitting here is an unknown errno that
we have to deal with... simplest is to add a printf of errno before the
assert.
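
A minimal sketch of that suggestion, adapted for a target with no stderr: the report is appended to a file instead of being printed. The names describe_errno and log_errno and the log path are hypothetical, chosen for illustration; they are not part of libzmq. The helper saves and restores errno so that a following errno_assert still sees the original value.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper (not part of libzmq): build the text that would be
// logged just before an assertion aborts the process.
static std::string describe_errno (int err_, const char *file_, int line_)
{
    char buf [256];
    std::snprintf (buf, sizeof buf, "errno=%d (%s) at %s:%d",
                   err_, std::strerror (err_), file_, line_);
    return std::string (buf);
}

// Hypothetical helper: append the report to a log file. The default path is
// an assumption; pick whatever location is writable on the device.
// Returns false if the file could not be opened.
static bool log_errno (int err_, const char *file_, int line_,
                       const char *path_ = "/tmp/zmq_errno.log")
{
    // fopen/fprintf may overwrite errno, so keep the caller's value.
    const int saved_errno = errno;
    bool ok = false;
    std::FILE *f = std::fopen (path_, "a");
    if (f) {
        std::fprintf (f, "%s\n", describe_errno (err_, file_, line_).c_str ());
        std::fclose (f);
        ok = true;
    }
    errno = saved_errno;
    return ok;
}
```

It could be called as `log_errno (errno, __FILE__, __LINE__);` on the line before the errno_assert in zmq::stream_engine_t::read. This is a debugging aid only, not something to leave in a production build.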


@mdionisio
Contributor Author

The errno is EINVAL (22).

@hintjens
Member

Michele,

Could you retest on libzmq master and tell us the stack backtrace if it's
still crashing?

Thanks


@mdionisio
Contributor Author

On libzmq 4.1.3 the same issue occurs, and the same fix (now in src/tcp.cpp) solves the problem.

patch:

--- a/src/tcp.cpp
+++ b/src/tcp.cpp
@@ -192,7 +192,7 @@ int zmq::tcp_write (fd_t s_, const void *data_, size_t size_)
                    && errno != EBADF
                    && errno != EDESTADDRREQ
                    && errno != EFAULT
-                   && errno != EINVAL
+//                   && errno != EINVAL    
                    && errno != EISCONN
                    && errno != EMSGSIZE
                    && errno != ENOMEM
@@ -241,7 +241,7 @@ int zmq::tcp_read (fd_t s_, void *data_, size_t size_)
    if (rc == -1) {
        errno_assert (errno != EBADF
                    && errno != EFAULT
-                   && errno != EINVAL
+//                 && errno != EINVAL     
                    && errno != ENOMEM
                    && errno != ENOTSOCK);
        if (errno == EWOULDBLOCK || errno == EINTR)
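
For illustration, the effect of the patch on the read path can be sketched as a classification of the errno left by a failed recv(): EINVAL moves out of the set that aborts the process. The enum and function names below are hypothetical, and the sketch assumes the caller treats any error that is neither retryable nor fatal as a lost connection; it is not libzmq's actual code.

```cpp
#include <cerrno>

// Hypothetical outcome categories (not libzmq's API).
enum recv_outcome_t
{
    recv_retry,            // transient: try again later
    recv_connection_lost,  // report the error and let the caller reconnect
    recv_fatal             // still asserted on after the patch
};

// Classify the errno of a failed recv() the way the patched tcp_read
// appears to: only EBADF, EFAULT, ENOMEM and ENOTSOCK stay fatal.
static recv_outcome_t classify_recv_errno (int err_)
{
    switch (err_) {
        case EINTR:
        case EAGAIN:
#if defined EWOULDBLOCK && EWOULDBLOCK != EAGAIN
        case EWOULDBLOCK:
#endif
            return recv_retry;
        case EBADF:
        case EFAULT:
        case ENOMEM:
        case ENOTSOCK:
            return recv_fatal;
        default:
            // Includes EINVAL, seen here after the network card was
            // disabled and re-enabled.
            return recv_connection_lost;
    }
}
```

With this split, `classify_recv_errno (EINVAL)` yields `recv_connection_lost` instead of triggering an abort.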

@mdionisio
Contributor Author

I've been reading old issues and I think mine is a duplicate of #829. I'm using the PUB-SUB pattern, but the result is the same.

hintjens added a commit to hintjens/libzmq that referenced this issue Nov 1, 2015
This causes assertion failures after network reconnects.

Solution: allow EINVAL as a possible condition after read/write.

Fixes zeromq#829
Fixes zeromq#1399

Patch provided by Michele Dionisio @mdionisio, thanks :)
@hintjens
Member

hintjens commented Nov 1, 2015

OK, fixed on master and backported to 4-1 stable. Thanks!


c-rack added a commit that referenced this issue Nov 1, 2015