JACK clients hang on SHM read #95

Closed
mspanc opened this issue Jan 17, 2015 · 4 comments

@mspanc

mspanc commented Jan 17, 2015

Hi,

I am seeing random hangs in JACK2. At some point it stops routing sound (clients that retrieve audio from JACK stay alive but play only silence), and new clients cannot connect.

Below is an example strace taken while running jack_lsp. It just hangs.

Any ideas what could cause this behaviour?

Setup:

  • JACK version: 1.9.9.5+20130622git7de15e7a-1ubuntu1 (from ubuntu 14.04)
  • Kernel: Linux serverr 2.6.32-24-pve #1 SMP Fri Sep 13 07:29:30 CEST 2013 x86_64 x86_64 x86_64 GNU/Linux
  • No realtime scheduling
  • Running within OpenVZ container
  • Multiple JACK servers running at the same time.

    getrlimit(RLIMIT_STACK, {rlim_cur=10240*1024, rlim_max=RLIM64_INFINITY}) = 0
    futex(0x7f06f242596c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    futex(0x7f06f2425978, FUTEX_WAKE_PRIVATE, 2147483647) = 0
    brk(0)                                  = 0x1ddb000
    brk(0x1dfc000)                          = 0x1dfc000
    open("/proc/cpuinfo", O_RDONLY)         = 3
    fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f06f3292000
    read(3, "processor\t: 0\nvendor_id\t: Genuin"..., 1024) = 1024
    close(3)                                = 0
    munmap(0x7f06f3292000, 4096)            = 0
    mmap(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f06f31ff000
    mprotect(0x7f06f31ff000, 4096, PROT_NONE) = 0
    clone(child_stack=0x7f06f327ef70, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7f06f327f9d0, tls=0x7f06f327f700, child_tidptr=0x7f06f327f9d0) = 507
    nanosleep({0, 1000000}, NULL)           = 0
    rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
    socket(PF_LOCAL, SOCK_STREAM, 0)        = 3
    getuid()                                = 1001
    connect(3, {sa_family=AF_LOCAL, sun_path="/dev/shm/jack_2_1001_0"}, 110) = 0
    shutdown(3, SHUT_RDWR)                  = 0
    close(3)                                = 0
    nanosleep({0, 2000000}, NULL)           = 0
    socket(PF_LOCAL, SOCK_STREAM, 0)        = 3
    getuid()                                = 1001
    connect(3, {sa_family=AF_LOCAL, sun_path="/dev/shm/jack_2_1001_0"}, 110) = 0
    write(3, "\26\0\0\0", 4)                = 4
    write(3, "Q\0\0\0", 4)                  = 4
    write(3, "lsp\0\0\0\0\0\376\314\345\362\6\177\0\0\0\0\0\0\0\0\0\0\340\263+\345\377\177\0\0"..., 65) = 65
    write(3, "\10\0\0\0", 4)                = 4
    write(3, "\5\0\0\0", 4)                 = 4
    write(3, "\377\377\377\377", 4)         = 4
    write(3, "\1\0\0\0", 4)                 = 4
    read(3, (NOW IT HANGS)
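In the trace above, both connect() calls and every write() succeed, and only the final read() blocks, so the server socket is reachable but no reply ever arrives. The sketch below reproduces that pattern outside JACK, as a way to tell "no server" apart from "server present but not answering". It is not JACK code: `probe`, `demo`, the socket file names, and the request bytes are all hypothetical, and the request is a placeholder rather than the JACK wire protocol.

```python
import os
import socket
import tempfile


def probe(path, request, timeout=0.5):
    """Connect to a Unix-domain socket, send a request, wait for a reply.

    Returns "no-server", "no-reply", "closed" or "reply".
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            # Nothing is listening at this path (or connect itself timed out).
            return "no-server"
        try:
            sock.sendall(request)
            data = sock.recv(4)
        except socket.timeout:
            # Connected and sent, but nobody answered: the pattern in the trace.
            return "no-reply"
        except OSError:
            return "closed"
        return "reply" if data else "closed"


def demo():
    """Probe a missing socket and a listening socket that never answers."""
    placeholder = b"\x01\x00\x00\x00"  # placeholder bytes, not a JACK request
    with tempfile.TemporaryDirectory() as tmp:
        missing = probe(os.path.join(tmp, "missing.sock"), placeholder)

        path = os.path.join(tmp, "stalled.sock")
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        srv.bind(path)
        srv.listen(1)  # connections queue up, but nothing ever replies
        try:
            stalled = probe(path, placeholder, timeout=0.2)
        finally:
            srv.close()
        return missing, stalled


if __name__ == "__main__":
    print(demo())
```

A timeout like this only helps with diagnosis. It does not explain why the server stopped answering.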

@karllinden
Contributor

Have you done more research on this? Maybe you should try the latest version of jack2 and a more recent kernel.

@mspanc
Author

mspanc commented Sep 27, 2015

That was a long time ago, so I am writing from memory. As far as I remember, it was caused by a buggy client connected to the JACK daemon, most probably jack.plumber (which is buggy like hell), hanging on thread synchronization in a callback. Or something like that.

The only question is how the JACK daemon should treat such buggy clients. Do we assume that they will not deadlock in a callback and skip additional checks, e.g. for performance reasons, or should we drop them if that happens?
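To illustrate the client-side failure described here, a callback that waits on a lock held by another thread, here is a minimal sketch of the usual alternative: the callback only tries the lock and keeps its last known value when the lock is busy. This is plain Python threading, not the JACK API. `GainStage`, `set_gain` and `process` are hypothetical names, and nothing here is taken from jack.plumber.

```python
import threading


class GainStage:
    """Stand-in for a client whose per-cycle callback must never block."""

    def __init__(self, gain=1.0):
        self._lock = threading.Lock()
        self._pending = gain  # written by the control thread, under the lock
        self._active = gain   # touched only by the callback

    def set_gain(self, gain):
        # Control-thread side: blocking here is acceptable.
        with self._lock:
            self._pending = gain

    def process(self, frames):
        # Callback side: try the lock, never wait for it.
        if self._lock.acquire(blocking=False):
            try:
                self._active = self._pending
            finally:
                self._lock.release()
        # If the lock was busy, keep using the previous gain for this cycle.
        return [sample * self._active for sample in frames]


if __name__ == "__main__":
    stage = GainStage(1.0)
    stage.set_gain(0.5)
    print(stage.process([2.0, 4.0]))
```

With a blocking acquire in `process`, a control thread that stalls while holding the lock would stall the callback too, which is the kind of hang suspected above.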

@x42
Member

x42 commented Sep 27, 2015

Crippling the server (additional checks in the RT callback) to accommodate buggy clients is IMHO the wrong approach. Especially if you have access to the source-code of those clients and can fix those instead.

Even if you can come up with some logic to detect and kill those clients, you'd still get a dropout whenever a client is kicked. (jack1 has a "zombify" feature, but in the vast majority of cases it's a lot more trouble than it's worth.)

just my 2 cents
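The point about dropouts can be made concrete with a toy model. The sketch below assumes a server that triggers every client each cycle and waits for each one up to a shared deadline. It is not how jack2 works, nor how jack1's "zombify" is implemented, and `Client`, `run_cycle` and the timings are hypothetical. It only shows that a stuck client can be detected no earlier than the deadline, so the cycle in which it is dropped has already run late.

```python
import threading
import time


class Client:
    """Toy client that needs work_seconds to finish one cycle."""

    def __init__(self, name, work_seconds):
        self.name = name
        self.work_seconds = work_seconds
        self.done = threading.Event()

    def run_cycle(self):
        time.sleep(self.work_seconds)
        self.done.set()


def run_cycle(clients, deadline_seconds):
    """Trigger all clients, then wait for each one up to a shared deadline.

    Returns (kept_names, dropped_names, elapsed_seconds).
    """
    start = time.monotonic()
    for client in clients:
        client.done.clear()
        threading.Thread(target=client.run_cycle, daemon=True).start()

    kept, dropped = [], []
    for client in clients:
        remaining = deadline_seconds - (time.monotonic() - start)
        if client.done.wait(timeout=max(remaining, 0.0)):
            kept.append(client.name)
        else:
            # Only known to be stuck once the deadline has passed.
            dropped.append(client.name)
    return kept, dropped, time.monotonic() - start


if __name__ == "__main__":
    clients = [Client("good", 0.0), Client("stuck", 2.0)]
    kept, dropped, elapsed = run_cycle(clients, deadline_seconds=0.2)
    print(kept, dropped, round(elapsed, 2))
```

In this model the elapsed time of the failing cycle is at least the deadline, so the other clients still miss a period even though the stuck client is removed afterwards.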

@karllinden
Contributor

I agree with Robin. Closing. Reopen if necessary.


3 participants